Build a collections adaptor #758

josephjclark · 2024-09-25T07:30:53Z

Overview

Create a Collections adaptor that speaks to the Lightning Collections API.

It should probably be suffixed by the backend data store, like collections-postgres or collections-lightning or something, so that later it's easier to introduce new collection types, like collections-redis.

The collections API is a bit unusual in that it will be loaded as a second adaptor to jobs. It's a second-class citizen in the job code, so everything needs to be namespaced: collections, operations, and so on. That means collections.get rather than get. Otherwise there's a risk of clashing with the main adaptor's namespace.
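To illustrate the clash, here is a minimal sketch. It assumes the main adaptor exports a top-level `get`, and that the collections adaptor exposes everything under a single `collections` object. Both adaptor objects and their return values are hypothetical stand-ins, not real adaptor code.

```javascript
// Hypothetical main adaptor: exports a top-level `get` operation.
const mainAdaptor = {
  get: (path) => (state) => ({ ...state, data: `main:${path}` }),
};

// Hypothetical collections adaptor: everything lives under one namespace.
const collectionsAdaptor = {
  collections: {
    get: (name, key) => (state) => ({
      ...state,
      data: `collections:${name}/${key}`,
    }),
  },
};

// Merge both into one job scope. Only the name `collections` is added to
// the top level, so the main adaptor's `get` is not overwritten.
const scope = { ...mainAdaptor, ...collectionsAdaptor };

const fromMain = scope.get('/patients')({});
const fromCollections = scope.collections.get('my-collection', 'abc')({});
```

If the collections adaptor exported a bare `get` instead, the spread above would silently replace the main adaptor's `get`.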

API

TBD

Maybe something like:

collections.get(name, keys) // get a list of keys from a collection
collections.set(name, items, keyFn) // set a bunch of items in a collection
collections.find(name, queryObj) // some kind of query API
collections.each(name, keysOrQuery, callback) // built-in iterator might be nice?

Presumably the getters just write to state.data.
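A rough sketch of how get and set might behave if the getters write to state.data. It assumes each operation is a function that takes state and returns state, and it uses an in-memory Map as a stand-in for the real collections service. The `{ key, value }` result shape is also an assumption, since the API is still TBD.

```javascript
// In-memory stand-in for the real collections service (assumption).
const store = new Map(); // collection name -> Map of key -> value

const collections = {
  // Set a bunch of items; keyFn derives the key for each item.
  set: (name, items, keyFn) => (state) => {
    const col = store.get(name) ?? new Map();
    for (const item of items) col.set(keyFn(item), item);
    store.set(name, col);
    return state;
  },
  // Get a list of keys and write whatever was found to state.data.
  get: (name, keys) => (state) => {
    const col = store.get(name) ?? new Map();
    const data = keys
      .filter((key) => col.has(key))
      .map((key) => ({ key, value: col.get(key) }));
    return { ...state, data };
  },
};

const items = [{ id: 'a', n: 1 }, { id: 'b', n: 2 }];
let state = { data: {} };
state = collections.set('patients', items, (item) => item.id)(state);
state = collections.get('patients', ['a', 'zzz'])(state);
```

In this sketch missing keys are dropped from the result; the real adaptor might instead return them with an empty value.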

Is a collection name totally arbitrary? Or is it bound to the jwt and project somehow? Does a project need multiple collections, or do we just store a collection per project which is shared by its workflows? How is the collection managed? When do items expire, when is it reset, how do we know how much stuff is in there?

Configuration

{
  collections_key: /* a JWT */,
  collections_endpoint: /* hard code this for now? */
}

Note that config will be set by the worker automatically. Maybe later Lightning will take more control.
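As a sketch of how the adaptor might consume this config: the helper below turns the two configuration values into a request description. The `buildRequest` name, the URL layout, and sending the JWT as a Bearer token are all assumptions for illustration; the issue does not define any of them.

```javascript
// Hypothetical helper: turn worker-provided configuration into a request
// description. The URL layout and Bearer scheme are assumptions.
function buildRequest(configuration, name, key) {
  const { collections_key, collections_endpoint } = configuration;
  if (!collections_key || !collections_endpoint) {
    throw new Error('collections_key and collections_endpoint must be set');
  }
  // Make sure the base ends with a slash so relative resolution appends.
  const base = collections_endpoint.endsWith('/')
    ? collections_endpoint
    : `${collections_endpoint}/`;
  const path = `${encodeURIComponent(name)}/${encodeURIComponent(key)}`;
  return {
    url: new URL(path, base).toString(),
    headers: { Authorization: `Bearer ${collections_key}` },
  };
}

const request = buildRequest(
  {
    collections_key: 'fake.jwt.token', // placeholder, not a real JWT
    collections_endpoint: 'https://example.test/collections',
  },
  'my-collection',
  'abc'
);
```

Failing early when either value is missing seems useful here, since the worker sets the config and job authors may never see it.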

Query

I'm not sure what to do about the query API yet; it'll depend on the support offered by the actual collections service.


josephjclark · 2024-09-26T09:51:37Z

Stu: we should go streaming-first on this! The adaptor uses an async iterator with a stream under the hood (but passes full objects to the callback). It should also decode on the fly for get, etc.
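A minimal sketch of the streaming idea, assuming the service streams newline-delimited JSON (the wire format is not specified anywhere in this issue). `fakeStream`, `decode`, and this `each` signature are hypothetical names used only for illustration.

```javascript
// Stand-in for a network stream: yields text chunks that may split a
// record across chunk boundaries. Newline-delimited JSON is an assumption.
async function* fakeStream() {
  yield '{"key":"a","value":1}\n{"key":"b",';
  yield '"value":2}\n';
}

// Decode on the fly: buffer text, emit one parsed object per complete line.
async function* decode(chunks) {
  let buffer = '';
  for await (const chunk of chunks) {
    buffer += chunk;
    const lines = buffer.split('\n');
    buffer = lines.pop(); // keep the trailing partial line for the next chunk
    for (const line of lines) {
      if (line.trim()) yield JSON.parse(line);
    }
  }
  if (buffer.trim()) yield JSON.parse(buffer);
}

// Hypothetical `each`: the callback only ever sees full objects.
async function each(chunks, callback) {
  for await (const item of decode(chunks)) {
    await callback(item);
  }
}

const seen = [];
const done = each(fakeStream(), (item) => seen.push(item.key));
```

Only one partial line is held in memory at a time, so the whole collection never needs to be loaded before the callback starts running.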

josephjclark · 2024-09-26T09:57:24Z

Query: should it just be time series? So you get a key by id, or you get keys between two dates (or before/after one date)

Maybe allow key scanning? Pass a pattern and we'll find keys which match that name
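One way the two ideas above could combine, sketched against an in-memory list. The `*` wildcard syntax, the `created` timestamp on each item, and the `createdAfter` / `createdBefore` option names are all assumptions, not part of any agreed API.

```javascript
// Convert a simple pattern, where `*` matches any run of characters,
// into an anchored regular expression. The `*` syntax is an assumption.
function patternToRegExp(pattern) {
  const escaped = pattern
    .split('*')
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*');
  return new RegExp(`^${escaped}$`);
}

// Hypothetical query: filter by key pattern and/or an exclusive date range.
function query(items, { key, createdAfter, createdBefore } = {}) {
  const re = key ? patternToRegExp(key) : null;
  return items.filter((item) => {
    const created = Date.parse(item.created);
    if (re && !re.test(item.key)) return false;
    if (createdAfter && created <= Date.parse(createdAfter)) return false;
    if (createdBefore && created >= Date.parse(createdBefore)) return false;
    return true;
  });
}

const items = [
  { key: 'patient-2024-01', created: '2024-01-15T00:00:00Z' },
  { key: 'patient-2024-02', created: '2024-02-15T00:00:00Z' },
  { key: 'visit-2024-02', created: '2024-02-20T00:00:00Z' },
];

const byPattern = query(items, { key: 'patient-*' });
const byRange = query(items, {
  createdAfter: '2024-02-01T00:00:00Z',
  createdBefore: '2024-02-18T00:00:00Z',
});
```

In practice the filtering would presumably happen on the service side; the sketch only pins down what the two query styles would mean.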

josephjclark mentioned this issue Sep 25, 2024

Data stores, Collections, and Buffers OpenFn/lightning#2190

Open
