Multiple/alternative data retrieval implementations #98

@achingbrain


Currently Helia takes a blockstore that it enhances with bitswap. This creates a hard dependency on bitswap.

To enable experimentation with, and adoption of, faster or more use-case-specific retrieval protocols (CARs, graphsync, XYZNewFutureProtocol etc) we should allow this to be a configuration option.
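One way to picture retrieval as a configuration option is a list of interchangeable retrievers that are tried in order. The sketch below is only an illustration: `BlockRetriever`, `retrieveBlock` and `memoryRetriever` are hypothetical names, not part of Helia, and `CID` is a plain string stand-in for the real class from `multiformats` so the example runs on its own.

```typescript
// Stand-in for the CID class from `multiformats` (assumption, keeps the sketch self-contained)
type CID = string

// Hypothetical interface: any retrieval protocol (bitswap, CAR fetch, graphsync, ...)
// could be wrapped to look like this
interface BlockRetriever {
  name: string
  // Resolves to undefined when this retriever cannot find the block
  retrieve (cid: CID): Promise<Uint8Array | undefined>
}

// Hypothetical helper: tries each configured retriever in order, returns the first hit
async function retrieveBlock (cid: CID, retrievers: BlockRetriever[]): Promise<Uint8Array> {
  for (const retriever of retrievers) {
    const block = await retriever.retrieve(cid)

    if (block != null) {
      return block
    }
  }

  throw new Error(`No configured retriever could fetch ${cid}`)
}

// Toy in-memory retriever, used here in place of a real network protocol
function memoryRetriever (name: string, blocks: Map<CID, Uint8Array>): BlockRetriever {
  return {
    name,
    retrieve: async (cid) => blocks.get(cid)
  }
}
```

With something like this, bitswap would become one entry in the list rather than a hard dependency, and the order of the list would express which protocol is preferred.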

At this point blocks may not be the correct abstraction since it limits us to a block as the unit of data you get in response to a CID.

A better read abstraction might be a CID to a stream of Uint8Arrays? The underlying retrieval method can then apply whatever optimisations it can to fetch the data quickly, and the calling code doesn't have to keep going back to fetch another block for another CID.

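import type { CID } from 'multiformats/cid'

// Optional range to read from the content (presumably in bytes)
interface Options {
  offset?: number
  length?: number
}

// Resolves a CID to a stream of bytes rather than to a single block
interface ContentReader {
  get (cid: CID, options: Options): AsyncGenerator<Uint8Array>
}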
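To make the proposed interface more concrete, here is a minimal sketch of a reader backed by content that is already chunked in memory. It repeats the two interfaces so it runs on its own, uses a plain string as a stand-in for `CID`, and assumes `offset` and `length` are byte counts. `memoryContentReader` and `collect` are hypothetical helpers for illustration only.

```typescript
// Stand-in for the CID class from `multiformats` (assumption, keeps the sketch self-contained)
type CID = string

interface Options {
  offset?: number
  length?: number
}

interface ContentReader {
  get (cid: CID, options: Options): AsyncGenerator<Uint8Array>
}

// Hypothetical reader over pre-chunked content; a real one would fetch from the network
function memoryContentReader (content: Map<CID, Uint8Array[]>): ContentReader {
  return {
    async * get (cid: CID, options: Options = {}): AsyncGenerator<Uint8Array> {
      const chunks = content.get(cid)

      if (chunks == null) {
        throw new Error(`Content not found for ${cid}`)
      }

      let skip = options.offset ?? 0 // bytes still to skip before yielding
      let remaining = options.length ?? Infinity // bytes still to yield

      for (const chunk of chunks) {
        if (remaining <= 0) {
          return
        }

        if (skip >= chunk.length) {
          // The whole chunk lies before the requested offset
          skip -= chunk.length
          continue
        }

        const end = Math.min(chunk.length, skip + remaining)
        const slice = chunk.subarray(skip, end)
        remaining -= slice.length
        skip = 0

        yield slice
      }
    }
  }
}

// Usage helper: drains a stream into a flat array of byte values
async function collect (source: AsyncIterable<Uint8Array>): Promise<number[]> {
  const bytes: number[] = []

  for await (const chunk of source) {
    bytes.push(...Array.from(chunk))
  }

  return bytes
}
```

The caller only sees a stream of bytes for a CID, so the reader is free to decide how the range maps on to whatever it stores or fetches underneath.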

Questions:

  • Does this shift the complexity of interpreting block data onto the content reader?
  • What does the writer interface look like?
    • Can the writer/reader interfaces be asymmetric? E.g. CIDs/Blocks in, CID/Stream out?
  • Does this assume file data?
  • What about structures like unixfs where the root block has file metadata and then file data in leaf nodes?
  • If DAGs are all dag-pb, dag-cbor or dag-json, can we make some assumptions about structure?
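As a way to explore the asymmetry question, the sketch below takes blocks in and gives a stream out, with the knowledge of block structure pushed into a caller-supplied decoder. Everything here is hypothetical: `BlockWriter`, `StreamReader`, `Decoder`, `memoryStore` and the toy block format are invented for illustration, and `CID` is a plain string stand-in. It is one possible shape, not an answer to the questions above.

```typescript
// Stand-in for the CID class from `multiformats` (assumption, keeps the sketch self-contained)
type CID = string

// Hypothetical decoded view of a block: optional payload bytes plus links to children
interface DecodedNode {
  data?: Uint8Array
  links: CID[]
}

// Hypothetical codec hook, e.g. one decoder per codec (dag-pb, dag-cbor, ...)
type Decoder = (cid: CID, block: Uint8Array) => DecodedNode

interface BlockWriter {
  put (cid: CID, block: Uint8Array): Promise<void>
}

interface StreamReader {
  get (cid: CID): AsyncGenerator<Uint8Array>
}

// Blocks in, stream out: reading walks the DAG depth-first and yields payload bytes
function memoryStore (decode: Decoder): BlockWriter & StreamReader {
  const blocks = new Map<CID, Uint8Array>()

  async function * walk (cid: CID): AsyncGenerator<Uint8Array> {
    const block = blocks.get(cid)

    if (block == null) {
      throw new Error(`Missing block ${cid}`)
    }

    const node = decode(cid, block)

    if (node.data != null && node.data.length > 0) {
      yield node.data
    }

    for (const link of node.links) {
      yield * walk(link)
    }
  }

  return {
    async put (cid, block) {
      blocks.set(cid, block)
    },
    get: walk
  }
}

// Toy format for the demo only: byte 0 is the link count, followed by one byte
// per link (the child's id), and the rest of the block is payload
const toyDecode: Decoder = (_cid, block) => {
  const linkCount = block[0] ?? 0
  const links = Array.from(block.subarray(1, 1 + linkCount), (id) => String(id))

  return { links, data: block.subarray(1 + linkCount) }
}
```

In this shape the root block can carry only links (or metadata the decoder chooses not to emit) while leaf blocks carry the file bytes, which is roughly the unixfs layout the questions mention. The cost is that the reader now depends on a decoder for every codec it may encounter.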

Labels: kind/architecture (Core architecture of project), kind/discussion (Topical discussion; usually not changes to codebase)
