Description
What is the path in the ChunkManifest for an inlined variable? Does it start with `memory://`?
No-one's ever made one, because we never fixed #489 😅 But I think we should, and I think `memory://` would make sense as a prefix for that. To make that work I guess we would need the kerchunk parser to know that if it finds inlined data it should put that data into a `MemoryStore` and then create a chunk reference that refers to it?
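A rough sketch of that idea, not the actual parser code: kerchunk stores inlined data as a string, with a `base64:` prefix when the bytes are base64-encoded. The function names here are hypothetical, a plain `dict` stands in for the `MemoryStore`, and the `path`/`offset`/`length` layout of the reference is an assumption about what the manifest entry would look like.

```python
import base64


def decode_inlined(value: str) -> bytes:
    """Decode a kerchunk inlined value into raw bytes."""
    # Strings prefixed with "base64:" are base64-encoded; anything else is plain text.
    if value.startswith("base64:"):
        return base64.b64decode(value[len("base64:"):])
    return value.encode()


def inline_to_reference(key: str, value: str, memory: dict[str, bytes]) -> dict:
    """Put inlined data into an in-memory store and return a reference to it.

    `memory` is a dict standing in for a MemoryStore (assumption).
    """
    data = decode_inlined(value)
    memory[key] = data
    # The reference covers the whole in-memory object.
    return {"path": f"memory://{key}", "offset": 0, "length": len(data)}


memory: dict[str, bytes] = {}
ref = inline_to_reference("time/0", "base64:AAECAw==", memory)
```

With a real `MemoryStore` the `memory[key] = data` line would become a put into the store, but the reference that ends up in the manifest would look the same.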
that would work as a prefix for the `ObjectStoreRegistry`
That idea works in the sense that if I manually create and pass `ObjectStoreRegistry({"memory://": memory_store})` then I can have the `memory_store` and optionally additional stores for actually getting referenced chunks.
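To illustrate why a `memory://` prefix sits comfortably next to other stores, here is a toy prefix lookup. `PrefixRegistry` is a hypothetical stand-in, not the real `ObjectStoreRegistry`, and the longest-prefix matching rule is an assumption about how resolution could work. The string values are placeholders for actual store objects.

```python
class PrefixRegistry:
    """Hypothetical stand-in for a registry mapping URL prefixes to stores."""

    def __init__(self, stores: dict):
        self._stores = dict(stores)

    def resolve(self, url: str):
        """Return (store, path within store) for the longest matching prefix."""
        matches = [prefix for prefix in self._stores if url.startswith(prefix)]
        if not matches:
            raise ValueError(f"no store registered for {url!r}")
        prefix = max(matches, key=len)
        return self._stores[prefix], url[len(prefix):]


# Placeholder strings instead of real store objects.
registry = PrefixRegistry(
    {"memory://": "memory_store", "s3://bucket/": "s3_store"}
)
store, path = registry.resolve("memory://time/0")
```

Inlined chunks resolve to the in-memory store, while references with other prefixes go to whichever store was registered for them.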
But it doesn't work in the sense that if I do
```python
import obstore
from virtualizarr.parsers import KerchunkJSONParser

memory_store = obstore.store.MemoryStore()
parser = KerchunkJSONParser(store_registry=None)
parser("refs.json", memory_store)
```
then the `get_store_prefix` call occurs before we get anywhere near creating a ChunkManifest, instead happening at a point that causes `get_store_prefix` to (incorrectly) have to guess which prefix to use.
I think the better solution to that would be to move the `fs_root` logic earlier in the parser, so that the local filepaths for chunks in the kerchunk references are disambiguated before `get_store_prefix` is called. This is consistent with what I said above: if the kerchunk parser finds inlined data it should create a chunk reference with a `memory://` prefix, and if it finds an ambiguous path like `data.nc` it should prepend `fs_root` to it. Then `get_store_prefix` will have enough information to work with.
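Something like the following is what I have in mind for the disambiguation step. It is a sketch only: `disambiguate` is a hypothetical helper, it assumes POSIX-style paths, and treating absolute paths as `file://` URLs is my assumption rather than existing behaviour.

```python
from urllib.parse import urlparse


def disambiguate(path: str, fs_root: str) -> str:
    """Turn a path from kerchunk references into an unambiguous URL."""
    # Paths that already carry a scheme (s3://, memory://, ...) are left alone.
    if urlparse(path).scheme:
        return path
    # Absolute local paths become file:// URLs (assumption).
    if path.startswith("/"):
        return "file://" + path
    # Relative paths such as "data.nc" are resolved against fs_root.
    return fs_root.rstrip("/") + "/" + path


resolved = disambiguate("data.nc", "file:///home/user/refs")
```

After this step every reference has an explicit prefix, so nothing downstream has to guess.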
But now I have a working example so I'm tempted to punt on that follow-up.
Originally posted by @TomNicholas in #631 (comment)