implement .chunks on v3 arrays #1929
@@ -343,6 +343,15 @@ def ndim(self) -> int:
     def shape(self) -> ChunkCoords:
         return self.metadata.shape

+    @property
+    def chunks(self) -> ChunkCoords:
+        try:
+            return self.metadata.chunk_grid.chunk_shape  # type: ignore[attr-defined]

I am fighting with mypy here. My local pre-commit env does not like
(says but that seems to be what the CI-based pre-commit linter wants

I took a swing at fixing this @rabernat. Let me know what you think.

Thanks Joe! I like it!

+        except AttributeError as err:  # pragma: no cover
+            raise AttributeError(
+                f"Chunk grid {self.metadata.chunk_grid} array does not have a chunk shape."
+            ) from err
+
     @property
     def size(self) -> int:
         return np.prod(self.metadata.shape).item()
@@ -558,6 +567,10 @@ def ndim(self) -> int:
     def shape(self) -> ChunkCoords:
         return self._async_array.shape

+    @property
+    def chunks(self) -> ChunkCoords:
+        return self._async_array.chunks
+
     @property
     def size(self) -> int:
         return self._async_array.size
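
The behaviour of the two added properties can be tried outside Zarr with stand-in classes. This is only a sketch: `RegularGrid`, `IrregularGrid`, `Metadata`, `AsyncArrayLike`, and `ArrayLike` are hypothetical names that mirror the attribute layout in the diff, not Zarr's actual classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegularGrid:
    """Hypothetical stand-in for a regular chunk grid."""

    chunk_shape: tuple[int, ...]


@dataclass(frozen=True)
class IrregularGrid:
    """Hypothetical grid that has no single chunk shape."""

    boundaries: tuple[tuple[int, ...], ...]


@dataclass(frozen=True)
class Metadata:
    shape: tuple[int, ...]
    chunk_grid: object


class AsyncArrayLike:
    def __init__(self, metadata: Metadata) -> None:
        self.metadata = metadata

    @property
    def chunks(self) -> tuple[int, ...]:
        # Same pattern as the diff: read chunk_shape if the grid has one,
        # otherwise re-raise with a message that names the grid.
        try:
            return self.metadata.chunk_grid.chunk_shape
        except AttributeError as err:
            raise AttributeError(
                f"Chunk grid {self.metadata.chunk_grid} does not have a chunk shape."
            ) from err


class ArrayLike:
    """Synchronous wrapper that delegates to the async-style object."""

    def __init__(self, async_array: AsyncArrayLike) -> None:
        self._async_array = async_array

    @property
    def chunks(self) -> tuple[int, ...]:
        return self._async_array.chunks


regular = ArrayLike(AsyncArrayLike(Metadata((100, 100), RegularGrid((10, 10)))))
irregular = ArrayLike(AsyncArrayLike(Metadata((100,), IrregularGrid(((30, 70),)))))
```

With these stand-ins, `regular.chunks` returns the grid's chunk shape, while `irregular.chunks` raises `AttributeError`, so `hasattr(irregular, "chunks")` is false.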
Should we add a deprecation warning here? I think users should instead look at the chunk grid directly.
Is the chunk grid object guaranteed to have a consistent structure beyond `{name: str, configuration: dict[str, JSON]}`? We might instead want users to look at a normalized (explicit) representation of the chunks, i.e. `tuple[tuple[int, ...], ...]`.
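
For a regular grid, such an explicit representation could be derived from the array shape and the chunk shape. The helper below is a minimal sketch, not part of Zarr: `explicit_chunks` is a hypothetical name, and clipping the final chunk along each dimension to the array edge is an assumption about what "normalized" should mean here.

```python
def explicit_chunks(
    shape: tuple[int, ...], chunk_shape: tuple[int, ...]
) -> tuple[tuple[int, ...], ...]:
    """Expand a regular chunk shape into per-dimension chunk lengths."""
    if len(shape) != len(chunk_shape):
        raise ValueError("shape and chunk_shape must have the same length")
    result = []
    for size, step in zip(shape, chunk_shape):
        if step <= 0:
            raise ValueError("chunk lengths must be positive")
        full, remainder = divmod(size, step)
        # `full` complete chunks, plus one shorter chunk if the
        # dimension is not an exact multiple of the chunk length.
        result.append((step,) * full + ((remainder,) if remainder else ()))
    return tuple(result)


print(explicit_chunks((10, 6), (4, 3)))  # ((4, 4, 2), (3, 3))
```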
My opinion is that having a backwards-compatible `.chunks` attribute is very important. Deprecating this or changing its behavior will have far-reaching consequences for libraries that depend on Zarr. Given that we only support regular chunk grids so far, this solution feels very reasonable to me. Once other chunk grids are developed, this will raise an `AttributeError`, and the user code can handle it some other way (i.e. looking at the chunk grid directly).
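
On the consumer side, handling that error could look like the sketch below. `regular_chunks_or_none`, `WithChunks`, and `WithoutChunks` are hypothetical names used for illustration; a real caller would go on to inspect the chunk grid where this helper returns `None`.

```python
def regular_chunks_or_none(array):
    """Return array.chunks, or None when no single chunk shape exists."""
    try:
        return array.chunks
    except AttributeError:
        # Fall back to whatever the caller does for non-regular grids,
        # for example inspecting the chunk grid directly.
        return None


class WithChunks:
    """Stand-in for an array backed by a regular chunk grid."""

    chunks = (10, 10)


class WithoutChunks:
    """Stand-in for an array whose grid has no chunk shape."""

    @property
    def chunks(self):
        raise AttributeError("chunk grid does not have a chunk shape")


print(regular_chunks_or_none(WithChunks()))     # (10, 10)
print(regular_chunks_or_none(WithoutChunks()))  # None
```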
@rabernat I think your reasoning makes sense, given that irregular chunks are prospective at this point.
Thinking about future changes to support irregular chunks, do you think it would make sense to change the type of `chunks` to `tuple[tuple[int, ...], ...]` (matching the `chunks` attribute of a dask array), assuming we give a long deprecation warning?
Fair enough, we can revisit this once we have other chunk grids.
How would that scale if you have billions of chunks?
In that case, we would probably need to switch to a generator of tuples of ints. But I think someone with billions of chunks will run into lots of other problems first.
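
One reading of "a generator of tuples of ints" is a lazy iterator that yields the shape of each chunk in row-major order, so that memory use does not grow with the number of chunks. The sketch below assumes a regular grid with edge chunks clipped to the array bounds; `iter_chunk_shapes` is a hypothetical name, not a Zarr API.

```python
import math
from collections.abc import Iterator


def iter_chunk_shapes(
    shape: tuple[int, ...], chunk_shape: tuple[int, ...]
) -> Iterator[tuple[int, ...]]:
    """Lazily yield the shape of every chunk in row-major order."""
    # Number of chunks along each dimension (ceiling division).
    counts = [-(-size // step) for size, step in zip(shape, chunk_shape)]
    for flat in range(math.prod(counts)):
        # Unravel the flat chunk number into one index per dimension.
        rest = flat
        index = []
        for count in reversed(counts):
            rest, i = divmod(rest, count)
            index.append(i)
        index.reverse()
        # Clip each chunk to the array edge.
        yield tuple(
            min(step, size - i * step)
            for i, size, step in zip(index, shape, chunk_shape)
        )


print(list(iter_chunk_shapes((5, 4), (2, 3))))
# [(2, 3), (2, 1), (2, 3), (2, 1), (1, 3), (1, 1)]
```

Because `range` is lazy, asking for the first chunk of a grid with a very large number of chunks does not allocate anything proportional to the chunk count.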