document dtype extension #3157

Merged
merged 53 commits into from
Jul 3, 2025

d-v-b
Contributor

@d-v-b d-v-b commented Jun 19, 2025

This PR adds a working example of custom dtype creation and registration. Because it's a lot of code, I put it in a new top-level directory called examples, which contains the executable Python file dtype_example.py. This file uses PEP 723 metadata to declare an ml_dtypes dependency, and it uses a local zarr install, so it can be tested properly against local changes.
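
For readers unfamiliar with PEP 723, here is a rough sketch of what an inline metadata block declaring an ml_dtypes dependency could look like, together with a regex adapted from the PEP's reference implementation for reading the TOML text back out. The contents of dtype_example.py are not shown in this thread, so the `requires-python` value, the dependency list, and the `[tool.uv.sources]` entry pointing at a local zarr checkout are all assumptions for illustration (the last one is a uv-specific convention, not part of PEP 723 itself).

```python
import re
from typing import Optional

# Hypothetical script header; the real dtype_example.py may differ.
EXAMPLE_SCRIPT = '''\
# /// script
# requires-python = ">=3.11"
# dependencies = [
#   "ml_dtypes",
#   "zarr",
# ]
#
# [tool.uv.sources]
# zarr = { path = "..", editable = true }
# ///

print("custom dtype example goes here")
'''

# Regex adapted from the reference implementation in PEP 723.
PEP723_BLOCK = re.compile(
    r"(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$"
)


def extract_script_metadata(source: str) -> Optional[str]:
    """Return the TOML text of the `script` metadata block, or None."""
    for match in PEP723_BLOCK.finditer(source):
        if match.group("type") == "script":
            # Strip the leading "# " (or bare "#") from every line.
            return "".join(
                line[2:] if line.startswith("# ") else line[1:]
                for line in match.group("content").splitlines(keepends=True)
            )
    return None


metadata_toml = extract_script_metadata(EXAMPLE_SCRIPT)
print(metadata_toml)
```

A PEP 723-aware runner reads this block to build an environment before executing the file, which is what lets the example declare its own dependencies.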

I also expanded the current dtype docs in the user guide to include content about the data type resolution process.

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/user-guide/*.rst
  • Changes documented as a new file in changes/
  • GitHub Actions have all passed
  • Test coverage is 100% (Codecov passes)

@github-actions github-actions bot added the "needs release notes" label (automatically applied to PRs which haven't added release notes) Jun 19, 2025
@d-v-b
Contributor Author

d-v-b commented Jun 19, 2025

cc @nenb @ianhi, since y'all were the most involved in this example over in the main dtypes PR.

@dstansby dstansby added this to the 3.1.0 milestone Jun 20, 2025
Contributor

@ianhi ianhi left a comment

Nice, this is a great improvement. I left some comments and suggested improvements. The other thing I'd wish for is to format the lines to 80 characters long. In general I like a line length of 100, but as it currently stands, when rendered on the docs page you have to scroll horizontally to read the example code.


class Int2(ZDType[int2_dtype_cls, int2_scalar_cls]):
    """
    This class provides a Zarr compatibility layer around the int2 data type and the int2
Contributor

Is there a nice link explaining the difference between these? I think I've inferred it, but it would be nice to make it explicit.

Contributor Author

No, I don't think there is a nice link that explains the data type / scalar type difference. The NumPy docs should explain this, but they don't. I can add something to our docs.

Contributor Author

I put something about this in the data type guide
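
As a rough illustration of the distinction discussed in this thread, here is a sketch that uses NumPy's built-in int8 rather than the int2 type from ml_dtypes (an assumption, made so the snippet depends on NumPy only). The data type object describes how array elements are stored, while the scalar type is the class of the individual values you get back when indexing into an array.

```python
import numpy as np

# The data type: an instance of np.dtype that describes element layout.
dtype = np.dtype("int8")

# The scalar type: the Python class of a single element of that dtype.
scalar_type = dtype.type

arr = np.array([1, -2, 3], dtype=dtype)
element = arr[0]

print(isinstance(dtype, np.dtype))       # the dtype is a descriptor object
print(scalar_type is np.int8)            # the scalar type is a class
print(isinstance(element, scalar_type))  # elements are instances of that class
```

In the snippet under review, int2_dtype_cls and int2_scalar_cls presumably play these two roles for int2.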


def to_json_scalar(self, data: object, *, zarr_format: ZarrFormat) -> int:
    """Convert a python object to a scalar."""
    return int(data)
Contributor

Can this be more specific to the example? E.g., explain something to the effect of "needs to be int to be compatible with JSON" and mention int2 somewhere.

Contributor Author

I added something to this effect
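
To make the reviewer's point concrete, here is a minimal sketch of why the conversion to a plain int matters. `FakeInt2` is a hypothetical stand-in for an int2 scalar (it is not the ml_dtypes type), and `to_json_scalar` is simplified to a free function without the `self` and `zarr_format` parameters shown in the diff. The standard json module only encodes built-in types, so a custom scalar has to be converted first.

```python
import json


class FakeInt2:
    """Hypothetical stand-in for an int2 scalar (valid values: -2..1)."""

    def __init__(self, value: int) -> None:
        if not -2 <= value <= 1:
            raise ValueError(f"{value} is out of range for a 2-bit signed integer")
        self._value = value

    def __int__(self) -> int:
        return self._value


def to_json_scalar(data: object) -> int:
    """Convert a scalar-like object to a plain int so json can encode it."""
    return int(data)


scalar = FakeInt2(-2)

try:
    json.dumps(scalar)  # the json module does not know this type
    raw_serializable = True
except TypeError:
    raw_serializable = False

encoded = json.dumps(to_json_scalar(scalar))
print(raw_serializable, encoded)
```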

@github-actions github-actions bot removed the "needs release notes" label (automatically applied to PRs which haven't added release notes) Jun 23, 2025
@d-v-b d-v-b force-pushed the docs/dtype-docs branch from 89064f0 to 45aab29 Compare June 24, 2025 13:41
@d-v-b
Contributor Author

d-v-b commented Jun 24, 2025

All the Sphinx issues are sorted. The only failing check is coverage, and I'm not sure I believe the claim that these changes reduced test coverage by 20%.

@maxrjones
Member

@d-v-b FYI I think that some of these missing features in 3.0 in the array docs can now instead be linked to the new dtypes docs - https://zarr.readthedocs.io/en/stable/user-guide/arrays.html#missing-features-in-3-0.

@d-v-b
Contributor Author

d-v-b commented Jul 2, 2025

> @d-v-b FYI I think that some of these missing features in 3.0 in the array docs can now instead be linked to the new dtypes docs - https://zarr.readthedocs.io/en/stable/user-guide/arrays.html#missing-features-in-3-0.

I removed the "this stuff is missing" sections, leaving only the part that is still missing (the copying + migrating data stuff)

@d-v-b d-v-b merged commit a5dcf42 into zarr-developers:main Jul 3, 2025
65 of 70 checks passed
@d-v-b d-v-b mentioned this pull request Jul 3, 2025
6 tasks

5 participants