@@ -138,3 +138,53 @@ To help users keep things straight, please `let us know
<https://github.com/pydata/xarray/issues>`_ if you plan to write a new accessor
for an open source library. In the future, we will maintain a list of accessors
and the libraries that implement them on this page.
+
+ .. _zarr_encoding:
+
+ Zarr Encoding Specification
+ ---------------------------
+
+ In implementing support for the `Zarr <https://zarr.readthedocs.io/>`_ storage
+ format, Xarray developers made some *ad hoc* choices about how to store
+ NetCDF data in Zarr.
+ Future versions of the Zarr spec will likely include a more formal convention
+ for the storage of the NetCDF data model in Zarr; see the
+ `Zarr spec repo <https://github.com/zarr-developers/zarr-specs>`_ for ongoing
+ discussion.
+
+ First, Xarray can only read and write Zarr groups. There is currently no support
+ for reading or writing individual Zarr arrays. Zarr groups are mapped to
+ Xarray ``Dataset`` objects.
+
+ Second, from Xarray's point of view, the key difference between
+ NetCDF and Zarr is that all NetCDF arrays have *dimension names* while Zarr
+ arrays do not. Therefore, in order to store NetCDF data in Zarr, Xarray must
+ somehow encode and decode the name of each array's dimensions.
+
+ To accomplish this, Xarray developers decided to define a special Zarr array
+ attribute: ``_ARRAY_DIMENSIONS``. The value of this attribute is a list of
+ dimension names (strings), for example ``["time", "lon", "lat"]``. When writing
+ data to Zarr, Xarray sets this attribute on all variables based on the variable
+ dimensions. When reading a Zarr group, Xarray looks for this attribute on all
+ arrays, raising an error if it can't be found. The attribute is used to define
+ the variable dimension names and then removed from the attributes dictionary
+ returned to the user.
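
The round trip described above can be sketched in plain Python. This is only an illustration under stated assumptions, not Xarray's actual implementation: the helper names ``encode_dims`` and ``decode_dims`` are hypothetical, and ordinary dictionaries stand in for Zarr array attributes.

```python
# Illustrative sketch only: plain dicts stand in for Zarr array attributes,
# and the helper names below are hypothetical, not part of Xarray's API.
DIMENSION_KEY = "_ARRAY_DIMENSIONS"


def encode_dims(attrs, dims):
    """Return a copy of ``attrs`` with the dimension names added for writing."""
    encoded = dict(attrs)
    encoded[DIMENSION_KEY] = list(dims)
    return encoded


def decode_dims(attrs):
    """Split stored attributes into (dimension names, user-facing attributes)."""
    if DIMENSION_KEY not in attrs:
        # Mirrors the documented behaviour: a missing attribute is an error.
        raise KeyError(f"array is missing the {DIMENSION_KEY!r} attribute")
    # The special attribute is hidden from the attributes returned to the user.
    user_attrs = {k: v for k, v in attrs.items() if k != DIMENSION_KEY}
    return tuple(attrs[DIMENSION_KEY]), user_attrs


stored = encode_dims({"units": "K"}, ("time", "lon", "lat"))
dims, attrs = decode_dims(stored)
print(stored)
print(dims, attrs)
```

Here ``stored`` plays the role of what ends up on disk, while ``dims`` and ``attrs`` are what a user would see after reading.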
+
+ Because of these choices, Xarray cannot read arbitrary array data, but only
+ Zarr data with valid ``_ARRAY_DIMENSIONS`` attributes on each array.
+
+ After decoding the ``_ARRAY_DIMENSIONS`` attribute and assigning the variable
+ dimensions, Xarray proceeds to (optionally) decode each variable using the
+ same CF decoding machinery it uses for NetCDF data (see :py:func:`decode_cf`).
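
To give a rough idea of what this CF decoding step does, the sketch below applies ``xr.decode_cf`` to a small in-memory dataset. The variable name and values are made up for illustration; ``scale_factor``, ``add_offset`` and ``_FillValue`` are standard CF attributes, and no Zarr store is involved here.

```python
import numpy as np
import xarray as xr

# Made-up packed data: int16 values plus CF packing attributes.
raw = xr.Dataset(
    {
        "Tair": (
            ("time",),
            np.array([0, 10, -9999], dtype="int16"),
            {"scale_factor": 0.1, "add_offset": 273.15, "_FillValue": -9999},
        )
    }
)

# decode_cf masks the fill value and applies the scale and offset;
# the packing attributes move from ``attrs`` to ``encoding``.
decoded = xr.decode_cf(raw)
print(decoded["Tair"].values)
print(decoded["Tair"].encoding)
```

Passing ``decode_cf=False`` to ``xr.open_zarr`` should skip this step, leaving the packed values and their attributes untouched.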
+
+ As a concrete example, here we write a tutorial dataset to Zarr and then
+ re-open it directly with Zarr:
+
+ .. ipython:: python
+
+     import xarray as xr
+     import zarr
+
+     ds = xr.tutorial.load_dataset('rasm')
+     ds.to_zarr('rasm.zarr', mode='w')
+     zgroup = zarr.open('rasm.zarr')
+     print(zgroup.tree())
+     dict(zgroup['Tair'].attrs)