|
55 | 55 | "cell_type": "markdown",
|
56 | 56 | "metadata": {},
|
57 | 57 | "source": [
|
58 |
| - "- Tree of arbitrary groups\n", |
59 |
| - "- Each holds arbitrary data in the form of arrays + metadata\n", |
60 |
| - "- No relationship enforced between groups\n", |
61 |
| - "- No relationship enforced between arrays within a group\n", |
62 |
| - "- No concept of \"coordinates\" vs \"data\"\n", |
63 |
| - "- No references from one group to another" |
| 58 | + "* **Tree of groups** – Tree of arbitrary groups.\n", |
| 59 | + "\n", |
| 60 | + "* **Separate groups** – No relationship enforced between groups, and no references from one group to another.\n", |
| 61 | + "\n", |
| 62 | + "* **Separate arrays** – No relationship enforced between arrays within a group.\n", |
| 63 | + "\n", |
| 64 | + "* **Arbitrary JSON metadata** – Each group holds arbitrary data in the form of arrays + metadata." |
64 | 65 | ]
|
65 | 66 | },
|
66 | 67 | {
|
|
79 | 80 | "cell_type": "markdown",
|
80 | 81 | "metadata": {},
|
81 | 82 | "source": [
|
82 |
| - "How does zarr relate to `xarray`?\n", |
| 83 | + "### How does zarr relate to `xarray`?\n", |
| 84 | + "\n", |
| 85 | + "* **Arrays <-> `Variables`** - zarr arrays map well to `xarray.Variables`\n", |
| 86 | + " - Especially as zarr v3 includes (optional) `dimension_names`\n", |
| 87 | + "\n", |
| 88 | + "* **Groups <-> `Datasets`** - zarr groups map reasonably well to `xarray.Dataset` objects\n", |
| 89 | + " - Open a single zarr group in xarray via `xr.open_dataset(store, group='/path', engine='zarr')`\n", |
83 | 90 | "\n",
|
84 |
| - "- zarr arrays map well to `xarray.Variables`\n", |
85 |
| - " - especially because zarr v3 includes (optional) `dimension_names`\n", |
86 |
| - "- zarr groups map reasonably well to `xarray.Dataset` objects\n", |
87 |
| - " - `xr.open_dataset(store, group='/path', engine='zarr')`\n", |
88 |
| - " - but `xarray.Dataset`s require that all arrays in the Dataset have aligned dimensions\n", |
89 |
| - " - so it is possible to create a zarr group that is not a valid `xarray.Dataset`, if the group contains arrays with non-aligning dimensions\n", |
90 |
| - " - Also zarr has no concept of \"coordinate\" vs \"data\" variables\n", |
91 |
| - " - so xarray has to save this piece of information as an additional piece of metadata \n", |
92 |
| - "- zarr store has a tree of groups\n", |
| 91 | + "* **Groups must be alignable** - But `xarray.Dataset`s require that all arrays in the Dataset have aligned dimensions (arrays sharing a dimension name must agree on its length)\n", |
| 92 | + " - so it is possible to create a zarr group that is not a valid `xarray.Dataset`, if the group contains arrays with non-aligning dimensions\n", |
| 93 | + "\n", |
| 94 | + "* **No \"coordinates\"** – No arrays are special, so Zarr has no intrinsic concept of \"coordinate\" vs \"data\" variables.\n", |
| 95 | + " - So xarray has to save this piece of information as an additional piece of zarr metadata.\n", |
| 96 | + "\n", |
| 97 | + "* **Tree of groups <-> `DataTree`** - zarr store has a tree of groups\n", |
93 | 98 | " - maps to either a set of independent `xarray.Datasets`\n",
|
94 | 99 | " - `xr.open_groups(store)`\n",
|
95 | 100 | " - or to a single `xarray.DataTree`\n",
|
|
192 | 197 | "cell_type": "markdown",
|
193 | 198 | "metadata": {},
|
194 | 199 | "source": [
|
195 |
| - "TIFF (Tag Image File Format) is a *flexible* raster container widely used in biosciences, remote sensing and GIS. \n", |
| 200 | + "TIFF (Tag Image File Format) is a raster container widely used in biosciences, remote sensing and GIS. \n", |
196 | 201 | "\n",
|
197 | 202 | "A **GeoTIFF** is simply a TIFF that stores additional georeferencing tags (CRS, affine transform, etc.) so geospatial software knows where each pixel sits on Earth. \n",
|
198 | 203 | "\n",
|
|
206 | 211 | "\n",
|
207 | 212 | "* **Compression / tiling** – DEFLATE, LZW, etc. Tiling lets software fetch small windows efficiently.\n",
|
208 | 213 | "\n",
|
209 |
| - "### Practical notes for xarray users\n", |
210 |
| - "\n", |
211 |
| - "* **Read** – use `rioxarray.open_rasterio()` (wraps rasterio) to get an immediate, Dask-chunked DataArray.\n", |
212 |
| - "\n", |
213 |
| - "* **Write** – `DataArray.rio.to_raster(\"out.tif\")`; choose compression + tiling via driver_kwargs.\n", |
214 |
| - "\n", |
215 |
| - "* **Dimensionality** – TIFF is inherently 2-D per band; no native time or vertical axis. If you need 4-D data, NetCDF or Zarr is usually a better fit.\n", |
216 |
| - "\n", |
217 |
| - "* **Metadata depth** – single-level tags only (no nested groups). For rich hierarchies, stick to HDF5 / NetCDF-4.\n", |
218 |
| - "\n", |
219 |
| - "* **Cloud-optimized GeoTIFF (COG)** – same format, arranged so HTTP range requests can stream windows efficiently; xarray handles it transparently when rasterio is compiled with libcurl.\n" |
| 214 | + "* **Cloud-optimized GeoTIFF (COG)** – same format, arranged so HTTP range requests can stream windows efficiently; xarray handles it transparently when the GDAL build underlying rasterio includes libcurl support." |
220 | 215 | ]
|
221 | 216 | },
|
222 | 217 | {
|
223 | 218 | "cell_type": "markdown",
|
224 | 219 | "metadata": {},
|
225 | 220 | "source": [
|
226 | 221 | "### How does TIFF relate to xarray?\n",
|
227 |
| - "\n" |
| 222 | + "\n", |
| 223 | + "* **Dimensionality** – Each raster image maps well to a single `xarray.Variable`, but TIFF is inherently 2-D per band; no native time or vertical axis. If you need 4-D data, NetCDF or Zarr is usually a better fit.\n", |
| 224 | + "\n", |
| 225 | + "* **No named dimensions** - TIFFs don't have named dimensions for the two axes of the raster.\n", |
| 226 | + "\n", |
| 227 | + "* **IFDs as groups** - IFDs can be mapped to groups, which may be useful for multi-resolution TIFFs (also known as \"overviews\") and multi-page TIFFs.\n", |
| 228 | + "\n", |
| 229 | + "* **Metadata depth** – single-level tags only (no nested groups). For rich hierarchies, stick to HDF5 / NetCDF-4.\n", |
| 230 | + "\n", |
| 231 | + "* **Read** – use `rioxarray.open_rasterio()` (wraps rasterio) to get a lazily loaded DataArray (Dask-chunked if you pass `chunks`). However, `rioxarray` is designed for GeoTIFFs, not general TIFFs.\n", |
| 232 | + "\n", |
| 233 | + "* **Write** – `DataArray.rio.to_raster(\"out.tif\")`; choose compression + tiling via extra keyword arguments (e.g. `compress='deflate'`, `tiled=True`)." |
228 | 234 | ]
|
229 | 235 | },
|
230 | 236 | {
|
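The "Groups must be alignable" bullet in this diff says a zarr group can hold arrays that do not form a valid `xarray.Dataset`. A minimal sketch of that check follows, using plain Python rather than zarr or xarray. The dict layout (array name mapped to `dimension_names` and shape) and the helper `find_dimension_conflicts` are hypothetical, made up for illustration; they are not part of either library.

```python
from typing import Dict, Tuple

# Hypothetical stand-in for a zarr group: array name -> (dimension_names, shape).
ArraySpec = Tuple[Tuple[str, ...], Tuple[int, ...]]


def find_dimension_conflicts(group: Dict[str, ArraySpec]) -> Dict[str, Dict[str, int]]:
    """Return {dim_name: {array_name: length}} for dimensions whose length differs between arrays."""
    sizes: Dict[str, Dict[str, int]] = {}
    for name, (dims, shape) in group.items():
        if len(dims) != len(shape):
            raise ValueError(f"{name}: {len(dims)} dimension names for {len(shape)} axes")
        for dim, length in zip(dims, shape):
            # Record the length each array reports for this dimension name.
            sizes.setdefault(dim, {})[name] = length
    # A dimension conflicts when the arrays using it disagree on its length.
    return {
        dim: by_array
        for dim, by_array in sizes.items()
        if len(set(by_array.values())) > 1
    }


# Every array agrees on the length of each shared dimension: Dataset-compatible.
aligned_group = {
    "temperature": (("time", "lat", "lon"), (10, 4, 8)),
    "pressure": (("time", "lat", "lon"), (10, 4, 8)),
    "lat": (("lat",), (4,)),
}

# "lat" has length 4 in one array and 5 in another: a legal zarr group,
# but not representable as a single Dataset.
misaligned_group = {
    "temperature": (("time", "lat"), (10, 4)),
    "elevation": (("lat",), (5,)),
}

print(find_dimension_conflicts(aligned_group))
print(find_dimension_conflicts(misaligned_group))
```

An empty result means the group's arrays could share one set of dimensions; a non-empty result names the dimensions that would prevent opening the group as one Dataset.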
|