Description
xr.Dataset.to_netcdf
does not seem to support writing data with unsigned integer dtypes, uint32
, uint64
etc.. This seems to be the case for both scipy-based output formats, NETCDF3_CLASSIC
and NETCDF3_64BIT
.
It seems like dtype coercions for int64
and bool
are done automatically for NetCDF3 in the xarray.netcdf3
module.
Shouldn't data of signed dtypes then also be coerced, i.e. to their signed equivalent?
MCVE Code Sample
import numpy as np
import xarray as xr
da = xr.DataArray(np.array([1,2,3], dtype='uint64'))
# The following all fail:
da.to_netcdf("foo") # default format: scipy NETCDF3_CLASSIC
da.to_netcdf("bar", format='NETCDF3_64BIT')
da.astype('uint32').to_netcdf("baz")
da.astype('uint16').to_netcdf("spam")
# This works:
da.astype('int64').to_netcdf("working64") # is coerced
da.astype('int32').to_netcdf("working32") # works natively
Importantly, this is with the netcdf4 python package not being installed, in which case that package would be used for writing rather than scipy's netcdf.
Expected Output
NetCDF3 file is written with an appropriately coerced data format, e.g. as done with int64
.
Alternatively, writing data fails for all dtypes that would natively be unsupported, including int64
and bool
.
Problem Description
Given that the infrastructure for coercion is already in place, it seems more consistent to me to apply coercion to all cases where it would lead to to_netcdf
method calls succeeding rather than failing, not only to int64
and bool
.
Ideally, coercion would happen towards another unsigned integer type.
However, writing uint32
seems not to be possible, so it's not a 64bit/32bit issue.
While the NetCDF Format Specification declares as only unsigned integer type NON_NEG
, which I presume to be equivalent to uint32
, writing unsigned integers seems not possible via scipy's NetCDF3 writer. Thus, the only viable coercion, as far as I see, would be to signed equivalents.
I see no new cast safety implications here, because the existing coerce_nc3_dtype
function already checks if the original and cast arrays compare equivalently.
Versions
Output of xr.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.7.6 (default, Dec 30 2019, 19:38:26)
[Clang 11.0.0 (clang-1100.0.33.16)]
python-bits: 64
OS: Darwin
OS-release: 19.3.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: None
xarray: 0.15.1
pandas: 1.0.1
numpy: 1.18.3
scipy: 1.4.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: None
cftime: 1.0.4.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.10.1
distributed: 2.10.0
matplotlib: 3.1.3
cartopy: None
seaborn: None
numbagg: None
setuptools: 45.1.0
pip: 20.0.2
conda: None
pytest: 5.3.5
IPython: 7.12.0
sphinx: 2.4.1