Description
What happened:
Coordinates added to some variables unexpectedly.
I noticed this after outputting to netCDF. What I have:
variables:
double time(time) ;
time:bounds = "time_bnds" ;
time:axis = "T" ;
time:long_name = "valid_time" ;
time:standard_name = "time" ;
time:units = "days since 1850-01-01" ;
time:calendar = "gregorian" ;
double time_bnds(time, bnds) ;
time_bnds:_FillValue = NaN ;
time_bnds:coordinates = "reftime leadtime height" ;
double lat(lat) ;
lat:bounds = "lat_bnds" ;
lat:units = "degrees_north" ;
lat:axis = "Y" ;
lat:long_name = "latitude" ;
lat:standard_name = "latitude" ;
double lat_bnds(lat, bnds) ;
lat_bnds:_FillValue = NaN ;
lat_bnds:coordinates = "reftime height" ;
double lon(lon) ;
lon:bounds = "lon_bnds" ;
lon:units = "degrees_east" ;
lon:axis = "X" ;
lon:long_name = "Longitude" ;
lon:standard_name = "longitude" ;
double lon_bnds(lon, bnds) ;
lon_bnds:_FillValue = NaN ;
lon_bnds:coordinates = "reftime height" ;
double height ;
height:_FillValue = NaN ;
height:units = "m" ;
height:axis = "Z" ;
height:positive = "up" ;
height:long_name = "height" ;
height:standard_name = "height" ;
float tas(time, lat, lon) ;
tas:_FillValue = 1.e+20f ;
tas:standard_name = "air_temperature" ;
tas:long_name = "Near-Surface Air Temperature" ;
tas:comment = "near-surface (usually, 2 meter) air temperature" ;
tas:units = "K" ;
tas:cell_methods = "area: time: mean" ;
tas:cell_measures = "area: areacella" ;
tas:history = "2019-05-11T15:53:32Z altered by CMOR: Treated scalar dimension: \'height\'. 2019-05-11T15:53:32Z altered by CMOR: Reordered dimensions, original order: lat lon time." ;
tas:coordinates = "height reftime leadtime" ;
tas:missing_value = 1.e+20f ;
int reftime ;
reftime:long_name = "Start date of the forecast" ;
reftime:standard_name = "forecast_reference_time" ;
reftime:units = "days since 1850-01-01" ;
reftime:calendar = "gregorian" ;
double leadtime(time) ;
leadtime:_FillValue = NaN ;
leadtime:long_name = "Time elapsed since the start of the forecast" ;
leadtime:standard_name = "forecast_period" ;
leadtime:units = "days" ;
int realization ;
realization:long_name = "realization" ;
realization:comment = "For more information on the ripf, refer to the variant_label, initialization_description, physics_description and forcing_description global attributes" ;
realization:coordinates = "reftime height" ;
On time_bnds
, lon_bnds
, lat_bnds
and realization
there is coordinates
that I wouldn't expect to be there.
What you expected to happen:
Looking only at the coordinates attribute, I expected my ncdump to show:
variables:
int reftime ;
reftime:long_name = "Start date of the forecast" ;
reftime:standard_name = "forecast_reference_time" ;
reftime:units = "days since 1850-01-01" ;
reftime:calendar = "gregorian" ;
double leadtime(time) ;
leadtime:long_name = "Time elapsed since the start of the forecast" ;
leadtime:standard_name = "forecast_period" ;
leadtime:units = "days" ;
int realization ;
realization:long_name = "realization" ;
realization:comment = "For more information on the ripf, refer to the variant_label, initialization_description, physics_description and forcing_description global attributes" ;
double time(time) ;
time:bounds = "time_bnds" ;
time:axis = "T" ;
time:standard_name = "time" ;
time:units = "days since 1850-01-01" ;
time:calendar = "gregorian" ;
time:long_name = "valid_time" ;
double time_bnds(time, bnds) ;
time_bnds:units = "days since 1850-01-01" ;
double lat(lat) ;
lat:bounds = "lat_bnds" ;
lat:units = "degrees_north" ;
lat:axis = "Y" ;
lat:long_name = "latitude" ;
lat:standard_name = "latitude" ;
double lat_bnds(lat, bnds) ;
double lon(lon) ;
lon:bounds = "lon_bnds" ;
lon:units = "degrees_east" ;
lon:axis = "X" ;
lon:long_name = "Longitude" ;
lon:standard_name = "longitude" ;
double lon_bnds(lon, bnds) ;
double height ;
height:units = "m" ;
height:axis = "Z" ;
height:positive = "up" ;
height:long_name = "height" ;
height:standard_name = "height" ;
float tas(time, lat, lon) ;
tas:standard_name = "air_temperature" ;
tas:long_name = "Near-Surface Air Temperature" ;
tas:comment = "near-surface (usually, 2 meter) air temperature" ;
tas:units = "K" ;
tas:cell_methods = "area: time: mean" ;
tas:cell_measures = "area: areacella" ;
tas:history = "2019-05-11T15:53:32Z altered by CMOR: Treated scalar dimension: \'height\'. 2019-05-11T15:53:32Z altered by CMOR: Reordered dimensions, original order: lat lon time." ;
tas:missing_value = 1.e+20f ;
tas:_FillValue = 1.e+20f ;
tas:coordinates = "height reftime leadtime" ;
I tried to remove this in the xarray dataset, but whatever I tried they always ended up back in there:
>>> import xarray as xr
>>> ds = xr.open_dataset("file.nc", use_cftime=True)
# show coords on realization
>>> ds.realization
<xarray.DataArray 'realization' ()>
array(1, dtype=int32)
Coordinates:
height float64 ...
reftime object ...
Attributes:
long_name: realization
comment: For more information on the ripf, refer to the variant_label,...
# try reset_coords - removes the coords
>>> ds.realization.reset_coords(names=["height", "reftime"], drop=True)
<xarray.DataArray 'realization' ()>
array(1, dtype=int32)
Attributes:
long_name: realization
comment: For more information on the ripf, refer to the variant_label,...
# set realization with result of reset_coords
>>> ds["realization"] = ds.realization.reset_coords(names=["height", "reftime"], drop=True)
# coords back in
>>> ds.realization
<xarray.DataArray 'realization' ()>
array(1, dtype=int32)
Coordinates:
height float64 ...
reftime object ...
Attributes:
long_name: realization
comment: For more information on the ripf, refer to the variant_label,...
# try drop_vars - same thing happens
>>> ds.realization.drop_vars(("height", "reftime"))
<xarray.DataArray 'realization' ()>
array(1, dtype=int32)
Attributes:
long_name: realization
comment: For more information on the ripf, refer to the variant_label,...
>>> ds["realization"] = ds.realization.drop_vars(("height", "reftime"))
>>> ds.realization
<xarray.DataArray 'realization' ()>
array(1, dtype=int32)
Coordinates:
height float64 ...
reftime object ...
Attributes:
long_name: realization
comment: For more information on the ripf, refer to the variant_label,...
# tried creating a new variable to see if the same thing happens - it does
>>> ds["test"] = ds.realization.drop_vars(("height", "reftime"))
>>> ds.test
<xarray.DataArray 'test' ()>
array(1, dtype=int32)
Coordinates:
height float64 ...
reftime object ...
Attributes:
long_name: realization
comment: For more information on the ripf, refer to the variant_label,...
This seems like incorrect behaviour, but perhaps it is expected?
Minimal Complete Verifiable Example:
>>> data = xr.DataArray(np.random.randn(2, 3), dims=("x", "y"), coords={"x": [10, 20]})
>>> ds = xr.Dataset({"foo": data, "bar": ("x", [1, 2]), "fake": 10})
>>> ds = ds.assign_coords({"reftime":np.array("2004-11-01T00:00:00", dtype=np.datetime64)})
>>> ds = ds.assign({"test": 1})
>>> ds.test
<xarray.DataArray 'test' ()>
array(1)
Coordinates:
reftime datetime64[ns] 2004-11-01
>>> ds.test.reset_coords(names=["reftime"], drop=True)
<xarray.DataArray 'test' ()>
array(1)
>>> ds["test"] = ds.test.reset_coords(names=["reftime"], drop=True)
>>> ds.test
<xarray.DataArray 'test' ()>
array(1)
Coordinates:
reftime datetime64[ns] 2004-11-01
ds.to_netcdf("file.nc")
ncdump -h file.nc
netcdf file {
dimensions:
x = 2 ;
y = 3 ;
variables:
int64 x(x) ;
double foo(x, y) ;
foo:_FillValue = NaN ;
foo:coordinates = "reftime" ;
int64 bar(x) ;
bar:coordinates = "reftime" ;
int64 fake ;
fake:coordinates = "reftime" ;
int64 reftime ;
reftime:units = "days since 2004-11-01 00:00:00" ;
reftime:calendar = "proleptic_gregorian" ;
int64 test ;
test:coordinates = "reftime" ;
}
Environment:
Output of xr.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.7.3 (default, Mar 27 2019, 16:54:48)
[Clang 4.0.1 (tags/RELEASE_401/final)]
python-bits: 64
OS: Darwin
OS-release: 18.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: ('en_GB', 'UTF-8')
libhdf5: 1.10.5
libnetcdf: 4.6.3
xarray: 0.18.2
pandas: 1.1.3
numpy: 1.19.2
scipy: None
netCDF4: 1.5.4
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.4.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2.30.0
distributed: 2.30.0
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 54.1.1
pip: 21.0.1
conda: None
pytest: 6.2.2
IPython: 7.21.0
sphinx: 1.8.1