Can't remove coordinates attribute from DataArrays  #5510

@ellesmith88

What happened:
A coordinates attribute is unexpectedly added to some variables.

I noticed this after writing the dataset to netCDF. The ncdump header currently shows:

variables:
        double time(time) ;
                time:bounds = "time_bnds" ;
                time:axis = "T" ;
                time:long_name = "valid_time" ;
                time:standard_name = "time" ;
                time:units = "days since 1850-01-01" ;
                time:calendar = "gregorian" ;
        double time_bnds(time, bnds) ;
                time_bnds:_FillValue = NaN ;
                time_bnds:coordinates = "reftime leadtime height" ;
        double lat(lat) ;
                lat:bounds = "lat_bnds" ;
                lat:units = "degrees_north" ;
                lat:axis = "Y" ;
                lat:long_name = "latitude" ;
                lat:standard_name = "latitude" ;
        double lat_bnds(lat, bnds) ;
                lat_bnds:_FillValue = NaN ;
                lat_bnds:coordinates = "reftime height" ;
        double lon(lon) ;
                lon:bounds = "lon_bnds" ;
                lon:units = "degrees_east" ;
                lon:axis = "X" ;
                lon:long_name = "Longitude" ;
                lon:standard_name = "longitude" ;
        double lon_bnds(lon, bnds) ;
                lon_bnds:_FillValue = NaN ;
                lon_bnds:coordinates = "reftime height" ;
        double height ;
                height:_FillValue = NaN ;
                height:units = "m" ;
                height:axis = "Z" ;
                height:positive = "up" ;
                height:long_name = "height" ;
                height:standard_name = "height" ;
        float tas(time, lat, lon) ;
                tas:_FillValue = 1.e+20f ;
                tas:standard_name = "air_temperature" ;
                tas:long_name = "Near-Surface Air Temperature" ;
                tas:comment = "near-surface (usually, 2 meter) air temperature" ;
                tas:units = "K" ;
                tas:cell_methods = "area: time: mean" ;
                tas:cell_measures = "area: areacella" ;
                tas:history = "2019-05-11T15:53:32Z altered by CMOR: Treated scalar dimension: \'height\'. 2019-05-11T15:53:32Z altered by CMOR: Reordered dimensions, original order: lat lon time." ;
                tas:coordinates = "height reftime leadtime" ;
                tas:missing_value = 1.e+20f ;
        int reftime ;
                reftime:long_name = "Start date of the forecast" ;
                reftime:standard_name = "forecast_reference_time" ;
                reftime:units = "days since 1850-01-01" ;
                reftime:calendar = "gregorian" ;
        double leadtime(time) ;
                leadtime:_FillValue = NaN ;
                leadtime:long_name = "Time elapsed since the start of the forecast" ;
                leadtime:standard_name = "forecast_period" ;
                leadtime:units = "days" ;
        int realization ;
                realization:long_name = "realization" ;
                realization:comment = "For more information on the ripf, refer to the variant_label, initialization_description, physics_description and forcing_description global attributes" ;
                realization:coordinates = "reftime height" ;

time_bnds, lon_bnds, lat_bnds and realization each carry a coordinates attribute that I would not expect to be there.
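The pattern in the dump (time_bnds lists leadtime, while lat_bnds and realization do not) looks consistent with xarray attaching every Dataset coordinate whose dimensions are a subset of a variable's dimensions. The sketch below is only an illustration of that reading, not a confirmed diagnosis: the dataset is a made-up stand-in for the real file, and non_index_coords is a hypothetical helper.

```python
import numpy as np
import xarray as xr

# Made-up stand-in for the real file: two bounds variables, one scalar
# variable, two scalar coordinates and one coordinate along "time".
ds = xr.Dataset(
    data_vars={
        "time_bnds": (("time", "bnds"), np.zeros((3, 2))),
        "lat_bnds": (("lat", "bnds"), np.zeros((4, 2))),
        "realization": 1,
    },
    coords={
        "time": [0.0, 1.0, 2.0],
        "lat": [10.0, 20.0, 30.0, 40.0],
        "reftime": 0,                            # scalar coordinate
        "height": 2.0,                           # scalar coordinate
        "leadtime": ("time", [0.0, 1.0, 2.0]),   # non-index coordinate on time
    },
)


def non_index_coords(name):
    """Hypothetical helper: coordinates on ds[name] that are not its own dimensions."""
    da = ds[name]
    return sorted(c for c in da.coords if c not in da.dims)


attached = {name: non_index_coords(name) for name in ds.data_vars}
print(attached)
```

In this sketch the scalar coordinates show up on every variable, and leadtime shows up only on the variable that has a time dimension, which mirrors the coordinates attributes in the ncdump output above.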

What you expected to happen:
Looking only at the coordinates attribute, I expected my ncdump to show:

variables:
        int reftime ;
                reftime:long_name = "Start date of the forecast" ;
                reftime:standard_name = "forecast_reference_time" ;
                reftime:units = "days since 1850-01-01" ;
                reftime:calendar = "gregorian" ;
        double leadtime(time) ;
                leadtime:long_name = "Time elapsed since the start of the forecast" ;
                leadtime:standard_name = "forecast_period" ;
                leadtime:units = "days" ;
        int realization ;
                realization:long_name = "realization" ;
                realization:comment = "For more information on the ripf, refer to the variant_label, initialization_description, physics_description and forcing_description global attributes" ;
        double time(time) ;
                time:bounds = "time_bnds" ;
                time:axis = "T" ;
                time:standard_name = "time" ;
                time:units = "days since 1850-01-01" ;
                time:calendar = "gregorian" ;
                time:long_name = "valid_time" ;
        double time_bnds(time, bnds) ;
                time_bnds:units = "days since 1850-01-01" ;
        double lat(lat) ;
                lat:bounds = "lat_bnds" ;
                lat:units = "degrees_north" ;
                lat:axis = "Y" ;
                lat:long_name = "latitude" ;
                lat:standard_name = "latitude" ;
        double lat_bnds(lat, bnds) ;
        double lon(lon) ;
                lon:bounds = "lon_bnds" ;
                lon:units = "degrees_east" ;
                lon:axis = "X" ;
                lon:long_name = "Longitude" ;
                lon:standard_name = "longitude" ;
        double lon_bnds(lon, bnds) ;
        double height ;
                height:units = "m" ;
                height:axis = "Z" ;
                height:positive = "up" ;
                height:long_name = "height" ;
                height:standard_name = "height" ;
        float tas(time, lat, lon) ;
                tas:standard_name = "air_temperature" ;
                tas:long_name = "Near-Surface Air Temperature" ;
                tas:comment = "near-surface (usually, 2 meter) air temperature" ;
                tas:units = "K" ;
                tas:cell_methods = "area: time: mean" ;
                tas:cell_measures = "area: areacella" ;
                tas:history = "2019-05-11T15:53:32Z altered by CMOR: Treated scalar dimension: \'height\'. 2019-05-11T15:53:32Z altered by CMOR: Reordered dimensions, original order: lat lon time." ;
                tas:missing_value = 1.e+20f ;
                tas:_FillValue = 1.e+20f ;
                tas:coordinates = "height reftime leadtime" ;

I tried to remove these coordinates from the xarray dataset, but whatever I tried, they always came back:

>>> import xarray as xr
>>> ds = xr.open_dataset("file.nc", use_cftime=True)

# show coords on realization
>>> ds.realization
<xarray.DataArray 'realization' ()>
array(1, dtype=int32)
Coordinates:
    height   float64 ...
    reftime  object ...
Attributes:
    long_name:  realization
    comment:    For more information on the ripf, refer to the variant_label,...

# try reset_coords - removes the coords
>>> ds.realization.reset_coords(names=["height", "reftime"], drop=True)
<xarray.DataArray 'realization' ()>
array(1, dtype=int32)
Attributes:
    long_name:  realization
    comment:    For more information on the ripf, refer to the variant_label,...

# set realization with result of reset_coords
>>> ds["realization"] = ds.realization.reset_coords(names=["height", "reftime"], drop=True)

# coords back in
>>> ds.realization
<xarray.DataArray 'realization' ()>
array(1, dtype=int32)
Coordinates:
    height   float64 ...
    reftime  object ...
Attributes:
    long_name:  realization
    comment:    For more information on the ripf, refer to the variant_label,...

# try drop_vars - same thing happens
>>> ds.realization.drop_vars(("height", "reftime"))
<xarray.DataArray 'realization' ()>
array(1, dtype=int32)
Attributes:
    long_name:  realization
    comment:    For more information on the ripf, refer to the variant_label,...
    
>>> ds["realization"] = ds.realization.drop_vars(("height", "reftime"))

>>> ds.realization
<xarray.DataArray 'realization' ()>
array(1, dtype=int32)
Coordinates:
    height   float64 ...
    reftime  object ...
Attributes:
    long_name:  realization
    comment:    For more information on the ripf, refer to the variant_label,...
    
# tried creating a new variable to see if the same thing happens - it does
>>> ds["test"] = ds.realization.drop_vars(("height", "reftime"))
>>> ds.test
<xarray.DataArray 'test' ()>
array(1, dtype=int32)
Coordinates:
    height   float64 ...
    reftime  object ...
Attributes:
    long_name:  realization
    comment:    For more information on the ripf, refer to the variant_label,...

This seems like incorrect behaviour, but perhaps it is expected?

Minimal Complete Verifiable Example:

>>> import numpy as np
>>> import xarray as xr
>>> data = xr.DataArray(np.random.randn(2, 3), dims=("x", "y"), coords={"x": [10, 20]})
>>> ds = xr.Dataset({"foo": data, "bar": ("x", [1, 2]), "fake": 10})
>>> ds = ds.assign_coords({"reftime": np.array("2004-11-01T00:00:00", dtype=np.datetime64)})
>>> ds = ds.assign({"test": 1})

>>> ds.test
<xarray.DataArray 'test' ()>
array(1)
Coordinates:
    reftime  datetime64[ns] 2004-11-01
    
>>> ds.test.reset_coords(names=["reftime"], drop=True)
<xarray.DataArray 'test' ()>
array(1)

>>> ds["test"] = ds.test.reset_coords(names=["reftime"], drop=True)
>>> ds.test
<xarray.DataArray 'test' ()>
array(1)
Coordinates:
    reftime  datetime64[ns] 2004-11-01
    
>>> ds.to_netcdf("file.nc")

$ ncdump -h file.nc
netcdf file {
dimensions:
	x = 2 ;
	y = 3 ;
variables:
	int64 x(x) ;
	double foo(x, y) ;
		foo:_FillValue = NaN ;
		foo:coordinates = "reftime" ;
	int64 bar(x) ;
		bar:coordinates = "reftime" ;
	int64 fake ;
		fake:coordinates = "reftime" ;
	int64 reftime ;
		reftime:units = "days since 2004-11-01 00:00:00" ;
		reftime:calendar = "proleptic_gregorian" ;
	int64 test ;
		test:coordinates = "reftime" ;
}
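For comparison, here is a minimal sketch contrasting removal at the DataArray level with removal at the Dataset level, using the same toy data as the example above. It assumes the coordinate is stored on the Dataset itself, so a DataArray pulled out of it is rebuilt with that coordinate on each access. The names still_on_dataset, reattached, demoted and views_after_demotion are illustrative only. Note that Dataset.reset_coords demotes reftime for every variable, so it is not a fix for keeping the coordinate on some variables but not others.

```python
import numpy as np
import xarray as xr

data = xr.DataArray(np.random.randn(2, 3), dims=("x", "y"), coords={"x": [10, 20]})
ds = xr.Dataset({"foo": data, "bar": ("x", [1, 2]), "fake": 10, "test": 1})
ds = ds.assign_coords(reftime=np.datetime64("2004-11-01T00:00:00", "ns"))

# DataArray-level removal: the coordinate is dropped from the extracted
# array, but it is still stored on the Dataset after reassignment.
ds["test"] = ds["test"].reset_coords(names=["reftime"], drop=True)
still_on_dataset = "reftime" in ds.coords
reattached = "reftime" in ds["test"].coords

# Dataset-level demotion: reftime becomes an ordinary data variable,
# so no variable view carries it as a coordinate any more.
demoted = ds.reset_coords(names=["reftime"])
views_after_demotion = {
    name: "reftime" in demoted[name].coords
    for name in ("foo", "bar", "fake", "test")
}

print(still_on_dataset, reattached, views_after_demotion)
```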

Environment:

Output of xr.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.7.3 (default, Mar 27 2019, 16:54:48)
[Clang 4.0.1 (tags/RELEASE_401/final)]
python-bits: 64
OS: Darwin
OS-release: 18.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: ('en_GB', 'UTF-8')
libhdf5: 1.10.5
libnetcdf: 4.6.3

xarray: 0.18.2
pandas: 1.1.3
numpy: 1.19.2
scipy: None
netCDF4: 1.5.4
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.4.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2.30.0
distributed: 2.30.0
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 54.1.1
pip: 21.0.1
conda: None
pytest: 6.2.2
IPython: 7.21.0
sphinx: 1.8.1
