Description
When using Dataset.plot.scatter with hue set to a variable of string dtype, the legend labels come out wrong.
MCVE Code Sample
import xarray as xr
import numpy as np
dd = xr.Dataset({'y': (['x'], np.arange(8)),
                 'label': (['x'], list('AABBCCDD'))},
                coords={'x': np.linspace(0, 1, 8)})
dd.plot.scatter(x='x', y='y', hue='label')
After some experimenting, it seems that the legend always takes the first 4 values of the hue variable as its labels (note that the colors of the points themselves are correct):
import xarray as xr
import numpy as np
dd = xr.Dataset({'y': (['x'], np.arange(8)),
                 'label': (['x'], list('ABBACDDC'))},
                coords={'x': np.linspace(0, 1, 8)})
dd.plot.scatter(x='x', y='y', hue='label')
And if there are only 3 distinct labels in total, it takes the first 3 values:
import xarray as xr
import numpy as np
dd = xr.Dataset({'y': (['x'], np.arange(6)),
                 'label': (['x'], list('ABBACC'))},
                coords={'x': np.linspace(0, 1, 6)})
dd.plot.scatter(x='x', y='y', hue='label')
Expected Output
The legend in the first two plots should read 'ABCD', and the legend in the last plot should read 'ABC'.
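To make the mismatch concrete, here is a small sketch that compares the labels the legend appears to show with the labels it should show, for the three label sequences used above. It only mimics the symptom as described in this report (taking the first n values, where n is the number of distinct labels); it is not xarray's actual implementation, and the helper names observed_legend and expected_legend are hypothetical.

```python
def observed_legend(labels):
    """Mimic the reported symptom: the first n values of the hue variable,
    where n is the number of distinct labels (an assumption based on the
    behaviour described in this report, not on xarray's source)."""
    n_unique = len(set(labels))
    return "".join(labels[:n_unique])


def expected_legend(labels):
    """The labels the legend should show: each distinct value once, sorted."""
    return "".join(sorted(set(labels)))


for labels in ("AABBCCDD", "ABBACDDC", "ABBACC"):
    print(f"{labels}: observed {observed_legend(labels)!r}, "
          f"expected {expected_legend(labels)!r}")
```

If the symptom really is "first n values", the observed labels only match the expected ones when the first n values of the hue variable happen to be distinct and already sorted.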
Versions
Output of xr.show_versions()
INSTALLED VERSIONS
commit: None
libhdf5: 1.10.4
libnetcdf: None
xarray: 0.15.1
pandas: 0.25.1
numpy: 1.17.2
scipy: 1.3.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: 2.9.0
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 2.5.2
distributed: 2.5.2
matplotlib: 3.1.1
cartopy: None
seaborn: 0.9.0
numbagg: None
setuptools: 41.4.0
pip: 19.2.3
conda: 4.8.1
pytest: 5.2.1
IPython: 7.8.0
sphinx: 2.2.0