Allow .groupby().map() to return scalars? #9544

Closed
@max-sixty

Is your feature request related to a problem?

I'm trying to get a count of unique values along a dimension. It's not so easy, unless I'm missing something.

One approach is:

import numpy as np
import xarray as xr

da = xr.tutorial.load_dataset('air_temperature').to_dataarray()

xr.apply_ufunc(lambda x: len(np.unique(x)), da, input_core_dims=[["lat","lon"]], vectorize=True)  # NB: requires vectorize to work!

but apply_ufunc is generally too complex for normal users to use, I think.

Another approach could be

da.groupby('time').map(lambda x: len(np.unique(x)))

But this raises:

AttributeError: 'int' object has no attribute 'dims'

Instead, surrounding the expression with DataArray makes it work:

da.groupby('time').map(lambda x: xr.DataArray(len(np.unique(x))))

<xarray.DataArray (time: 2920)> Size: 23kB
array([546, 547, 545, ..., 555, 558, 566])
Coordinates:
  * time     (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00
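
For anyone who wants to reproduce this without downloading the tutorial dataset, here is a minimal sketch of the same workaround on a small synthetic array. The array shape, the random values, and the name count_unique are my own assumptions for illustration; only the wrapping in xr.DataArray comes from the snippet above. The apply_ufunc call is included purely as a cross-check.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the air_temperature data (assumed shape and values).
rng = np.random.default_rng(0)
da = xr.DataArray(
    rng.integers(0, 5, size=(4, 3, 2)),
    dims=("time", "lat", "lon"),
    coords={"time": np.arange(4)},
)


def count_unique(group):
    # Wrap the scalar in a 0-d DataArray so map() can combine the results.
    return xr.DataArray(len(np.unique(group)))


# One count per time step, via the groupby workaround.
result = da.groupby("time").map(count_unique)

# Cross-check with the apply_ufunc approach from the top of the issue.
via_ufunc = xr.apply_ufunc(
    lambda x: len(np.unique(x)),
    da,
    input_core_dims=[["lat", "lon"]],
    vectorize=True,
)

print(result.values, via_ufunc.values)
```

Both approaches should give one count per time step; the groupby version just needs the extra xr.DataArray(...) wrapper.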

Describe the solution you'd like

Should we allow returning scalars from .groupby().map()?

I don't think there can be any ambiguity on what the result should be...
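
To make the request concrete, here is a rough user-side sketch of the behaviour I have in mind. allow_scalar is a hypothetical helper, not an existing xarray function, and the tiny input array is made up; it only shows that coercing a non-xarray return value to a 0-d DataArray is enough for map() to combine the results.

```python
import numpy as np
import xarray as xr


def allow_scalar(func):
    """Hypothetical helper: wrap non-xarray return values in a 0-d DataArray."""

    def wrapper(group, *args, **kwargs):
        out = func(group, *args, **kwargs)
        if isinstance(out, (xr.DataArray, xr.Dataset)):
            # Already an xarray object, pass it through unchanged.
            return out
        # Intended for scalars such as int or float.
        return xr.DataArray(out)

    return wrapper


# Small made-up example: two time steps with 2 and 3 distinct values.
da = xr.DataArray(
    np.array([[1, 1, 2], [3, 4, 5]]),
    dims=("time", "x"),
    coords={"time": [0, 1]},
)

counts = da.groupby("time").map(allow_scalar(lambda g: len(np.unique(g))))
print(counts.values)
```

If something like this coercion happened inside .map() itself, the plain lambda from above would work without any wrapper.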

Describe alternatives you've considered

No response

Additional context

No response
