clean up the API for renaming and changing dimensions / coordinates #4825

Open
@keewis

From #4108:

I wonder if it would be better to first "reorganize" all of the existing functions: we currently have rename (and Dataset.rename_dims / Dataset.rename_vars), set_coords, reset_coords, set_index, reset_index and swap_dims, which overlap partially. For example, the code sample from #4417 works if instead of

ds = ds.rename(b='x')
ds = ds.set_coords('x')

we use

ds = ds.set_index(x="b")

and something similar for the code sample in #4107.

I believe we currently have these use cases (not sure if that list is complete, though):

  • rename a DataArray → rename
  • rename an existing variable to a name that is not yet in the object → rename / Dataset.rename_vars / Dataset.rename_dims
  • convert a data variable to a coordinate (not a dimension coordinate) → set_coords
  • convert a coordinate (not a dimension coordinate) to a data variable → reset_coords
  • swap an existing dimension coordinate with a coordinate (which may not exist) and rename the dimension → swap_dims
  • use an existing coordinate / data variable as a dimension coordinate (do not rename the dimension) → set_index
  • stop using a coordinate as dimension coordinate and append _ to its name (do not rename the dimension) → reset_index
  • use two existing coordinates / data variables as a MultiIndex → set_index
  • stop using a MultiIndex as a dimension coordinate and use its levels as coordinates → reset_index
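
To make a few of these cases concrete, here is a minimal sketch on a toy dataset. The dataset and its variable names (a, b, x) are made up for illustration, and the exact results may differ between xarray versions, so treat this as an assumption-laden example rather than a reference. reset_index and the MultiIndex cases are left out to keep it short.

```python
import numpy as np
import xarray as xr

# Hypothetical toy dataset: "x" is a dimension without coordinates,
# "a" and "b" are data variables along it.
ds = xr.Dataset(
    {
        "a": ("x", np.array([10.0, 20.0, 30.0])),
        "b": ("x", np.array([1, 2, 3])),
    }
)

# rename: give an existing variable a name that is not yet in the object
renamed = ds.rename({"a": "c"})

# rename_dims: rename only the dimension, variables keep their names
redimmed = ds.rename_dims({"x": "y"})

# set_coords: data variable -> (non-dimension) coordinate
with_coord = ds.set_coords("b")

# reset_coords: (non-dimension) coordinate -> data variable
back = with_coord.reset_coords("b")

# swap_dims: "b" becomes the dimension coordinate and the dimension
# itself is renamed from "x" to "b"
swapped = ds.swap_dims({"x": "b"})

# set_index: the values of "b" become the dimension coordinate "x";
# the dimension keeps its name
indexed = ds.set_index(x="b")

print(swapped)
print(indexed)
```

Comparing the two printed objects shows the main difference between swap_dims and set_index in this setup: both end up with the values of b as a dimension coordinate, but only swap_dims changes the name of the dimension.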

Sometimes, some of these can be emulated by combinations of others, for example:

from xarray.testing import assert_identical

# x is a dimension without coordinates
assert_identical(ds.set_index({"x": "b"}), ds.swap_dims({"x": "b"}).rename({"b": "x"}))
assert_identical(ds.swap_dims({"x": "b"}), ds.set_index({"x": "b"}).rename({"x": "b"}))

and, with this PR:

assert_identical(ds.set_index({"x": "b"}), ds.set_coords("b").rename({"b": "x"}))
assert_identical(ds.swap_dims({"x": "b"}), ds.rename({"b": "x"}))

which means that it would increase the overlap of rename, set_index, and swap_dims.

In any case I think we should add a guide which explains which method to pick in which situation (or extend howdoi).

Originally posted by @keewis in #4108 (comment)
