Periodic Boundary Index #7031

Open
@TomNicholas

What is your issue?

I would like to create a PeriodicBoundaryIndex using the Explicit Indexes refactor. I want to do it first in 1D, then 2D, then maybe ND.

I'm thinking this would be useful for:

  1. Geoscientists with periodic longitudes
  2. Any scientists with periodic domains
  3. Road-testing the refactor + how easy the documentation is to follow.

Eventually, I think this index could perhaps live in xarray itself, since it's domain-agnostic, doesn't introduce extra dependencies, and could serve as a conceptually simple example of a custom index.

I had a first go, using the benbovy:add-set-xindex-and-drop-indexes branch, and reading the in-progress docs page. I got a bit stuck early on though.

@benbovy here's what I have so far:

import numpy as np
import pandas as pd
import xarray as xr
from xarray.core.variable import Variable
from xarray.core.indexes import PandasIndex, is_scalar

from typing import Union, Mapping, Any


class PeriodicBoundaryIndex(PandasIndex):
    """
    An index representing any 1D periodic numberline.
    
    Implementation subclasses a normal xarray PandasIndex object but intercepts indexer queries.
    """
        
    def _periodic_subset(self, indxr: Union[int, slice, np.ndarray]) -> pd.Index:
        """Equivalent of __getitem__ for a pd.Index, but respects periodicity."""

        # Length of the wrapped pandas index
        length = len(self.index)

        if isinstance(indxr, int):
            # Wrap out-of-range integer positions back into [0, length)
            return self.index[indxr % length]
        elif isinstance(indxr, slice):
            raise NotImplementedError()
        elif isinstance(indxr, np.ndarray):
            raise NotImplementedError()
        else:
            raise TypeError(f"unsupported indexer type: {type(indxr)}")
    
    def isel(
        self, indexers: Mapping[Any, Union[int, slice, np.ndarray, Variable]]
    ) -> Union["PeriodicBoundaryIndex", None]:

        print("isel called")

        indxr = indexers[self.dim]
        if isinstance(indxr, Variable):
            if indxr.dims != (self.dim,):
                # can't preserve an index if result has new dimensions
                return None
            else:
                indxr = indxr.data
        if not isinstance(indxr, slice) and is_scalar(indxr):
            # scalar indexer: drop index
            return None

        # _periodic_subset is a method, so it must be called rather than subscripted
        subsetted_index = self._periodic_subset(indxr)
        return self._replace(subsetted_index)

airtemps = xr.tutorial.open_dataset("air_temperature")['air']

da = airtemps.drop_indexes("lon")

world = da.set_xindex("lon", index_cls=PeriodicBoundaryIndex)
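
The two NotImplementedError branches in _periodic_subset above could perhaps be filled in along the following lines. This is only a minimal sketch on a bare pd.Index, not tied to the PandasIndex class: periodic_take is a hypothetical helper name, and it assumes that slice bounds are absolute positions on the periodic line, so negative or too-large bounds wrap around instead of following Python's usual slice clipping.

```python
import numpy as np
import pandas as pd


def periodic_take(index: pd.Index, indxr):
    """Hypothetical helper: positional lookup on ``index`` with wrap-around."""
    n = len(index)

    if isinstance(indxr, (int, np.integer)):
        # Scalar position: wrap into [0, n)
        return index[indxr % n]

    if isinstance(indxr, slice):
        step = 1 if indxr.step is None else indxr.step
        if step > 0:
            start = 0 if indxr.start is None else indxr.start
            stop = n if indxr.stop is None else indxr.stop
        else:
            start = n - 1 if indxr.start is None else indxr.start
            stop = -1 if indxr.stop is None else indxr.stop
        # Assumption: bounds are absolute positions on the periodic line,
        # so negative or too-large values wrap instead of being clipped.
        positions = np.arange(start, stop, step) % n
        return index[positions]

    if isinstance(indxr, np.ndarray):
        # Array of integer positions: wrap element-wise
        return index[indxr % n]

    raise TypeError(f"unsupported indexer type: {type(indxr)}")


lon = pd.Index([0.0, 90.0, 180.0, 270.0], name="lon")
wrapped_scalar = periodic_take(lon, 5)                 # position 5 wraps to 1
wrapped_slice = periodic_take(lon, slice(3, 6))        # positions 3, 4, 5 wrap to 3, 0, 1
wrapped_array = periodic_take(lon, np.array([-1, 4]))  # positions -1, 4 wrap to 3, 0
```

Whether a slice that crosses the boundary should return a reordered index like this, or something else, is a design choice this sketch does not settle.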

Now selecting a value with isel inside the range works fine, giving the same result as without my custom index. (The length of the example dataset along lon is 53.)

world.isel(lon=45)
isel called
<xarray.DataArray 'air' (time: 2920, lat: 25)>
...

But indexing with an integer position outside the range of the index data gives an IndexError, seemingly without consulting my new index object. It didn't even print "isel called" 😕 What should I have implemented that I didn't?

world.isel(lon=55)
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Input In [35], in <cell line: 1>()
----> 1 world.isel(lon=55)

File ~/Documents/Work/Code/xarray/xarray/core/dataarray.py:1297, in DataArray.isel(self, indexers, drop, missing_dims, **indexers_kwargs)
   1292     return self._from_temp_dataset(ds)
   1294 # Much faster algorithm for when all indexers are ints, slices, one-dimensional
   1295 # lists, or zero or one-dimensional np.ndarray's
-> 1297 variable = self._variable.isel(indexers, missing_dims=missing_dims)
   1298 indexes, index_variables = isel_indexes(self.xindexes, indexers)
   1300 coords = {}

File ~/Documents/Work/Code/xarray/xarray/core/variable.py:1233, in Variable.isel(self, indexers, missing_dims, **indexers_kwargs)
   1230 indexers = drop_dims_from_indexers(indexers, self.dims, missing_dims)
   1232 key = tuple(indexers.get(dim, slice(None)) for dim in self.dims)
-> 1233 return self[key]

File ~/Documents/Work/Code/xarray/xarray/core/variable.py:793, in Variable.__getitem__(self, key)
    780 """Return a new Variable object whose contents are consistent with
    781 getting the provided key from the underlying data.
    782 
   (...)
    790 array `x.values` directly.
    791 """
    792 dims, indexer, new_order = self._broadcast_indexes(key)
--> 793 data = as_indexable(self._data)[indexer]
    794 if new_order:
    795     data = np.moveaxis(data, range(len(new_order)), new_order)

File ~/Documents/Work/Code/xarray/xarray/core/indexing.py:657, in MemoryCachedArray.__getitem__(self, key)
    656 def __getitem__(self, key):
--> 657     return type(self)(_wrap_numpy_scalars(self.array[key]))

File ~/Documents/Work/Code/xarray/xarray/core/indexing.py:626, in CopyOnWriteArray.__getitem__(self, key)
    625 def __getitem__(self, key):
--> 626     return type(self)(_wrap_numpy_scalars(self.array[key]))

File ~/Documents/Work/Code/xarray/xarray/core/indexing.py:533, in LazilyIndexedArray.__getitem__(self, indexer)
    531     array = LazilyVectorizedIndexedArray(self.array, self.key)
    532     return array[indexer]
--> 533 return type(self)(self.array, self._updated_key(indexer))

File ~/Documents/Work/Code/xarray/xarray/core/indexing.py:505, in LazilyIndexedArray._updated_key(self, new_key)
    503         full_key.append(k)
    504     else:
--> 505         full_key.append(_index_indexer_1d(k, next(iter_new_key), size))
    506 full_key = tuple(full_key)
    508 if all(isinstance(k, integer_types + (slice,)) for k in full_key):

File ~/Documents/Work/Code/xarray/xarray/core/indexing.py:278, in _index_indexer_1d(old_indexer, applied_indexer, size)
    276         indexer = slice_slice(old_indexer, applied_indexer, size)
    277     else:
--> 278         indexer = _expand_slice(old_indexer, size)[applied_indexer]
    279 else:
    280     indexer = old_indexer[applied_indexer]

IndexError: index 55 is out of bounds for axis 0 with size 53
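
Reading the traceback above, DataArray.isel calls self._variable.isel (line 1297) before it reaches isel_indexes (line 1298), so the IndexError appears to come from the underlying array before any index is consulted. That would be consistent with "isel called" never being printed. A possible stopgap under that reading is to wrap positions before they ever reach isel. The sketch below uses a plain NumPy array as a stand-in for the data along lon; wrap_position is a hypothetical helper, not part of xarray.

```python
import numpy as np

size = 53               # length of the example dataset along lon
data = np.arange(size)  # stand-in for the values stored along lon

# Plain positional indexing past the end fails, as in the traceback
try:
    data[55]
    raised = False
except IndexError:
    raised = True


def wrap_position(position: int, size: int) -> int:
    """Hypothetical helper: map any integer position into [0, size)."""
    return position % size


wrapped = wrap_position(55, size)  # 55 % 53
value = data[wrapped]              # now a valid positional lookup
```

With the objects from the issue this would presumably be used as world.isel(lon=wrap_position(55, world.sizes["lon"])), although that sidesteps the index rather than making the index itself periodic.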
