What is your issue?
I would like to create a PeriodicBoundaryIndex using the Explicit Indexes refactor. I want to do it first in 1D, then 2D, then maybe ND.
I'm thinking this would be useful for:
- Geoscientists with periodic longitudes
- Any scientists with periodic domains
- Road-testing the refactor + how easy the documentation is to follow.
Eventually, I think this index could perhaps live in xarray itself, since it's domain-agnostic, doesn't introduce extra dependencies, and could be a conceptually simple example of a custom index.
I had a first go, using the benbovy:add-set-xindex-and-drop-indexes branch, and reading the in-progress docs page. I got a bit stuck early on though.
@benbovy here's what I have so far:
import numpy as np
import pandas as pd
import xarray as xr
from xarray.core.variable import Variable
from xarray.core.indexes import PandasIndex, is_scalar
from typing import Union, Mapping, Any


class PeriodicBoundaryIndex(PandasIndex):
    """
    An index representing any 1D periodic number line.

    Implementation subclasses a normal xarray PandasIndex object but intercepts indexer queries.
    """

    def _periodic_subset(self, indxr: Union[int, slice, np.ndarray]) -> pd.Index:
        """Equivalent of __getitem__ for a pd.Index, but respects periodicity."""
        # length of the wrapped pandas index
        length = len(self.index)
        if isinstance(indxr, int):
            # wrap out-of-range positions back into [0, length)
            return self.index[indxr % length]
        elif isinstance(indxr, slice):
            raise NotImplementedError()
        elif isinstance(indxr, np.ndarray):
            raise NotImplementedError()
        else:
            raise TypeError

    def isel(
        self, indexers: Mapping[Any, Union[int, slice, np.ndarray, Variable]]
    ) -> Union["PeriodicBoundaryIndex", None]:
        print("isel called")
        indxr = indexers[self.dim]
        if isinstance(indxr, Variable):
            if indxr.dims != (self.dim,):
                # can't preserve an index if result has new dimensions
                return None
            else:
                indxr = indxr.data
        if not isinstance(indxr, slice) and is_scalar(indxr):
            # scalar indexer: drop index
            return None
        # call the helper (it is a method, so it cannot be subscripted)
        subsetted_index = self._periodic_subset(indxr)
        return self._replace(subsetted_index)

airtemps = xr.tutorial.open_dataset("air_temperature")['air']
da = airtemps.drop_indexes("lon")
world = da.set_xindex("lon", index_cls=PeriodicBoundaryIndex)
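For the slice and array branches of `_periodic_subset` that still raise NotImplementedError, one possible approach is to reduce every indexer to positions modulo the index length. This is only a minimal sketch: `periodic_positions` is a hypothetical helper, and `lon` is a stand-in 53-element pandas index built by hand; neither is part of xarray, and slices without explicit bounds are deliberately not handled.

```python
import numpy as np
import pandas as pd


def periodic_positions(indxr, length):
    """Map an int, slice, or integer array onto positions in [0, length).

    Hypothetical helper for illustration only; not part of xarray.
    """
    if isinstance(indxr, (int, np.integer)):
        return int(indxr) % length
    if isinstance(indxr, slice):
        if indxr.start is None or indxr.stop is None:
            # Assumption: open-ended slices are out of scope for this sketch
            raise ValueError("this sketch needs explicit slice bounds")
        step = 1 if indxr.step is None else indxr.step
        # Expand the slice to explicit positions, then wrap them
        return np.arange(indxr.start, indxr.stop, step) % length
    if isinstance(indxr, np.ndarray):
        return indxr % length
    raise TypeError(f"unsupported indexer type: {type(indxr)!r}")


# Stand-in index with 53 evenly spaced values (an assumption for the demo)
lon = pd.Index(np.arange(200.0, 332.5, 2.5))

wrapped_scalar = lon[periodic_positions(55, len(lon))]
wrapped_slice = lon[periodic_positions(slice(51, 55), len(lon))]
wrapped_array = lon[periodic_positions(np.array([-1, 53]), len(lon))]

print(wrapped_scalar)
print(list(wrapped_slice))
print(list(wrapped_array))
```

Expanding a slice into an explicit position array loses the cheap slice representation, but it keeps the wrap-around logic in one place, which seemed like the simplest starting point.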
Now selecting a value with isel inside the range works fine, giving the same result as without my custom index. (The length of the example dataset along lon is 53.)
world.isel(lon=45)
isel called
<xarray.DataArray 'air' (time: 2920, lat: 25)>
...
But indexing with a position along lon that is outside the range of the index data gives an IndexError, seemingly without consulting my new index object. It didn't even print "isel called" 😕 What should I have implemented that I didn't implement?
world.isel(lon=55)
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
Input In [35], in <cell line: 1>()
----> 1 world.isel(lon=55)
File ~/Documents/Work/Code/xarray/xarray/core/dataarray.py:1297, in DataArray.isel(self, indexers, drop, missing_dims, **indexers_kwargs)
1292 return self._from_temp_dataset(ds)
1294 # Much faster algorithm for when all indexers are ints, slices, one-dimensional
1295 # lists, or zero or one-dimensional np.ndarray's
-> 1297 variable = self._variable.isel(indexers, missing_dims=missing_dims)
1298 indexes, index_variables = isel_indexes(self.xindexes, indexers)
1300 coords = {}
File ~/Documents/Work/Code/xarray/xarray/core/variable.py:1233, in Variable.isel(self, indexers, missing_dims, **indexers_kwargs)
1230 indexers = drop_dims_from_indexers(indexers, self.dims, missing_dims)
1232 key = tuple(indexers.get(dim, slice(None)) for dim in self.dims)
-> 1233 return self[key]
File ~/Documents/Work/Code/xarray/xarray/core/variable.py:793, in Variable.__getitem__(self, key)
780 """Return a new Variable object whose contents are consistent with
781 getting the provided key from the underlying data.
782
(...)
790 array `x.values` directly.
791 """
792 dims, indexer, new_order = self._broadcast_indexes(key)
--> 793 data = as_indexable(self._data)[indexer]
794 if new_order:
795 data = np.moveaxis(data, range(len(new_order)), new_order)
File ~/Documents/Work/Code/xarray/xarray/core/indexing.py:657, in MemoryCachedArray.__getitem__(self, key)
656 def __getitem__(self, key):
--> 657 return type(self)(_wrap_numpy_scalars(self.array[key]))
File ~/Documents/Work/Code/xarray/xarray/core/indexing.py:626, in CopyOnWriteArray.__getitem__(self, key)
625 def __getitem__(self, key):
--> 626 return type(self)(_wrap_numpy_scalars(self.array[key]))
File ~/Documents/Work/Code/xarray/xarray/core/indexing.py:533, in LazilyIndexedArray.__getitem__(self, indexer)
531 array = LazilyVectorizedIndexedArray(self.array, self.key)
532 return array[indexer]
--> 533 return type(self)(self.array, self._updated_key(indexer))
File ~/Documents/Work/Code/xarray/xarray/core/indexing.py:505, in LazilyIndexedArray._updated_key(self, new_key)
503 full_key.append(k)
504 else:
--> 505 full_key.append(_index_indexer_1d(k, next(iter_new_key), size))
506 full_key = tuple(full_key)
508 if all(isinstance(k, integer_types + (slice,)) for k in full_key):
File ~/Documents/Work/Code/xarray/xarray/core/indexing.py:278, in _index_indexer_1d(old_indexer, applied_indexer, size)
276 indexer = slice_slice(old_indexer, applied_indexer, size)
277 else:
--> 278 indexer = _expand_slice(old_indexer, size)[applied_indexer]
279 else:
280 indexer = old_indexer[applied_indexer]
IndexError: index 55 is out of bounds for axis 0 with size 53
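One possible reading of the traceback: DataArray.isel calls self._variable.isel(...) (line 1297) before isel_indexes(self.xindexes, indexers) (line 1298), so the data lookup raises before any index's isel gets a chance to run. The toy sketch below only imitates that call order; `ToyIndex` and `toy_isel` are hypothetical names and are not xarray code.

```python
import numpy as np


class ToyIndex:
    """Hypothetical stand-in for a custom index; it only counts isel calls."""

    def __init__(self):
        self.isel_calls = 0

    def isel(self, position, length):
        self.isel_calls += 1
        return position % length


def toy_isel(data, index, position):
    """Imitate the order seen in the traceback: data first, index second."""
    subset = data[position]  # raises IndexError for out-of-range positions
    index.isel(position, len(data))  # only reached if the line above succeeds
    return subset


data = np.arange(53)
index = ToyIndex()

toy_isel(data, index, 45)  # in range: the index is consulted
try:
    toy_isel(data, index, 55)  # out of range: fails before the index is consulted
except IndexError as err:
    message = str(err)

print(index.isel_calls)
print(message)
```

If that reading is right, wrapping the position in the index's isel alone would not be enough, because the underlying array never sees the wrapped value.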