Merged
Changes from 4 commits
4 changes: 2 additions & 2 deletions ci/code_checks.sh
@@ -245,10 +245,10 @@ if [[ -z "$CHECK" || "$CHECK" == "doctests" ]]; then
     RET=$(($RET + $?)) ; echo $MSG "DONE"

     MSG='Doctests interval classes' ; echo $MSG
-    pytest --doctest-modules -v \
+    pytest -q --doctest-modules \
         pandas/core/indexes/interval.py \
         pandas/core/arrays/interval.py \
-        -k"-from_arrays -from_breaks -from_intervals -from_tuples -get_loc -set_closed -to_tuples -interval_range"
+        -k"-from_arrays -from_breaks -from_intervals -from_tuples -set_closed -to_tuples -interval_range"
     RET=$(($RET + $?)) ; echo $MSG "DONE"

 fi
132 changes: 130 additions & 2 deletions doc/source/whatsnew/v0.25.0.rst
@@ -479,6 +479,133 @@ This change is backward compatible for direct usage of Pandas, but if you subcla
Pandas objects *and* give your subclasses specific ``__str__``/``__repr__`` methods,
you may have to adjust your ``__str__``/``__repr__`` methods (:issue:`26495`).

.. _whatsnew_0250.api_breaking.interval_indexing:


Indexing an ``IntervalIndex`` with ``Interval`` objects
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Indexing methods for :class:`IntervalIndex` now return only exact matches for :class:`Interval` queries.
Previously, these methods matched any overlapping ``Interval``. Behavior with scalar points, e.g. querying
with an integer, is unchanged (:issue:`16316`).

.. ipython:: python

   ii = pd.IntervalIndex.from_tuples([(0, 4), (1, 5), (5, 8)])
   ii

The ``in`` operator (``__contains__``) now returns ``True`` only for an exact match to an ``Interval`` in the ``IntervalIndex``;
previously it returned ``True`` for any ``Interval`` overlapping an ``Interval`` in the ``IntervalIndex``.

*Previous behavior*:

.. code-block:: python

   In [4]: pd.Interval(1, 2, closed='neither') in ii
   Out[4]: True

   In [5]: pd.Interval(-10, 10, closed='both') in ii
   Out[5]: True

*New behavior*:

.. ipython:: python

   pd.Interval(1, 2, closed='neither') in ii
   pd.Interval(-10, 10, closed='both') in ii

The ``get_loc`` method now only returns locations for exact matches to ``Interval`` queries, as opposed to the previous behavior of
returning locations for overlapping matches. A ``KeyError`` will be raised if an exact match is not found.

*Previous behavior*:

.. code-block:: python

   In [6]: ii.get_loc(pd.Interval(1, 5))
   Out[6]: array([0, 1])

   In [7]: ii.get_loc(pd.Interval(2, 6))
   Out[7]: array([0, 1, 2])

*New behavior*:

.. code-block:: python

   In [6]: ii.get_loc(pd.Interval(1, 5))
   Out[6]: 1

   In [7]: ii.get_loc(pd.Interval(2, 6))
   ---------------------------------------------------------------------------
   KeyError: Interval(2, 6, closed='right')
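
Code that relied on ``get_loc`` tolerating inexact ``Interval`` queries may now need to handle the ``KeyError``.
A minimal sketch of one way to do that, assuming a hypothetical helper named ``exact_loc_or_none`` (it is not part of pandas)
and the same ``ii`` as above:

```python
import pandas as pd

ii = pd.IntervalIndex.from_tuples([(0, 4), (1, 5), (5, 8)])


def exact_loc_or_none(index, interval):
    """Hypothetical helper: location of an exact match, or None if absent."""
    try:
        return index.get_loc(interval)
    except KeyError:
        # No interval in the index matches the query exactly
        return None


found = exact_loc_or_none(ii, pd.Interval(1, 5))    # (1, 5] is in the index
missing = exact_loc_or_none(ii, pd.Interval(2, 6))  # overlaps, but no exact match
```

Returning ``None`` is only one possible convention; callers that prefer to fail loudly can simply let the ``KeyError`` propagate.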

Likewise, ``get_indexer`` and ``get_indexer_non_unique`` now only return locations for exact matches to ``Interval`` queries, with
``-1`` denoting that an exact match was not found.
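
The following sketch illustrates the ``-1`` convention described above. The index and query names are chosen for illustration only;
``get_indexer_non_unique`` is used for the overlapping ``ii``, and a separate, hypothetical non-overlapping index is used for ``get_indexer``:

```python
import pandas as pd

ii = pd.IntervalIndex.from_tuples([(0, 4), (1, 5), (5, 8)])
queries = pd.IntervalIndex.from_tuples([(1, 5), (2, 6)])

# ii contains overlapping intervals, so query it with get_indexer_non_unique.
# `indexer` holds the matched locations (-1 where there is no exact match);
# `missing` holds the positions within `queries` that were not found.
indexer, missing = ii.get_indexer_non_unique(queries)

# A non-overlapping index (illustrative) can be queried with get_indexer.
flat = pd.IntervalIndex.from_tuples([(0, 1), (1, 2), (2, 3)])
positions = flat.get_indexer(pd.IntervalIndex.from_tuples([(1, 2), (0, 2)]))
```

In both calls only the first query has an exact counterpart in the index; the second merely overlaps existing intervals and is therefore reported as ``-1``.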

These indexing changes extend to querying a :class:`Series` or :class:`DataFrame` with an ``IntervalIndex`` index.

.. ipython:: python

   s = pd.Series(list('abc'), index=ii)
   s

Selecting from a ``Series`` or ``DataFrame`` using ``[]`` (``__getitem__``) or ``loc`` now only returns exact matches for ``Interval`` queries.

*Previous behavior*:

.. code-block:: python

   In [8]: s[pd.Interval(1, 5)]
   Out[8]:
   (0, 4]    a
   (1, 5]    b
   dtype: object

   In [9]: s.loc[pd.Interval(1, 5)]
   Out[9]:
   (0, 4]    a
   (1, 5]    b
   dtype: object

*New behavior*:

.. ipython:: python

   s[pd.Interval(1, 5)]
   s.loc[pd.Interval(1, 5)]

Similarly, non-exact matches will now raise a ``KeyError``.

*Previous behavior*:

.. code-block:: python

   In [9]: s[pd.Interval(2, 6)]
   Out[9]:
   (0, 4]    a
   (1, 5]    b
   (5, 8]    c
   dtype: object

   In [10]: s.loc[pd.Interval(2, 6)]
   Out[10]:
   (0, 4]    a
   (1, 5]    b
   (5, 8]    c
   dtype: object

*New behavior*:

.. code-block:: python

   In [6]: s[pd.Interval(2, 6)]
   ---------------------------------------------------------------------------
   KeyError: Interval(2, 6, closed='right')

   In [7]: s.loc[pd.Interval(2, 6)]
   ---------------------------------------------------------------------------
   KeyError: Interval(2, 6, closed='right')
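
Code that depended on the previous overlap-based selection can request it explicitly. One possible approach, which is not taken
from this changelog entry itself, is to build a boolean mask with ``IntervalIndex.overlaps`` and index with that mask:

```python
import pandas as pd

ii = pd.IntervalIndex.from_tuples([(0, 4), (1, 5), (5, 8)])
s = pd.Series(list('abc'), index=ii)

# Boolean array: True where an interval in ii shares points with (1, 5]
mask = ii.overlaps(pd.Interval(1, 5))

# Boolean indexing selects every overlapping row, like the old lookup did
overlapping = s[mask]
```

Here ``(5, 8]`` is excluded because it is open on the left and so shares no point with ``(1, 5]``.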


.. _whatsnew_0250.api_breaking.deps:

Increased minimum versions for dependencies
@@ -652,7 +779,8 @@ Categorical

- Bug in :func:`DataFrame.at` and :func:`Series.at` that would raise exception if the index was a :class:`CategoricalIndex` (:issue:`20629`)
- Fixed bug in comparison of ordered :class:`Categorical` that contained missing values with a scalar which sometimes incorrectly resulted in ``True`` (:issue:`26504`)
-
- Bug in :meth:`DataFrame.dropna` when the :class:`DataFrame` has a :class:`CategoricalIndex` containing :class:`Interval` objects incorrectly raised a ``TypeError`` (:issue:`25087`)
- Bug in :class:`Categorical` and :class:`CategoricalIndex` with :class:`Interval` values when using the ``in`` operator (``__contains__``) with objects that are not comparable to the values in the ``Interval`` (:issue:`23705`)

Datetimelike
^^^^^^^^^^^^
@@ -729,7 +857,7 @@ Interval

- Construction of :class:`Interval` is restricted to numeric, :class:`Timestamp` and :class:`Timedelta` endpoints (:issue:`23013`)
- Fixed bug in :class:`Series`/:class:`DataFrame` not displaying ``NaN`` in :class:`IntervalIndex` with missing values (:issue:`25984`)
-
- Bug in :meth:`IntervalIndex.get_loc` where a ``KeyError`` would be incorrectly raised for a decreasing :class:`IntervalIndex` (:issue:`25860`)

Indexing
^^^^^^^^
14 changes: 4 additions & 10 deletions pandas/core/indexes/base.py
@@ -3239,8 +3239,9 @@ def reindex(self, target, method=None, level=None, limit=None,
         if self.equals(target):
             indexer = None
         else:
-
-            if self.is_unique:
+            # check is_overlapping for IntervalIndex compat
+            if (self.is_unique and
+                    not getattr(self, 'is_overlapping', False)):
                 indexer = self.get_indexer(target, method=method,
                                            limit=limit,
                                            tolerance=tolerance)
@@ -4902,13 +4903,6 @@ def _searchsorted_monotonic(self, label, side='left'):

         raise ValueError('index must be monotonic increasing or decreasing')

-    def _get_loc_only_exact_matches(self, key):
-        """
-        This is overridden on subclasses (namely, IntervalIndex) to control
-        get_slice_bound.
-        """
-        return self.get_loc(key)
-
     def get_slice_bound(self, label, side, kind):
         """
         Calculate slice bound that corresponds to given label.
@@ -4942,7 +4936,7 @@ def get_slice_bound(self, label, side, kind):

         # we need to look up the label
         try:
-            slc = self._get_loc_only_exact_matches(label)
+            slc = self.get_loc(label)
         except KeyError as err:
             try:
                 return self._searchsorted_monotonic(label, side)