1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.25.0.rst
@@ -297,6 +297,7 @@ Timezones
 - Bug in :func:`DataFrame.update` when updating with timezone aware data would return timezone naive data (:issue:`25807`)
 - Bug in :func:`to_datetime` where an uninformative ``RuntimeError`` was raised when passing a naive :class:`Timestamp` with datetime strings with mixed UTC offsets (:issue:`25978`)
 - Bug in :func:`to_datetime` with ``unit='ns'`` would drop timezone information from the parsed argument (:issue:`26168`)
+- Bug in :func:`DataFrame.join` where joining a timezone aware index with a timezone aware column would result in a column of ``NaN`` (:issue:`26335`)

 Numeric
 ^^^^^^^
14 changes: 12 additions & 2 deletions pandas/core/reshape/merge.py
@@ -1671,11 +1671,21 @@ def _right_outer_join(x, y, max_groups):
 }


+def _convert_array_or_index(arg):
+    """Converts DatetimeArray or DatetimeIndex to numpy array in UTC"""
Contributor:
this is so opaque

use getattr(arg, '_values', arg)._data

Contributor:
How about extract_array(arg)._data?

Member Author:
Thanks for the simplification. Went with the getattr solution.

+    try:
+        # DatetimeIndex case
+        return arg._values._data
+    except AttributeError:
+        # DatetimeArray Case
+        return arg._data
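
The review thread proposes collapsing the try/except into a single `getattr` call. A minimal sketch of that pattern follows. `FakeDatetimeArray` and `FakeDatetimeIndex` are hypothetical stand-ins, not pandas classes; they mimic only the private `_data` and `_values` attributes this diff relies on, so the snippet runs without pandas.

```python
class FakeDatetimeArray:
    """Hypothetical stand-in: exposes the raw data as ``_data``."""

    def __init__(self, data):
        self._data = data


class FakeDatetimeIndex:
    """Hypothetical stand-in: wraps an array and exposes it as ``_values``."""

    def __init__(self, data):
        self._values = FakeDatetimeArray(data)


def convert_array_or_index(arg):
    # If arg has ``_values`` (index-like), unwrap it first; otherwise use
    # arg itself (array-like). Either way, return the underlying ``_data``.
    return getattr(arg, "_values", arg)._data


raw = [1514786400, 1514872800]
from_array = convert_array_or_index(FakeDatetimeArray(raw))
from_index = convert_array_or_index(FakeDatetimeIndex(raw))
```

Both calls return the same underlying list, which is the property the helper needs: the caller no longer has to know whether it was handed an index or an array.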


 def _factorize_keys(lk, rk, sort=True):
     # Some pre-processing for non-ndarray lk / rk
     if is_datetime64tz_dtype(lk) and is_datetime64tz_dtype(rk):
-        lk = lk._data
-        rk = rk._data
+        lk = _convert_array_or_index(lk)
+        rk = _convert_array_or_index(rk)

     elif (is_categorical_dtype(lk) and
           is_categorical_dtype(rk) and
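
The pre-processing step above reduces both keys to the same underlying representation before they are factorized. The sketch below illustrates why that matters, using only the standard library. It is not pandas's implementation: `to_utc_ints` and `factorize_keys` are hypothetical helpers, and the fixed UTC-6 offset is an assumption standing in for a real time zone.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-6 offset stands in for a real time zone.
CST = timezone(timedelta(hours=-6))


def to_utc_ints(values):
    """Represent tz-aware datetimes as integer seconds since the UTC epoch."""
    return [int(v.timestamp()) for v in values]


def factorize_keys(left, right):
    """Assign shared integer codes to left and right keys (hypothetical)."""
    codes = {}
    left_codes = [codes.setdefault(k, len(codes)) for k in left]
    right_codes = [codes.setdefault(k, len(codes)) for k in right]
    return left_codes, right_codes, len(codes)


# Midnight at UTC-6 is 06:00 UTC, so left[2] and right[0] are the same instant.
left = [datetime(2018, 1, day, tzinfo=CST) for day in (1, 2, 3)]
right = [datetime(2018, 1, day, 6, tzinfo=timezone.utc) for day in (3, 4)]

left_codes, right_codes, n_groups = factorize_keys(
    to_utc_ints(left), to_utc_ints(right)
)
```

Once both sides are plain UTC integers, equal instants receive the same code and the join can match them. If one side were left in a different container or representation, no codes would be shared and every lookup would miss.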
5 changes: 3 additions & 2 deletions pandas/tests/reshape/merge/test_join.py
@@ -679,7 +679,7 @@ def test_join_multi_to_multi(self, join_type):
         right.join(left, on=['abc', 'xy'], how=join_type)

     def test_join_on_tz_aware_datetimeindex(self):
-        # GH 23931
+        # GH 23931, 26335
         df1 = pd.DataFrame(
             {
                 'date': pd.date_range(start='2018-01-01', periods=5,
@@ -697,7 +697,8 @@ def test_join_on_tz_aware_datetimeindex(self):
         )
         result = df1.join(df2.set_index('date'), on='date')
         expected = df1.copy()
-        expected['vals_2'] = pd.Series([np.nan] * len(expected), dtype=object)
+        expected['vals_2'] = pd.Series([np.nan] * 2 + list('tuv'),
+                                       dtype=object)
         assert_frame_equal(result, expected)
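
The scenario this test covers can be sketched as a standalone script. Parts of the test body are collapsed in this view, so the time zone, the second frame's start date, and the column values below are assumptions chosen to be consistent with the visible expected output, not a copy of the test.

```python
import pandas as pd

# Assumption: tz, df2's start date, and the values are illustrative only.
df1 = pd.DataFrame(
    {
        "date": pd.date_range(start="2018-01-01", periods=5, tz="UTC"),
        "vals": list("abcde"),
    }
)
df2 = pd.DataFrame(
    {
        "date": pd.date_range(start="2018-01-03", periods=5, tz="UTC"),
        "vals_2": list("tuvwx"),
    }
)

# Join a tz-aware column (df1["date"]) against a tz-aware index.
result = df1.join(df2.set_index("date"), on="date")
matched = result["vals_2"].tolist()
```

With the fix in place, the first two rows have no partner in `df2` and stay missing, while the last three pick up `'t'`, `'u'`, and `'v'`. Before the fix, the whole `vals_2` column came back as `NaN`, which is what the old expectation encoded.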

