Merged
Changes from 1 commit
2 changes: 2 additions & 0 deletions doc/source/whatsnew/v0.25.0.rst
@@ -68,6 +68,8 @@ Performance Improvements
Bug Fixes
~~~~~~~~~

- Bug in using duplicated with empty dataframes (:issue:`25184`)
Contributor

Say something more like:

Bug in :meth:`DataFrame.duplicated` where an empty DataFrame did not return a boolean-dtyped Series

Contributor

this can go in 0.24.2


Categorical
^^^^^^^^^^^

2 changes: 1 addition & 1 deletion pandas/core/frame.py
@@ -4636,7 +4636,7 @@ def duplicated(self, subset=None, keep='first'):
from pandas._libs.hashtable import duplicated_int64, _SIZE_HINT_LIMIT

if self.empty:
- return Series()
+ return Series(dtype=bool)

def f(vals):
labels, shape = algorithms.factorize(
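For context on the one-line change above, here is a minimal sketch of why the dtype matters. It assumes a pandas version that already includes this fix; the variable names `mask` and `selected` are illustrative only and do not appear in the PR.

```python
import pandas as pd

# An empty frame with two named columns, as in the PR's test.
df = pd.DataFrame(columns=["a", "b"])

# What the patched branch returns for an empty frame: an empty boolean Series.
mask = pd.Series(dtype=bool)
print(mask.dtype)

# A boolean mask can be used directly for row selection; with an empty mask
# the result has no rows but keeps the original columns.
selected = df[mask]
print(list(selected.columns), len(selected))

# On a pandas version with the fix, duplicated() itself yields a bool Series.
print(df.duplicated("a").dtype)
```

Returning an explicitly boolean Series keeps `df[df.duplicated(...)]` well defined even when there are no rows to inspect.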
6 changes: 6 additions & 0 deletions pandas/tests/frame/test_duplicates.py
@@ -182,6 +182,12 @@ def test_drop_duplicates():
assert df.duplicated(keep=keep).sum() == 0


def test_duplicated_on_empty_frame_gives_back_frame():
Contributor

Can you add the issue number as a comment here, and rename the test to

test_duplicated_on_empty_frame

df = DataFrame(columns=['a', 'b'])
dupes = df.duplicated('a')
Contributor

result =
expected =
tm.assert_frame_equal(result, expected)

tm.assert_frame_equal(df[dupes], df)
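
One possible way to apply the review suggestions (shorter name, issue number as a comment, `result`/`expected` pattern) is sketched below. This is an illustration, not necessarily the version that was merged: the choice of `df.copy()` for `expected` is an assumption, and the public `pandas.testing.assert_frame_equal` stands in for the `tm` helper used inside the pandas test suite.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def test_duplicated_on_empty_frame():
    # GH 25184
    df = pd.DataFrame(columns=["a", "b"])
    dupes = df.duplicated("a")

    # Indexing with the (empty, boolean) mask should give back an
    # equivalent empty frame with the same columns.
    result = df[dupes]
    expected = df.copy()  # assumption: the reviewer left `expected` unspecified
    assert_frame_equal(result, expected)
```

Under pytest the function is collected automatically; it can also be called directly as `test_duplicated_on_empty_frame()`.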


def test_drop_duplicates_with_duplicate_column_names():
# GH17836
df = DataFrame([