Skip to content

Conversation

TomAugspurger
Copy link
Contributor

Closes #16519

@TomAugspurger TomAugspurger added Algos Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff Blocker Blocking issue or pull request for an upcoming release Needs Backport labels May 30, 2017
@TomAugspurger TomAugspurger added this to the 0.20.2 milestone May 30, 2017
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm.

@@ -39,7 +39,7 @@ Bug Fixes

- Bug in using ``pathlib.Path`` or ``py.path.local`` objects with io functions (:issue:`16291`)
- Bug in ``DataFrame.update()`` with ``overwrite=False`` and ``NaN values`` (:issue:`15593`)

- Bug in :func:`pd.unique` on an array of tuples (:issue:`16519`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think has to be :func:`unique` ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're correct.

@@ -929,6 +929,22 @@ def test_unique_index(self):
tm.assert_numpy_array_equal(case.duplicated(),
np.array([False, False, False]))

@pytest.mark.parametrize('arr, unique', [
([(0, 0), (0, 1), (1, 0), (1, 1), (0, 0), (0, 1), (1, 0), (1, 1)],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also add an example of this in pd.unique itself.

@chris-b1
Copy link
Contributor

@jreback
Copy link
Contributor

jreback commented May 30, 2017

I with agree @chris-b1 comment, yes that looks right.

@codecov
Copy link

codecov bot commented May 30, 2017

Codecov Report

Merging #16543 into master will not change coverage.
The diff coverage is 100%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #16543   +/-   ##
=======================================
  Coverage   90.79%   90.79%           
=======================================
  Files         161      161           
  Lines       51063    51063           
=======================================
  Hits        46365    46365           
  Misses       4698     4698
Flag Coverage Δ
#multiple 88.63% <100%> (ø) ⬆️
#single 40.15% <0%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/algorithms.py 94.41% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e60dc4c...35da5b9. Read the comment docs.

@codecov
Copy link

codecov bot commented May 30, 2017

Codecov Report

Merging #16543 into master will not change coverage.
The diff coverage is 100%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #16543   +/-   ##
=======================================
  Coverage   90.79%   90.79%           
=======================================
  Files         161      161           
  Lines       51063    51063           
=======================================
  Hits        46365    46365           
  Misses       4698     4698
Flag Coverage Δ
#multiple 88.63% <100%> (ø) ⬆️
#single 40.15% <0%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/algorithms.py 94.41% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e60dc4c...35da5b9. Read the comment docs.

@codecov
Copy link

codecov bot commented May 30, 2017

Codecov Report

Merging #16543 into master will not change coverage.
The diff coverage is 100%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #16543   +/-   ##
=======================================
  Coverage   90.75%   90.75%           
=======================================
  Files         161      161           
  Lines       51074    51074           
=======================================
  Hits        46353    46353           
  Misses       4721     4721
Flag Coverage Δ
#multiple 88.59% <100%> (ø) ⬆️
#single 40.16% <0%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/algorithms.py 94.41% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 03d44f3...eb7c18f. Read the comment docs.

@TomAugspurger
Copy link
Contributor Author

I think you could back out this change from #16434

The regression test from #16434 fails if I revert the change. The difference being

(Pdb++) values  # this is with the fix reverted
array([[1, 'a']], dtype=object)
(Pdb++) lib.list_to_object_array(list([(1, 'a')]))  # this is the fix from 16434
array([(1, 'a')], dtype=object)

So an array of lists vs. an array of tuples. Is that correct?

@jreback
Copy link
Contributor

jreback commented May 31, 2017

you may just need to call
_ensure_arraylike there instead of the isinstance check

@TomAugspurger
Copy link
Contributor Author

TomAugspurger commented May 31, 2017

Using _ensure_arraylike failed on an empty array pd.Series([1, 2]).isin([]) since it's a float instead of object dtype, so the hashing fails later on. I can handle that case if you want, or just leave the isinstance checks. Not sure which is cleaner really.

@jreback
Copy link
Contributor

jreback commented May 31, 2017

@TomAugspurger added a commit. should fix up I think.

@jreback
Copy link
Contributor

jreback commented May 31, 2017

@TomAugspurger looks this broke it. ok just revert my commit and merge your changes. This is a very touchy area. I we are doing the right things in the tests, but just tricky to get exactly right.

@jreback
Copy link
Contributor

jreback commented May 31, 2017

@TomAugspurger rebased to remove my commit.

@TomAugspurger
Copy link
Contributor Author

@jreback are you able to restart appveyor jobs? One of the network tests failed on the first job.

@jreback jreback merged commit 9d7afa7 into pandas-dev:master Jun 1, 2017
@jreback
Copy link
Contributor

jreback commented Jun 1, 2017

thanks!

TomAugspurger added a commit to TomAugspurger/pandas that referenced this pull request Jun 1, 2017
TomAugspurger added a commit that referenced this pull request Jun 4, 2017
@TomAugspurger TomAugspurger deleted the unique-tuples branch June 4, 2017 20:29
Kiv pushed a commit to Kiv/pandas that referenced this pull request Jun 11, 2017
stangirala pushed a commit to stangirala/pandas that referenced this pull request Jun 11, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Algos Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff Blocker Blocking issue or pull request for an upcoming release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants