phofl
Member

@phofl phofl commented Nov 27, 2021

This ensures that already existing column names are kept. If the target name for a duplicated column is already taken by an existing column, the duplicate gets the next higher index instead.
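To make the renaming rule concrete, here is a minimal pure-Python sketch of it. It is not the pandas implementation: the function name `mangle_dupe_names` and its internals are hypothetical, and it only illustrates the idea that labels already present in the header are reserved, so a duplicate skips to the next free index.

```python
def mangle_dupe_names(names):
    """Rename duplicate labels to 'name.N', skipping labels already in use.

    Hypothetical helper for illustration only, not pandas code.
    """
    taken = set(names)  # every label in the original header is reserved
    counts = {}         # next suffix index to try for each original label
    result = []
    for name in names:
        index = counts.get(name, 0)
        if index == 0:
            # First occurrence keeps its label unchanged.
            counts[name] = 1
            result.append(name)
            continue
        candidate = f"{name}.{index}"
        # Skip indices whose target label already exists.
        while candidate in taken:
            index += 1
            candidate = f"{name}.{index}"
        counts[name] = index + 1
        taken.add(candidate)
        result.append(candidate)
    return result


print(mangle_dupe_names(["a", "a", "a.1"]))
```

With the header `a,a,a.1`, the existing `a.1` column keeps its name and the second `a` moves on to `a.2`, which is the "one higher" index mentioned above.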

@phofl phofl added Docs IO CSV read_csv, to_csv labels Nov 27, 2021
@jreback jreback added this to the 1.4 milestone Nov 28, 2021
@jreback jreback requested a review from gfyoung November 28, 2021 01:38
@jreback
Contributor

jreback commented Nov 28, 2021

@gfyoung if any comments

@phofl phofl changed the title from "BUG: mangle_dup_cols in read_csv replacing existing cols when conflic with target col" to "BUG: mangle_dup_cols in read_csv replacing existing cols when conflict with target col" Nov 28, 2021
Member

@gfyoung gfyoung left a comment

Nice work!

The modified expected results for the tests will take a little getting used to, but they of course make total sense.

@phofl
Member Author

phofl commented Nov 28, 2021

Thx

@jreback
Contributor

jreback commented Nov 28, 2021

I think it's worth making this a sub-section in the whatsnew docs to highlight it.

@phofl
Member Author

phofl commented Nov 28, 2021

Added the sub section

@phofl
Member Author

phofl commented Nov 28, 2021

@jreback greenish

@jreback jreback merged commit 7f06a8a into pandas-dev:master Nov 28, 2021
@jreback
Contributor

jreback commented Nov 28, 2021

thanks @phofl

@jreback
Contributor

jreback commented Nov 28, 2021

@phofl
Member Author

phofl commented Nov 28, 2021

Are you sure? The previous commit was green; I only merged master and modified whatsnew afterwards.

@jreback
Contributor

jreback commented Nov 28, 2021

not sure at all - if it's not a problem on master then ignore

@phofl
Member Author

phofl commented Nov 28, 2021

Have seen similar things in the past, like https://github.com/pandas-dev/pandas/runs/4331169003, hence assumed this would be unrelated. Let's see if it passes on master.

@phofl phofl deleted the 14704 branch November 28, 2021 23:18
Successfully merging this pull request may close these issues.

read_csv(...,mangle_dupe_cols=True) causes silent data loss for certain column names. Request introduction of mangle_dupe_cols_str