BUG: df.from_records() is all kinds of broken #2179

Closed
@ghost

Description

  • the columns argument gets clobbered:
    In [13]: pd.DataFrame.from_records({1:["foo"],2:["bar"]},columns=['a','b']).columns
    Out[13]: Index([1, 2], dtype=int64)
  • if index is specified, and result_index computed, it will get clobbered
    later on.
  • if index is specified as a list of labels, with the first few matching column names
    and the others not, then sdict will get mutilated, because the removal of columns
    occurs within the try clause rather than after success.
    EDIT - ignore, I misread the code.
  • There's duplication against the main DataFrame ctor, which also accepts dicts and arrays,
    and the exclusion and float coercion which are unique to from_records() would be useful
    to have generally available in the main ctor anyway.
  • It's unclear whether the columns argument should be specified relative to the original
    data, or relative to the data modulo excluded columns.
  • Doesn't support duplicate column names (although that's checked with
    a warning).
  • the docstring specifies the datatypes for data, which do not include a dict,
    but the code specifically checks for and handles dict input. This is a minor thing, but
    using columns along with a dict is not well defined because of key (non-)ordering,
    so the original docstring spec seems more sane.
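
A minimal sketch of how the ambiguous cases above can be sidestepped, assuming a reasonably recent pandas. The names key_order, new_names, rows, renamed, records and trimmed are made up for illustration and do not come from the pandas source. The first part avoids dict input entirely by stating the key order and passing row tuples; the second part shows exclude and coerce_float used together with columns, where columns labels the fields of the original data (exact behaviour may differ between pandas versions).

```python
from decimal import Decimal

import pandas as pd

# Dict input from the report: which list ends up under which label
# depends on how the keys are ordered and matched.
data = {1: ["foo"], 2: ["bar"]}

# Hypothetical workaround: spell out the key order and the new labels,
# then give from_records() plain row tuples instead of the dict.
key_order = [1, 2]
new_names = ["a", "b"]
rows = list(zip(*(data[k] for k in key_order)))
renamed = pd.DataFrame.from_records(rows, columns=new_names)

# exclude and coerce_float are the from_records()-only options mentioned above.
# Here columns names every field of the original records, and exclude then
# drops one of those names from the result.
records = [("foo", Decimal("1.5"), 10), ("bar", Decimal("2.5"), 20)]
trimmed = pd.DataFrame.from_records(
    records,
    columns=["name", "price", "qty"],  # labels for the original fields
    exclude=["qty"],                   # dropped after labelling
    coerce_float=True,                 # try to turn Decimal objects into floats
)

print(renamed)
print(trimmed.dtypes)
```

Passing rows rather than a dict keeps the meaning of columns unambiguous, which is in line with the original docstring spec.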