Feature request: df that only returns dfs when indexing. #3237

Closed
@twiecki

Description

With more complex dataframes, I often stumble over this:

In [1]: x = pandas.DataFrame({'a': [1,2,3], 'b': [4,5,6]})
In [2]: x.ix[0].a
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-8-8358f19a1e70> in <module>()
----> 1 x.ix[0].a

AttributeError: 'Series' object has no attribute 'a'

In [3]: x.a.ix[0]
Out[3]: 1

This is just one example. At other times I end up converting the Series back to a DataFrame because that's what the rest of the code expects.

I know that there have been attempts to solve this issue by adding attribute lookup to Series (e.g. #1904) but they seem to come with a performance penalty.

Often, however, I care more about expressiveness than performance. I thus propose the addition of an option like:

x = DataFrame(data, slicing_returns_df=True)

This would cause x.ix[0] or x.a to return a DataFrame rather than a Series, which would make the example above work. The default would be False, so there would be no backward-compatibility issues. Alternatively, there could be a new class that inherits from DataFrame and has the desired behavior.
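For comparison, here is a minimal sketch of how a DataFrame can be kept as the result with indexing that pandas already supports, without the proposed slicing_returns_df option (which is hypothetical and does not exist). It assumes a pandas version that provides .iloc; .ix itself was later removed from pandas, so the sketch uses .iloc instead.

```python
import pandas as pd

x = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

# Passing a list of positions or labels keeps the result two-dimensional.
row = x.iloc[[0]]   # one-row DataFrame instead of a Series
col = x[['a']]      # one-column DataFrame instead of a Series

# Attribute access works on the one-row frame because it is still a DataFrame.
first_a = row.a.iloc[0]

print(type(row).__name__, type(col).__name__, first_a)
```

This covers the row.a case from the example, but it has to be spelled out at every call site, which is the inconvenience the proposed option is meant to remove.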

I'm happy to give this a crack. However, I first wanted to make sure that I'm not the only one who thinks this would be a good idea, and that there isn't some obvious reason X why it can't work.
