Adding lambda support inside of __getitem__ for DataFrame, Series, .. etc. #2560

Closed
@spearsem

Description

To avoid the verbose syntax currently needed to select on conditions across many columns of a data frame, here's a suggestion. Inside DataFrame's __getitem__, add special-case logic for when a lambda function is passed in: apply the lambda to the data frame itself, then attempt to get the items based on the lambda's result.

Here's an example of what I mean. Suppose I create a data frame named "dfrm" with columns A, B, C, D, and E. Currently, the following syntax works to sub-select on conditions involving the A and B columns:

dfrm[(lambda x: (x.A < 0) & (x.B > 0))(dfrm)]

With the extra handling in __getitem__, there is no need for the last set of parentheses, where dfrm itself is passed as the argument to the lambda. __getitem__ can check for a callable and always pass itself to it, so the syntax would look like this:

dfrm[(lambda x: (x.A < 0) & (x.B > 0))]
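
To make the proposed dispatch concrete, here is a rough sketch of the check inside __getitem__. It deliberately does not use pandas: MiniFrame, its columns attribute, and the cond helper are hypothetical names made up for illustration, and the condition returns a plain list of booleans rather than a boolean Series. It only shows the "if the key is callable, call it with self first" idea, not how pandas would actually implement it.

```python
class MiniFrame:
    """Toy column store standing in for a DataFrame (hypothetical, not pandas)."""

    def __init__(self, columns):
        # columns: dict mapping column name -> list of values, all the same length
        self.columns = {name: list(values) for name, values in columns.items()}

    def __getitem__(self, key):
        # Proposed special case: a callable key is called with the frame itself,
        # and its result is then used as the real key.
        if callable(key):
            key = key(self)
        # A string key selects a single column.
        if isinstance(key, str):
            return self.columns[key]
        # Anything else is treated as a boolean mask over the rows.
        mask = list(key)
        return MiniFrame({
            name: [value for value, keep in zip(values, mask) if keep]
            for name, values in self.columns.items()
        })


dfrm = MiniFrame({"A": [-1, 2, -3], "B": [5, 6, -7]})


def cond(frame):
    # Row-wise version of (x.A < 0) & (x.B > 0) for the toy frame.
    return [a < 0 and b > 0 for a, b in zip(frame["A"], frame["B"])]


explicit = dfrm[cond(dfrm)]  # current style: the frame is passed by hand
implicit = dfrm[cond]        # proposed style: __getitem__ passes the frame
```

Both forms select the same rows; the only difference is who supplies the frame to the callable.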
