Dictionary-like functions are broken in groupby.aggregation if they are not passed via the func argument #2254

@dchigarev

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Any
  • Modin version (modin.__version__): 0.8.1.1
  • Python version: 3.7.5
  • Code we can use to reproduce:
if __name__ == "__main__":
    import pandas
    import modin.pandas as pd
    import numpy as np

    data = {
        "col1": [0, 1, 2, 3],
        "col2": [4, 5, 6, 7],
        "col3": [8, 9, 12, 10],
        "col4": [17, 13, 16, 15],
        "col5": [-4, -5, -6, -7],
    }

    pd_grp = pandas.DataFrame(data).groupby("col1")
    md_grp = pd.DataFrame(data).groupby("col1")

    pd_res = pd_grp.agg(max=("col3", np.max))
    md_res = md_grp.agg(max=("col3", np.max))

    print(f"Pandas result:\n{pd_res}")
    print(f"\nModin result:\n{md_res}")
Output
Pandas result:
      max
col1
0       8
1       9
2      12
3      10

Modin result:
Empty DataFrame
Columns: []
Index: []

Describe the problem

The root cause seems to be here:

if isinstance(func, dict) or func is None:
    if func is None:
        func = {}

If the dict-like functions are passed via kwargs, as in agg(max=("col3", np.max)) above, func is None, so the resulting func dict is empty, which would explain the empty result.
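One possible direction, sketched below, is to fold the named-aggregation kwargs into the func dict before that check runs. This is only an illustration under stated assumptions, not Modin's actual fix: normalize_agg_args is a hypothetical helper name, and it assumes every kwarg is an (input column, aggregation function) pair in the style of pandas named aggregation. The built-in max stands in for np.max so the sketch needs no third-party imports.

```python
def normalize_agg_args(func=None, **kwargs):
    """Hypothetical helper (not part of Modin or pandas).

    If `func` is None, build a {column: [aggfuncs]} dict from
    named-aggregation kwargs such as max=("col3", max).
    Returns (func_dict, output_names); output_names lists the
    requested result column names in the order they were given.
    """
    if func is not None:
        # An explicit func was passed: leave it untouched.
        return func, []

    func_dict = {}
    output_names = []
    for out_name, spec in kwargs.items():
        # Assumption: each kwarg is an (input column, aggfunc) pair.
        if not (isinstance(spec, tuple) and len(spec) == 2):
            raise TypeError(
                f"expected a (column, aggfunc) pair for {out_name!r}"
            )
        column, aggfunc = spec
        func_dict.setdefault(column, []).append(aggfunc)
        output_names.append(out_name)
    return func_dict, output_names


# Mirrors the call from the reproducer: agg(max=("col3", np.max)).
func_dict, output_names = normalize_agg_args(max=("col3", max))
```

With the kwargs folded in this way, the func dict is no longer empty when the isinstance check runs, and output_names keeps the information needed to rename the result columns afterwards.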

Labels

P1 (Important tasks that we should complete soon), bug 🦗 (Something isn't working)