Binary operations don't broadcast across multiindex #5645

Closed
@roblevy

related #6360

Based on this SO question

Consider the following two Series:

x = pd.DataFrame({'year':[1,1,1,1,2,2,2,2],
                  'country':['A','A','B','B','A','A','B','B'],
                  'prod':[1,2,1,2,1,2,1,2],
                  'val':[10,20,15,25,20,30,25,35]})
x = x.set_index(['year','country','prod']).squeeze()

y = pd.DataFrame({'year':[1,1,2,2],'prod':[1,2,1,2],
                  'mul':[10,0.1,20,0.2]})
y = y.set_index(['year','prod']).squeeze()

which look like:

    year  country  prod
    1     A        1       10
                   2       20
          B        1       15
                   2       25
    2     A        1       20
                   2       30
          B        1       25
                   2       35

year  prod
1     1       10.0
      2        0.1
2     1       20.0
      2        0.2

A task I run into very often is performing a binary operation that distributes the values of y over the levels of x that y does not have (here, country). For example, I'd like to multiply all values of product 1 in year 1 by 10.0, regardless of country.

The required result is therefore as follows:

    year  country  prod
    1     A        1       100.0
                   2       2.0
          B        1       150.0
                   2       2.5
    2     A        1       400.0
                   2       6.0
          B        1       500.0
                   2       7.0

The binary operation .mul() doesn't broadcast over the shared levels; passing them via level raises instead:

>>> x.mul(y, level=['year','prod'])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/series.py", line 334, in f
    return self._binop(other, op, level=level, fill_value=fill_value)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/series.py", line 2075, in _binop
    this, other = self.align(other, level=level, join='outer')
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/series.py", line 2570, in align
    return_indexers=True)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/index.py", line 954, in join
    return_indexers=return_indexers)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/index.py", line 1058, in _join_level
    raise Exception('Join on level between two MultiIndex objects '
Exception: Join on level between two MultiIndex objects is ambiguous

To create the required result, the user currently has to do this:

x = x.reset_index('country').sort_index()
x.val = x.val * y
x = x.reset_index().set_index(['year',
                               'country',
                               'prod']).sortlevel()
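
For comparison, here is a minimal sketch of the same workaround written as a left merge on the shared levels. It is not a built-in broadcast and not necessarily how pandas should solve this; it assumes a reasonably recent pandas, and the names merged and result are only illustrative. It also avoids sortlevel(), which as far as I know later pandas releases dropped in favor of sort_index().

```python
import pandas as pd

# Rebuild the two Series from the example above.
x = pd.DataFrame({'year': [1, 1, 1, 1, 2, 2, 2, 2],
                  'country': ['A', 'A', 'B', 'B', 'A', 'A', 'B', 'B'],
                  'prod': [1, 2, 1, 2, 1, 2, 1, 2],
                  'val': [10, 20, 15, 25, 20, 30, 25, 35]})
x = x.set_index(['year', 'country', 'prod']).squeeze()

y = pd.DataFrame({'year': [1, 1, 2, 2], 'prod': [1, 2, 1, 2],
                  'mul': [10, 0.1, 20, 0.2]})
y = y.set_index(['year', 'prod']).squeeze()

# Flatten both Series to columns and attach the matching multiplier to
# every row of x. A left merge keeps the row order of x.
merged = x.reset_index().merge(y.reset_index(),
                               on=['year', 'prod'], how='left')

# Restore the original three-level index and multiply element-wise.
merged = merged.set_index(['year', 'country', 'prod'])
result = (merged['val'] * merged['mul']).rename('val')
```

On the sample data this should give the required result above, with the (year, country, prod) index intact. The merge makes the intent explicit (match on year and prod, repeat across country), at the cost of a round trip through a DataFrame.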
