
Import dask and cubed only once #457


Merged · 10 commits into xarray-contrib:main · Aug 15, 2025

Conversation

eendebakpt (Contributor)

We recently found a performance bottleneck in the xarray groupby method. Part of the bottleneck was repeated attempted imports of cubed. In this PR we refactor the code to import cubed and dask only once.
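
The pattern is roughly the following (a minimal sketch of the idea, assuming a cached helper; the names here are illustrative and not the actual helpers added in this PR): attempt the optional import once, cache the result, and reuse it instead of re-trying the import on every call.

import functools


@functools.cache
def _get_cubed():
    # The import is attempted only on the first call; subsequent calls
    # return the cached result (the module, or None if cubed is absent).
    try:
        import cubed
    except ImportError:
        return None
    return cubed


def is_cubed_array(obj) -> bool:
    # Assumes cubed exposes an Array class; purely illustrative.
    cubed = _get_cubed()
    return cubed is not None and isinstance(obj, cubed.Array)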

dcherian (Collaborator) commented Aug 13, 2025

Nice, thanks. Looks like there's one more mypy thing to fix. I'm happy to immediately release once that is fixed.

> We recently found a performance bottleneck in the xarray groupby method.

I'm curious to learn if you've found other things. I spent quite some time optimizing this stuff a couple of years ago.

eendebakpt (Contributor, Author)

> Nice, thanks. Looks like there's one more mypy thing to fix. I'm happy to immediately release once that is fixed.

On my local system mypy is giving different output, so I am doing a bit of trial and error here.

> We recently found a performance bottleneck in the xarray groupby method.

> I'm curious to learn if you've found other things. I spent quite some time optimizing this stuff a couple of years ago.

Nothing specific. Our use case has relatively small arrays; maybe the optimizations were tuned and tested on larger datasets?
Here is a minimal example I extracted:

import xarray as xr
import time
import numpy as np


def method(ds: xr.Dataset):
    # group by "pauli", then take the per-"clifford" mean of the counts;
    # the result is discarded, we only care about the timing
    pauli_groups = ds.groupby("pauli")
    for _, sub in pauli_groups:
        da = sub.groupby("clifford").mean().counts


k = 12
mult = 30
index = np.arange(k*mult)
bitindex = np.arange(4)
counts = np.random.randint(0, 10, size=(4, index.size))
c = xr.DataArray(counts, coords={'bit_index': bitindex, 'index': index})
ds = xr.Dataset({'counts': c})
ds = ds.assign_coords({'clifford': xr.DataArray(
    np.tile(np.arange(k), mult), dims='index')})
p = xr.DataArray(np.tile(np.arange(4), (k//4) * mult), dims='index')
for ii in range(16):  # the value here has a large impact on the running time
    p.data[ii] = 1000 + ii
ds = ds.assign_coords({'pauli': p})
print(ds)

# time repeated calls to method
nn = 20
for _ in range(4):
    t0 = time.time()
    for _ in range(nn):
        method(ds)
    dt = time.time() - t0
    print(f'time {1e3*dt/nn:.2f} [ms / iteration]')

On my system a single call to method takes about 50 ms. I suspect that the equivalent calculation with plain numpy arrays would be much faster (not a fair comparison, since xarray has to keep track of much more data!). Profiling shows that the call to mean() actually takes most of the time.
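
For reference, here is a rough numpy-only equivalent of the grouped mean (a sketch under the assumption that both groupings are over the 'index' dimension; the helper name is made up and it is not a drop-in replacement for the xarray call):

import numpy as np


def grouped_mean_numpy(counts, pauli, clifford):
    # counts: (n_bits, n) array; pauli and clifford: length-n label arrays.
    # For each pauli label, compute the mean of counts per clifford label.
    results = {}
    for pv in np.unique(pauli):
        mask = pauli == pv
        sub = counts[:, mask]
        labels = clifford[mask]
        uniq = np.unique(labels)
        means = np.stack([sub[:, labels == cv].mean(axis=1) for cv in uniq], axis=1)
        results[pv] = (uniq, means)
    return results

Timing grouped_mean_numpy(counts, p.data, np.tile(np.arange(k), mult)) against the xarray version on the same labels should give an idea of how much of the ~50 ms is groupby/indexing overhead rather than arithmetic.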

dcherian merged commit cbcc035 into xarray-contrib:main on Aug 15, 2025. 17 checks passed.