Description
What happened?
Hi!
I want to open a zarr dataset lazily.
On my computer:
- With numpy==1.26.4, it takes around 1.5 s.
- With numpy==2.1.1, it takes around 5 s.
It's also slow on an Ubuntu machine.
Unfortunately, I have not had the time to dig into the issue and pinpoint exactly which piece of code takes so much longer than before. From the little testing I did, the slowdown does not seem to come from the HTTP calls.
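One way to narrow this down would be to profile the call with the standard-library `cProfile` under each numpy version and compare the reports. The following is only a minimal sketch: `profile_call` and `profile_open` are hypothetical helper names (not part of xarray), and it has not been run against either environment.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=25, **kwargs):
    """Run func under cProfile and return (result, text report).

    The report lists the `top` entries sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


def profile_open(url):
    """Hypothetical helper: profile a lazy open of a zarr store at `url`."""
    # Imported before profiling starts, so import time is not in the report.
    import xarray

    dataset, report = profile_call(xarray.open_dataset, url, engine="zarr")
    print(report)
    return dataset
```

Calling `profile_open` with the dataset URL once with numpy==1.26.4 and once with numpy==2.1.1 should show which functions account for the extra cumulative time.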
What did you expect to happen?
I expect the time to lazily open the dataset to be the same regardless of the numpy version.
Minimal Complete Verifiable Example
import xarray
import time
top = time.time()
dataset = xarray.open_dataset(
"https://s3.waw3-1.cloudferro.com/mdl-arco-time-035/arco/MEDSEA_MULTIYEAR_PHY_006_004/med-cmcc-cur-rean-h_202012/timeChunked.zarr",
engine="zarr",
)
print(f"Took: {time.time() - top}s")
# with numpy==1.26.4: ~1s
# with numpy==2.1.1: ~5s
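Since a single `time.time()` measurement of a remote store includes network variability, repeating the call a few times may give a steadier comparison. This is a sketch under that assumption; `time_repeated` and `summarize` are hypothetical helper names, and later runs in the same process may benefit from caches, so the first duration is worth reading separately.

```python
import statistics
import time


def time_repeated(func, repeats=5):
    """Call func() `repeats` times; return wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return the min, median and max of a list of durations."""
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }
```

For the example above, `func` could be something like `lambda: xarray.open_dataset(url, engine="zarr")`, with `url` set to the store address.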
MVCE confirmation
- Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- Complete example — the example is self-contained, including all data and the text of any traceback.
- Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- New issue — a search of GitHub Issues suggests this is not a duplicate.
- Recent environment — the issue occurs with the latest version of xarray and its dependencies.
Relevant log output
No response
Anything else we need to know?
No response
Environment
With numpy==2.1.1:
xarray: 2024.9.0
pandas: 2.2.3
numpy: 2.1.1
scipy: None
netCDF4: 1.7.1.post2
pydap: None
h5netcdf: None
h5py: None
zarr: 2.18.3
cftime: 1.6.4
nc_time_axis: None
iris: None
bottleneck: None
dask: 2024.9.0
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: 2024.9.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 75.1.0
pip: 24.0
conda: None
pytest: 8.3.3
mypy: None
IPython: 8.27.0
sphinx: None
With numpy==1.26.4:
xarray: 2024.9.0
pandas: 2.2.3
numpy: 1.26.4
scipy: None
netCDF4: 1.7.1.post2
pydap: None
h5netcdf: None
h5py: None
zarr: 2.18.3
cftime: 1.6.4
nc_time_axis: None
iris: None
bottleneck: None
dask: 2024.9.0
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: 2024.9.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 75.1.0
pip: 24.0
conda: None
pytest: 8.3.3
mypy: None
IPython: 8.27.0
sphinx: None