ds.to_zarr(mode="a", append_dim="time") not capturing any time steps under Hours #3379

Closed
@jminsk-cc

MCVE Code Sample

import datetime

import xarray as xr


date = datetime.datetime(2019, 1, 1, 1, 10)
# Read one file of MRMS data (2-minute time steps); dir_path is the path to that file
ds = xr.open_rasterio(dir_path)
ds.name = "mrms"
ds["time"] = date
ds = ds.expand_dims("time")
ds = ds.to_dataset()

ds.to_zarr("fin_zarr", compute=False, mode="w-")

date = datetime.datetime(2019, 1, 1, 1, 12)
# Read MRMS data for the next 2-minute time step.
# The same file can be reused because the time value is assigned manually.
ds = xr.open_rasterio(dir_path)
ds.name = "mrms"
ds["time"] = date
ds = ds.expand_dims("time")
ds = ds.to_dataset()

ds.to_zarr("fin_zarr", compute=False, mode="a", append_dim="time")

Expected Output

<xarray.Dataset>
Dimensions:  (band: 1, time: 1, x: 7000, y: 3500)
Coordinates:
  * band     (band) int64 1
  * y        (y) float64 55.0 54.99 54.98 54.97 ... 20.04 20.03 20.02 20.01
  * x        (x) float64 -130.0 -130.0 -130.0 -130.0 ... -60.03 -60.02 -60.01
  * time     (time) datetime64[ns] 2019-01-01T01:10:00
Data variables:
    mrms     (time, band, y, x) uint8 255 255 255 255 255 ... 255 255 255 255

The dataset above, with the following dataset appended to it via ds.to_zarr(mode="a", append_dim="time"):

<xarray.Dataset>
Dimensions:  (band: 1, time: 1, x: 7000, y: 3500)
Coordinates:
  * band     (band) int64 1
  * y        (y) float64 55.0 54.99 54.98 54.97 ... 20.04 20.03 20.02 20.01
  * x        (x) float64 -130.0 -130.0 -130.0 -130.0 ... -60.03 -60.02 -60.01
  * time     (time) datetime64[ns] 2019-01-01T01:12:00
Data variables:
    mrms     (time, band, y, x) uint8 255 255 255 255 255 ... 255 255 255 255

should look like this:

<xarray.Dataset>
Dimensions:  (band: 1, time: 2, x: 7000, y: 3500)
Coordinates:
  * band     (band) int64 1
  * time     (time) datetime64[ns] 2019-01-01T01:10:00 2019-01-01T01:12:00
  * x        (x) float64 -130.0 -130.0 -130.0 -130.0 ... -60.03 -60.02 -60.01
  * y        (y) float64 55.0 54.99 54.98 54.97 ... 20.04 20.03 20.02 20.01
Data variables:
    mrms     (time, band, y, x) uint8 dask.array<shape=(2, 1, 3500, 7000), chunksize=(1, 1, 438, 1750)>

Problem Description

Instead, the output looks like this:

<xarray.Dataset>
Dimensions:  (band: 1, time: 2, x: 7000, y: 3500)
Coordinates:
  * band     (band) int64 1
  * time     (time) datetime64[ns] 2019-01-01T01:10:00 2019-01-01T01:10:00
  * x        (x) float64 -130.0 -130.0 -130.0 -130.0 ... -60.03 -60.02 -60.01
  * y        (y) float64 55.0 54.99 54.98 54.97 ... 20.04 20.03 20.02 20.01
Data variables:
    mrms     (time, band, y, x) uint8 dask.array<shape=(2, 1, 3500, 7000), chunksize=(1, 1, 438, 1750)>

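The minutes of the first time stamp are repeated for every time step appended within the same hour; the time value only changes once a time step from a new hour is appended. The append does not seem to handle minutes correctly.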
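One way this symptom could arise (an assumption, not something confirmed in this report) is that the time coordinate is stored as whole-number offsets from a reference time in a unit coarser than the 2-minute step, so that sub-hour differences are truncated on write. The sketch below uses only the standard library and does not call xarray or zarr; `encode` and `decode` are hypothetical helpers written for illustration.

```python
from datetime import datetime, timedelta


def encode(times, reference, unit):
    """Store each time as a whole number of `unit` since `reference` (floor division)."""
    return [(t - reference) // unit for t in times]


def decode(counts, reference, unit):
    """Rebuild datetimes from the stored integer counts."""
    return [reference + n * unit for n in counts]


reference = datetime(2019, 1, 1, 1, 10)
times = [
    datetime(2019, 1, 1, 1, 10),
    datetime(2019, 1, 1, 1, 12),  # 2 minutes later, same hour
    datetime(2019, 1, 1, 2, 10),  # one full hour later
]

hour = timedelta(hours=1)
minute = timedelta(minutes=1)

# Coarse unit: the 01:12 step collapses onto 01:10, as in the output above.
coarse = decode(encode(times, reference, hour), reference, hour)

# Fine unit: every time step survives the round trip.
fine = decode(encode(times, reference, minute), reference, minute)

print(coarse)
print(fine)
```

If this is what is happening, checking the `units` attribute stored with the time variable in "fin_zarr" would confirm it, and passing explicit time units through the `encoding` argument of the first ds.to_zarr() call might be a workaround.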

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.3 (default, Mar 27 2019, 16:54:48)
[Clang 4.0.1 (tags/RELEASE_401/final)]
python-bits: 64
OS: Darwin
OS-release: 18.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.1

xarray: 0.12.3
pandas: 0.24.2
numpy: 1.16.4
scipy: 1.3.0
netCDF4: 1.4.2
pydap: None
h5netcdf: None
h5py: 2.9.0
Nio: None
zarr: 2.3.2
cftime: 1.0.3.4
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.0.21
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 2.1.0
distributed: 2.1.0
matplotlib: 3.1.0
cartopy: 0.17.0
seaborn: 0.9.0
numbagg: None
setuptools: 41.0.1
pip: 19.1.1
conda: 4.7.12
pytest: 5.0.1
IPython: 7.6.1
sphinx: 2.1.2

Labels: bug, topic-zarr (Related to zarr storage library)
