fix is_time to avoid memory overload #397
for more information, see https://pre-commit.ci
Pull Request Test Coverage Report for Build 13905960780 (Coveralls)
clisops/utils/dataset_utils.py
Outdated
def is_time(coord):
    """
    Determines if a coordinate is time.

    :param coord: coordinate of xarray dataset e.g. coord = ds.coords[coord_id]
    :return: (bool) True if the coordinate is time.
    """
    if coord.ndim >= 2:
Should it really be False by default for time_bnds? Should time_bnds not be considered a time coordinate?
I don't know ... but if I skip the check/filter, then I have to deal with it in other places ... see L106.
In l64, in the function get_coord_by_type, one could replace
    coord_vars = list(ds.coords) + list(ds.data_vars)
with
    coord_vars = (list(ds.coords) + list(ds.data_vars)).remove(get_main_variable(ds))
Then is_time should no longer be run for variables that do not fit in memory, as it was before.
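One caveat about the replacement suggested above: in Python, list.remove mutates the list in place and returns None, so chaining it onto the concatenated list would leave coord_vars set to None. A minimal sketch of the same idea using a filter instead follows. The helper name candidate_names and the sample variable names are hypothetical, and main_var stands in for whatever get_main_variable(ds) would return.

```python
def candidate_names(coords, data_vars, main_var):
    """Return all coordinate and data variable names except the main variable."""
    return [name for name in list(coords) + list(data_vars) if name != main_var]


# Hypothetical variable names, for illustration only.
coords = ["time", "lat", "lon"]
data_vars = ["tas", "time_bnds"]

# list.remove works in place and returns None, so the chained form yields None.
chained = (list(coords) + list(data_vars)).remove("tas")

# Filtering builds a new list and leaves the inputs untouched.
filtered = candidate_names(coords, data_vars, "tas")

print(chained)
print(filtered)
```

With the filtered list, the large main variable is never passed to is_time, which is the effect the comment is after.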
clisops/utils/dataset_utils.py
Outdated
        return coord_id, [x for x in coords if x != coord_id]
    else:
        return coord_id
if coord_id in ds.coords:
make sure coord_id is in ds.coords (lat_bnds is not)
I have added this. There was another issue that
@sol1105 tests are working now. We have updated to the latest Not sure how we solve this in future. We could patch
@cehbrecht I asked my colleague @aulemahal to look into the issue on the For reference: pydata/xarray#8821
@Zeitsperre @sol1105 good to go?
Looks good. Small changes.
@cehbrecht Don't forget to update the
Co-authored-by: Trevor James Smith <[email protected]>
Prepare v0.16.0
Pull Request Checklist:
What kind of change does this PR introduce?:
This is a fix for the is_time function to avoid memory overload when using coord.values for a data variable.
Does this PR introduce a breaking change?:
Other information:
This issue occurred when testing atlas-v2 data with very large datasets (several GBs):
https://github.com/roocs/rook/blob/dev-atlas-v2/notebooks/atlas-v2.ipynb
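The idea behind the fix can be sketched as follows: check cheap metadata (here, ndim) before touching coord.values, so that a large multi-dimensional data variable is rejected without being loaded. This is an illustration under stated assumptions, not the clisops implementation. LazyArray is a hypothetical stand-in for an xarray DataArray so that the example runs without dependencies, and is_time_sketch omits any other checks the real is_time may perform.

```python
import datetime


class LazyArray:
    """Hypothetical stand-in for an xarray DataArray backed by lazy data."""

    def __init__(self, ndim, data):
        self.ndim = ndim
        self._data = data
        self.loads = 0  # counts how often .values was accessed

    @property
    def values(self):
        # In xarray, accessing .values loads the whole array into memory.
        self.loads += 1
        return self._data


def is_time_sketch(coord):
    """Return True if the object looks like a 1-D time coordinate."""
    # Guard first: multi-dimensional variables are rejected without
    # touching .values (this also rejects 2-D bounds such as time_bnds).
    if coord.ndim >= 2:
        return False
    values = coord.values
    return len(values) > 0 and isinstance(values[0], datetime.datetime)


time_coord = LazyArray(1, [datetime.datetime(2000, 1, 1)])
big_var = LazyArray(3, [[[0.0]]])

print(is_time_sketch(time_coord), time_coord.loads)
print(is_time_sketch(big_var), big_var.loads)
```

The trade-off raised in the review applies to this sketch too: a two-dimensional variable such as time_bnds is reported as not being time.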