Skip to content

Conversation

ahkole
Copy link

@ahkole ahkole commented Apr 30, 2025

These changes are needed in order to wrap the streams in a TextIOWrapper for streaming access to files in text-mode in the disk-objectstore. A related PR is opened in aiida-core that uses this to provide streaming access in text-mode to files in the aiida repository.

Based on my testing, only PackedObjectReader is actually used in practice in my use-case of aiida, but the pre-commit hooks required the properties to be added as well to CallbackStreamWrapper and ZlibLikeBaseStreamDecompresser due to a type union. Also, based on my testing these three properties are the minimal set that was still missing for use with TextIOWrapper.

@ahkole
Copy link
Author

ahkole commented Apr 30, 2025

See aiidateam/aiida-core#6847 for the related PR in aiida-core.

@khsrali khsrali self-assigned this Apr 30, 2025
@ahkole
Copy link
Author

ahkole commented May 1, 2025

I've added some tests for the new attributes.

@ahkole
Copy link
Author

ahkole commented May 1, 2025

Also, I now added these attributes to the stream types that I believe cover the aiida use cases, but I don't know if it makes sense to add them as well to the other stream types in utils.py for consistency?

Copy link

codecov bot commented May 6, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.67%. Comparing base (e8df2f3) to head (361f332).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #192   +/-   ##
=======================================
  Coverage   99.66%   99.67%           
=======================================
  Files          10       10           
  Lines        2116     2143   +27     
=======================================
+ Hits         2109     2136   +27     
  Misses          7        7           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

@khsrali khsrali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @ahkole

Thanks a lot for the nice implementation, to me it looks good and reasonable.
I have a few comments on the tests.
Please have a look, once addressed this PR can be merged.

P.S.
I apologize for the delay, I was in a vacation 🥲

Comment on lines +1008 to +1030
def test_packed_object_reader_readable():
"""Test the ``PackedObjectReader.readable`` function."""
with tempfile.TemporaryFile() as handle:
reader = utils.PackedObjectReader(handle, 0, 0)

assert reader.readable()


def test_packed_object_reader_writable():
"""Test the ``PackedObjectReader.writable`` function."""
with tempfile.TemporaryFile() as handle:
reader = utils.PackedObjectReader(handle, 0, 0)

assert not reader.writable()


def test_packed_object_reader_closed():
"""Test the ``PackedObjectReader.closed`` property."""
with tempfile.TemporaryFile() as handle:
reader = utils.PackedObjectReader(handle, 0, 0)

assert not reader.closed
assert reader.closed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can merge this into test_packed_object_reader_mode just put your asserts in the end of that function, and maybe modify the test description.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion. I'll do this.

temp_container.pack_all_loose(compress=True)
temp_container.clean_storage()

# LazyLooseStream will never be opened
Copy link
Contributor

@khsrali khsrali May 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, you want to test here the situation that if LazyLooseStream is in use, but not opened, that should not have a conflict with your method of actually opening an stream, is that right?

In that case, I'd add merge test with around lines 3477, just after:
assert not loosepath.exists()

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I wanted to test here was specifically the closed property (which I added) for the case where you opened a ZlibLikeBaseStreamDecompresser that has an internal LazyLooseStream but where you don't wind back the file pointer so that the LazyLooseStream is actually never opened. This was just intended to cover all the bases/situations of the ZlibLikeBaseStreamDecompresser but not sure if this tests adds much or makes sense.

I'm not entirely sure btw what you meant with this:

the situation that if LazyLooseStream is in use, but not opened, that should not have a conflict with your method of actually opening an stream

What are you referring to with 'your method of opening an stream' and what kind of conflict could be there?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not entirely sure btw what you meant with this:

The same thing as you mentioned. I think this test make sense.
However, since the setup is similar, for more readability I'd suggest to move your test here, between:

assert not loosepath.exists()

and

lazy_loose.open_stream()

Comment on lines +3649 to +3652
for _, stream, _ in triplets:
assert isinstance(stream, utils.ZlibLikeBaseStreamDecompresser)
assert not stream.closed
assert stream.closed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How many stream will be there? and why to assert stream.closed only the last one?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand get_objects_stream_and_meta correctly, there will be one triplet for each hashkey in the list of arguments. Since I am supplying a single hashkey here I would expect there to only be a single stream. Hence, there is only one to check for closure after the with block. Maybe not the most easy to understand approach, but this was the only routine I could find so far that has a ZlibLikeBaseStreamDecompresser with an internal LazyLooseStream.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation I think it's clear now.
I suggest to get rid of the for loop, as there's only one triplet in the list:

Suggested change
for _, stream, _ in triplets:
assert isinstance(stream, utils.ZlibLikeBaseStreamDecompresser)
assert not stream.closed
assert stream.closed
assert len(triplets) == 1
stream = triplets[0][1]
assert isinstance(stream, utils.ZlibLikeBaseStreamDecompresser)
assert not stream.closed
assert stream.closed

@ahkole
Copy link
Author

ahkole commented May 28, 2025

Hi @khsrali

No worries about the delay. Thanks for having a look at the PR and for your comments. I'm currently busy preparing for a workshop. I'll reply quickly to your comments where I can. The rest I will address after the workshop.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants