Skip to content

gh-112346: Bugfix: Remove faster codepath from gzip.compress as it introduces behavioral inconsistencies #114116

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 8 commits into from

Conversation

rhpvorderman
Copy link
Contributor

@rhpvorderman rhpvorderman commented Jan 16, 2024

As mentioned in the linked issue, delegating to zlib.compress causes behavorial changes in the OS byte. As a result this change caused reproducible build failures. The separate codepath caused issues before as well:
#90425

Therefore the codepath should be removed. Speed ain't everything. The mention of zlib.compress in the documentation is maintained so users who not have specific requirements can use the speedier path.

This change needs to be backported to 3.11 and 3.12


📚 Documentation preview 📚: https://cpython-previews--114116.org.readthedocs.build/

@rhpvorderman rhpvorderman changed the title gh-112346: Remove faster codepath from gzip.compress as it introduces behavioral inconsistencies gh-112346: Bugfix: Remove faster codepath from gzip.compress as it introduces behavioral inconsistencies Jan 16, 2024
@rhpvorderman
Copy link
Contributor Author

So is there any procedure for News entries for bugfixes that need to be backported to older versions of CPython?

@serhiy-storchaka serhiy-storchaka self-requested a review January 18, 2024 08:13
@iii-i
Copy link
Contributor

iii-i commented Feb 28, 2024

I would suggest the following fixup to make this work on BE (mtime is always LE, regardless of the host endianness):

--- a/Lib/test/test_gzip.py
+++ b/Lib/test/test_gzip.py
@@ -725,7 +725,7 @@ def test_issue112346(self):
         for mtime in (0, 42):
             with self.subTest(mtime=mtime):
                 compress = gzip.compress(data1, compresslevel=1, mtime=mtime)
-                assert struct.unpack("IxB", compress[4:10]) == (mtime, 255)
+                assert struct.unpack("<IxB", compress[4:10]) == (mtime, 255)
 
     def test_decompress(self):
         for data in (data1, data2):

@rhpvorderman
Copy link
Contributor Author

Thanks for the suggestion @iii-i . That was an oversight on my part. I am going to pin in the hope this fix gets noticed.

@rhpvorderman
Copy link
Contributor Author

ping

@rhpvorderman
Copy link
Contributor Author

Ping @gpshead . Could you shine your light on this? It's only a very small change and it has remained under the radar for almost half a year now.

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, except using assert in the test.

for mtime in (0, 42):
with self.subTest(mtime=mtime):
compress = gzip.compress(data1, compresslevel=1, mtime=mtime)
assert struct.unpack("<IxB", compress[4:10]) == (mtime, 255)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use TestCase assertion methods instead of the assert statement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@rhpvorderman
Copy link
Contributor Author

Thanks for reviewing this so quickly @serhiy-storchaka . I added a news entry.

@gpshead gpshead added needs backport to 3.13 bugs and security fixes needs backport to 3.12 only security fixes labels Jun 12, 2024
@gpshead
Copy link
Member

gpshead commented Jun 12, 2024

I suggest not making this change given how long it has been the behavior of released Pythons. We cannot backport this to 3.11 as it is not a security issue. And anyone picking a change like this up in a patch release is just as likely to run into problems with the behavior change between differing Python versions.

@gpshead gpshead marked this pull request as draft June 12, 2024 21:59
@rhpvorderman
Copy link
Contributor Author

@gpshead That makes sense. Sorry, I went into panic mode when code changes I made gave someone some reproducibility troubles.
I do hope that the two separate code paths do not cause any more in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs backport to 3.12 only security fixes needs backport to 3.13 bugs and security fixes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants