[infra] Upgrade Python to 3.10.14 in base-builder & base-runner Images #12027
The changes introduced here upgrade Python from 3.8 to 3.10.14 inside the base-builder and base-runner images.

### base-builder changes:

Prior to these changes, base-builder compiled Python 3.8 from source using sources downloaded from the official release servers at https://www.python.org/ftp/python/. This updates the compiled version to 3.10.14 (the latest 3.10 release) instead.

### base-runner changes:

Prior to these changes, base-runner installed Python 3.8 from the default apt repository provided by the Ubuntu 20.04 image it's based on. These apt repositories do not have a version of Python 3.10 available by default. This updates the base-runner to instead use a multi-stage build to copy the same Python interpreter compiled by the base-builder image into the runner image, which ensures both Python versions remain in sync while saving build time by re-using a pre-built version.

## Motivation

- Code coverage does not work on Python projects that use Python 3.10+ syntax, and will not work until this or similar changes land (see google#11419)
- Upgrading the base-image to use Ubuntu 22.04 (which provides more recent Python versions via apt) has been stated as being unlikely to happen any time soon (see google#3290)
- Many OSS-Fuzz integrated Python projects no longer support Python 3.8 and have resorted to implementing ad-hoc workarounds to upgrade to newer Python versions, including installing Python from the Dead Snakes PPA.
  - This leads to fragmentation and hard-to-debug issues. Maintenance is easier when everyone is using the same version without issue.
- With [Python 3.8 reaching end of life soon (in 2024-10)][python-versions-EOL], it is likely that more Python projects will begin dropping support for 3.8, further increasing the number of broken builds and ad-hoc workarounds.
- Previous attempts at upgrading Python have stalled.

## Known & Expected Issues

Several project Dockerfiles and build scripts contain hard-coded references to python3.8 file system paths, and many more have implemented ad-hoc workarounds to upgrade to Python versions newer than 3.8 (typically 3.9). Additional changes are required to each of these projects to ensure they build successfully after this upgrade to Python 3.10.

### Fuzz Introspector Caveat

Fuzz Introspector currently uses Python 3.9. While an upgrade to 3.10 is not expected to introduce any new issues, it was not tested with these changes and may require additional work.

## Possible Areas of Improvement

Using the base-builder image in a multi-stage build to copy the pre-compiled Python into base-runner is effective, but feels like a workaround that may be introducing tech debt. A cleaner approach would be to extract the Python compilation into a discrete base image similar to how `base-clang` works, and use that as the multi-stage builder in images that need it.

---

Fixes:
- google#11419

Supersedes:
- google#9532
- google#11420

[python-versions-EOL]: https://devguide.python.org/versions/
/gcbrun trial_build.py all --sanitizer coverage address --fuzzing-engine libfuzzer
Thanks for the runs. I'll check the timeouts in about 24 hours.
`MarkupSafe` is a transitive dependency through `code_coverage`'s Jinja2 requirement. The previously pinned version, `MarkupSafe==0.23`, is incompatible with Python 3.10 and raises the following error:

```
ImportError: cannot import name 'Mapping' from 'collections'
```

Upgrading MarkupSafe to a compatible version requires `code_coverage`'s Jinja2 requirement to be bumped from `Jinja2==2.10` to `2.10.3`.

The `sed` change introduced here is not ideal, but is required until the upstream requirement is bumped. At that point, the `sed` should become a no-op.
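For context on the `ImportError` above: the abstract base classes were moved to `collections.abc`, and the old aliases in the top-level `collections` module were removed in Python 3.10. The snippet below is only a minimal sketch of the compatible import pattern; the helper `is_mapping` is a hypothetical name used for illustration and is not taken from MarkupSafe or Jinja2.

```python
# Mapping lives in collections.abc; the alias in `collections`
# no longer exists on Python 3.10+.
from collections.abc import Mapping


def is_mapping(obj):
    """Hypothetical helper: True if obj implements the Mapping interface."""
    return isinstance(obj, Mapping)


print(is_mapping({"a": 1}))
print(is_mapping([1, 2]))
```

Code that still does `from collections import Mapping` has to be upgraded or patched, which is why the pinned dependency versions needed to change here.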
@jonathanmetzman I think e1a6e9f should fix the broken coverage builds.
/gcbrun trial_build.py all --sanitizer coverage address --fuzzing-engine libfuzzer
Thanks for grabbing that list @DonggeLiu! I've started looking over them and will report back. The few I've seen so far appear to be caused by ad-hoc Python version updates in build scripts, which is reassuring. FYI for anyone interested, here's the list formatted as a table:
Failures Table
Updated the `hook_pre_exec_eval` function in `command_injection.py` to accept additional arguments (`*args`, `**kwargs`). This resolves a `TypeError` encountered in Python 3.10 where the function was called with more arguments than expected. The change ensures compatibility with Python 3.10 by aligning the function signature with the arguments passed by the `add_hook` mechanism.

Also replaces the deprecated `importlib.find_loader` method call with the recommended `importlib.util.find_spec` alternative.

These changes were tested by running the "proof-of-exploit" examples, the pyscan tests in this project, and by running `check_build` on several projects (such as `black`) that enable Pyscan.
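The two changes described above can be sketched roughly as follows. This is an illustrative approximation only: the function bodies and the helper name `module_available` are assumptions made for the example and do not reproduce the actual code in `command_injection.py`.

```python
import importlib.util


def hook_pre_exec_eval(event, *args, **kwargs):
    """Sketch of a hook whose signature tolerates extra caller arguments."""
    # Illustrative body only; the real hook inspects the arguments it receives.
    return event


def module_available(name):
    """Hypothetical helper: True if top-level module `name` is importable."""
    # importlib.util.find_spec returns None when a top-level module
    # cannot be found, and does not import the module itself.
    return importlib.util.find_spec(name) is not None


# Extra positional and keyword arguments no longer raise a TypeError.
print(hook_pre_exec_eval("exec", "payload", frame=None))
print(module_available("json"))
```

Accepting `*args` and `**kwargs` keeps the hook working even if the caller passes a different number of arguments across Python versions.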
Atheris: Among many useful patches, the Python 3.10 compatibility fixes in v2.2.2 are of particular note. See https://github.com/google/atheris/releases/tag/2.2.2

Pyinstaller: Dependency collection was improved significantly between Pyinstaller v5 and v6, in both the core library and the more recent `pyinstaller-hooks-contrib` package it ships with. Pyinstaller versions 3.9.0 & 3.10.0 are particularly noteworthy. 3.9.0 includes updates for scipy, numpy 2.0.0, and Django to fix compatibility issues. 3.10.0 implements support for setuptools >= 71.0.0 and its new approach to vendoring its dependencies. See: https://setuptools.pypa.io/en/latest/history.html

Setuptools: Many projects expect a more recent version of setuptools than was previously installed, including the pysecsan sanitizer from this repo: `infra/base-images/base-builder/sanitizers/pysecsan/`
Fixes `SetuptoolsDeprecationWarning` warnings during Pyscan installation.

See:
- https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html
- pypa/setuptools#917
@jonathanmetzman @DonggeLiu Could you please start a trial build with the latest changes? Thanks in advance! The following information provides context for the changes and is intended for reference.
Analysis of Recent Failures
After reviewing the recent run failures, here's what I found:
JVM/Java project failures
The failures in JVM/Java projects appear to be network errors unrelated to this PR. It seems that
Ensures the CI actions use the same Python version as OSS-Fuzz images.
Versions with multiple digits after the first "." in the version number must be quoted strings; otherwise the GH action runner does not read the whole version number and actions fail with an error similar to:
> Error: The version '3.1' with architecture 'x64' was not found
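The truncation happens because an unquoted `3.10` in YAML is read as a floating-point number rather than a string. The effect can be reproduced in plain Python, used here purely as a stand-in for the YAML parser (an assumption for illustration; the action runner's own parsing code is not shown):

```python
# An unquoted 3.10 is parsed as a number, so the trailing zero is lost.
unquoted = float("3.10")

# A quoted value stays a string and keeps every digit.
quoted = "3.10"

print(str(unquoted))  # 3.1
print(quoted)         # 3.10
```

This is why `3.8` happened to work unquoted while `3.10` does not.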
Also upgrades the Cloud SDK version to the latest available to attempt to avoid a Python 3.10 compat issue, "module 'collections' has no attribute 'MutableMapping'", tracked here: https://issuetracker.google.com/issues/202172882

This also resolves an error in the GH actions prompting for an upgrade:
> The v0 series of google-github-actions/setup-gcloud is no longer maintained. It will not receive updates, improvements, or security patches.
The `>=` was unintentionally changed to `==` in commit: e6fc52c This reverts that change.
for consistency with pip commands in other files
The issue these attempted to solve appears to be related to GH Action caching and not the Python version; meanwhile, upgrading Python in these actions introduces additional issues that would need to be addressed.

- Revert "Bump Python Version from 3.8 to 3.10 in GitHub Actions" from commit 8b056dc.
- Revert "Specify Python Version as Strings" from commit c4957f5.
- Revert "Bump google-github-actions/setup-gcloud from v0 to v2" from commit 26a5c01.
In Python 3.10, the fuzz target fails if the `FuzzMsg` class is defined within `TestOneInput`, with the exception:

```
TypeError: Couldn't build proto file into descriptor pool: duplicate file name __main____default_package.fuzzmsg.proto
```

This was not an issue in Python 3.8. This change is necessary to remain compatible with the Python version used in the base-builder and base-runner images.
Updates hardcoded references to Python 3.8 installation paths to point to the Python 3.10 equivalents instead.
Updates hardcoded references to Python 3.8 installation paths to point to the Python 3.10 equivalents instead.
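Where a build script can run Python, one way to avoid repeating this kind of path update at the next upgrade is to ask the interpreter for its own paths instead of hardcoding the version. This is a sketch of that alternative, not what these commits do; the variable names are illustrative.

```python
import sys
import sysconfig

# Derive the versioned directory name (e.g. "python3.10") from the
# running interpreter rather than hardcoding it.
version_dir = f"python{sys.version_info.major}.{sys.version_info.minor}"

# Ask the interpreter where pure-Python packages are installed.
site_packages = sysconfig.get_paths()["purelib"]

print(version_dir)
print(site_packages)
```

A shell build script could capture the same values with `python3 -c` and use them in place of literal `python3.8` paths.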
Installs Python dependencies via `pip` instead of `apt-get` to make the build compatible with the Base Image Python 3.10 upgrade.
Replaces the ad-hoc workaround using the Dead Snakes PPA to install Python 3.10, with the upgraded Python 3.10 version provided by the base-builder and base-runner images, the latter of which resolves the issue mentioned in google#11419. Fixes: google#9638
The Python 3.10 upgrade in the Base Images also updated the default `setuptools` to >= 71.0.0, whose new approach to vendoring dependencies is not supported by the current configparser backport fuzzer implementations. This fix installs a version of setuptools that is known to work.
Installs the missing `libzmq3-dev` dependency required to build and install `pyzmq` from a source checkout.
Otherwise, a cached and outdated RUN layer may lead to install failures. Fixes a CI failure caught by `infra/presubmit.py`.
All set, @oliverchang! The
The x86_64 version of that passed. I expect running i386 again will too.
/gcbrun trial_build.py python --sanitizer coverage address --fuzzing-engine libfuzzer
@oliverchang @jonathanmetzman Only 3 projects failed the latest trial build, 2 of which are only coverage failures. So unless I'm mistaken, all 74 failing fuzzer builds for Python projects will be fixed by this PR, and 1 new fuzzer build failure will be introduced.
Next Steps?
Considering the above, would you be comfortable with pinning
Trial Build result
Trial build end time: 2024-11-20 10:19:26.353697
https://github.com/google/oss-fuzz/pull/12027/checks?check_run_id=33239275606
Thanks @DaveLak. Pinning pycrypto's base image SGTM. @DavidKorczynski @AdamKorcz is it obvious to you why the two coverage builds are failing in #12027 (comment)? Once we get these resolved, we would be ready to merge. Thank you once again for all the work you put into this @DaveLak!
Pycrypto is incompatible with Python 3.10. Unfortunately, this is unlikely to change as the project is no longer maintained and its repository has been archived. Pinning the image to remain on Python 3.8 was decided in PR google#12027 to avoid breaking fuzzer builds related to the 3.10 upgrade.
@oliverchang done in a49e0e8. I tested both building and running locally without any failures. I haven't had the chance to look any further into the coverage failures for
Next Steps?
Considering this PR fixes the
Would you be comfortable pinning
@oliverchang Apologies for the back-to-back pings. I looked into the two remaining unaddressed failures, and it seems like they might be legitimate crashes caused by fuzzed data, unrelated to this PR. I successfully built and ran the fuzzers and generated the coverage reports locally. I'm happy to expand on my findings if needed, but I'll hold off unless requested since it may be a bit off-topic for the Python upgrade. From my perspective, I believe this PR should be ready to merge. What do you and your team think?
oliverchang
left a comment
thanks! happy to proceed with this.
Thanks, @oliverchang and everyone who helped land this!🙏
Thank you @DaveLak for helping get this to the finish line!
Thank you very much!!
Now that #12027 has been merged to update the base image to Python 3.10, the work from #12546 of upgrading Python to something above 3.8 for Pillow is no longer needed. cc @hugovk Co-authored-by: Andrew Murray <[email protected]>
Note
I was looking for somewhere to get feedback from maintainers about this approach to the Python 3.10 upgrade before attempting it, but the discussion surrounding a Python upgrade has been rather fragmented across many issues, PRs, and comment chains.
For that reason, I felt it would be easier to propose with a working example and dedicated PR.
Fixes:
Supersedes:
Changes
The changes introduced here upgrade Python from 3.8 to 3.10.14 inside the base-builder and base-runner images.
Base Image Changes
Known Impact on Projects
3.9 Workarounds That Can Be Removed
Anticipated Build Failures
Preexisting Failures
Fix is Prepared
Fix Requires Upstream Changes
- `archinfo` dependency requires >=3.10. Fails after the 3.10 upgrade because the upstream build script needs `python3.9` replaced with `python3`.
Requires More Investigation
- `TypeError: Parser.non_math() takes 2 positional arguments but 4 were given` in `File "fuzz_plt.py", line 43, in TestOneInput`.
- `export LDFLAGS="-fuse-ld=lld"` is set, the error becomes: `ld.lld: error: undefined symbol: __asan_report_store4`.
- `build.sh` is the issue.
- `SystemError: PY_SSIZE_T_CLEAN macro must be defined for '#' formats`. Seems like the issue described here. Pycrypto is deprecated and this is unlikely to be fixed upstream.
Possible Future Improvements
Using the base-builder image in a multi-stage build to copy the pre-compiled Python into base-runner is effective, but feels like a workaround that may be introducing tech debt. A cleaner approach would be to extract the Python compilation into a discrete base image similar to how `base-clang` works, and use that as the multi-stage builder in images that need it.
Fuzz Introspector Caveat
Fuzz Introspector currently uses Python 3.9. While an upgrade to 3.10 is not expected to introduce any new issues, it was not tested on these changes and may require additional work.
Motivation
`numpy`, `scipy`, `pandas`, etc.)