feat: re-implement delvewheel for repairing Windows wheels #3116
Merged
Pull request overview
Re-implements Windows wheel repairing (delvewheel-like) by detecting external DLL dependencies, bundling them into the wheel, patching PE import tables, and injecting a runtime __init__.py shim for DLL discovery.
Changes:
- Added a Windows wheel repairer that audits PE dependencies and patches import tables to reference bundled, renamed DLLs.
- Added PE patching utilities to rewrite PE import tables and clear load flags/signatures as needed.
- Added deferred “prepend to file” support in VirtualWriter to inject code after Python from __future__ imports.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/module_writer/virtual_writer.rs | Adds deferred prepend_to support and Python insertion-point logic used for __init__.py patching. |
| src/build_context/repair.rs | Instantiates the Windows repairer and injects the __init__.py DLL-directory patch during repair. |
| src/auditwheel/windows.rs | Introduces WindowsRepairer to find external DLLs, patch imports, and generate the __init__.py patch snippet. |
| src/auditwheel/repair.rs | Extends WheelRepairer with a default init_py_patch hook and updates .libs documentation. |
| src/auditwheel/pe_patch.rs | New PE parser/patcher to rewrite imported DLL names and adjust headers/checksums/signatures. |
| src/auditwheel/mod.rs | Wires in the Windows repairer + PE patch module behind the auditwheel feature. |
Implement pe_patch.rs with pure-byte PE manipulation:
- Parse PE32/PE32+ layout, sections, and import tables
- Replace imported DLL name RVAs (both regular and delay-load)
- Two strategies: reuse section padding (Next Fit) or append new section
- Clear DependentLoadFlags in Load Config Directory
- Remove Authenticode signatures and fix PE checksum
- Unit tests with synthetic PE32+ binaries

This is the low-level PE surgery needed by the Windows wheel repairer (delvewheel equivalent), matching delvewheel's pefile-based approach.
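The checksum fix-up listed above can be illustrated in Python. This is a minimal sketch of the commonly documented PE checksum algorithm (a 16-bit end-around-carry sum with the CheckSum field treated as zero, plus the file length). It is not taken from pe_patch.rs; `pe_checksum` and `checksum_offset` are hypothetical names, and the caller is assumed to have located the CheckSum field in the optional header already.

```python
import struct


def pe_checksum(data: bytes, checksum_offset: int) -> int:
    """Compute a PE-style checksum over `data`.

    `checksum_offset` is the file offset of the 4-byte CheckSum field
    (assumed to be known by the caller).
    """
    buf = bytearray(data)
    # The CheckSum field itself is treated as zero while summing.
    buf[checksum_offset:checksum_offset + 4] = b"\x00\x00\x00\x00"
    # Pad to an even length so the buffer splits into 16-bit words.
    if len(buf) % 2:
        buf.append(0)

    total = 0
    for (word,) in struct.iter_unpack("<H", buf):
        total += word
        # Fold the carry back into the low 16 bits (end-around carry).
        total = (total & 0xFFFF) + (total >> 16)

    # The final value is the folded sum plus the original file length.
    return (total + len(data)) & 0xFFFFFFFF
```

A patcher would typically recompute this value only after every other modification (renamed imports, removed signature, added section) has been written, since any later byte change invalidates it.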
Implement WindowsRepairer with WheelRepairer trait:
- System DLL filtering with layered detection (API sets, Python DLLs, VC runtime, path-based Windows dir check, curated name list)
- find_external_libs() using lddtree for PE dependency tree walking
- patch() rewrites PE import tables via pe_patch for both artifacts and cross-references between grafted DLLs
- init_py_patch() generates os.add_dll_directory() snippet for __init__.py runtime DLL discovery

Also extends WheelRepairer trait with init_py_patch() method (default None, overridden by Windows).
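The layered system-DLL filtering described above might look roughly like the following Python sketch. It is an illustration only: the function names, the `api-ms-win-`/`ext-ms-` prefixes, and the three-entry curated list are assumptions, not the contents of windows.rs. The directory names are the ones a later commit in this PR narrows the check to (System32, SysWOW64, WinSxS, SysArm32).

```python
import ntpath
from typing import Optional

# Illustrative stand-in for the curated name list (the real list is much longer).
KNOWN_SYSTEM_DLLS = frozenset({"kernel32.dll", "user32.dll", "advapi32.dll"})
SYSTEM_SUBDIRS = ("system32", "syswow64", "winsxs", "sysarm32")


def is_api_set(name: str) -> bool:
    """API set names are virtual DLLs resolved by the loader, never bundled."""
    return name.lower().startswith(("api-ms-win-", "ext-ms-"))


def is_in_system_dir(path: str, windir: str = r"C:\Windows") -> bool:
    """True if `path` lies inside one of the Windows system subdirectories."""
    norm = ntpath.normcase(ntpath.normpath(path))
    root = ntpath.normcase(ntpath.normpath(windir))
    return any(
        norm.startswith(ntpath.join(root, sub) + "\\") for sub in SYSTEM_SUBDIRS
    )


def is_system_dll(name: str, resolved_path: Optional[str] = None) -> bool:
    """Apply the cheap name-based checks first, then the path-based one."""
    if is_api_set(name) or name.lower() in KNOWN_SYSTEM_DLLS:
        return True
    return resolved_path is not None and is_in_system_dir(resolved_path)
```

Ordering the checks from cheapest to most specific keeps the common cases (API sets, well-known names) from depending on where a DLL happened to be resolved on the build machine.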
- Add WindowsRepairer to make_repairer() for Windows targets
- Add VirtualWriter::prepend_to() for injecting code into tracked files
- After grafting DLLs, call init_py_patch() and prepend the os.add_dll_directory() snippet to __init__.py
- Feature-gated behind 'auditwheel' feature like macOS repairer

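For readers unfamiliar with the runtime side, the kind of snippet that gets prepended to __init__.py could resemble the sketch below. The helper name `add_bundled_dll_dir`, the `<name>.libs` directory sitting next to the package, and the exact guards are assumptions for illustration; the snippet actually generated by init_py_patch() may differ.

```python
import os
import sys


def add_bundled_dll_dir(init_file: str, libs_dir_name: str):
    """Register the wheel's bundled DLL directory with the Windows loader.

    `init_file` is the path of the package's __init__.py and `libs_dir_name`
    is the hypothetical name of the sibling directory holding grafted DLLs.
    Returns the handle from os.add_dll_directory(), or None if nothing was done.
    """
    # os.add_dll_directory() only exists on Windows with Python 3.8+.
    if sys.platform != "win32" or sys.version_info < (3, 8):
        return None
    package_dir = os.path.dirname(os.path.abspath(init_file))
    libs_dir = os.path.join(os.path.dirname(package_dir), libs_dir_name)
    if not os.path.isdir(libs_dir):
        return None
    return os.add_dll_directory(libs_dir)
```

In a real `__init__.py` the call would be something like `add_bundled_dll_dir(__file__, "mypkg.libs")`, executed before the first import of the native extension module.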
- Fix #1 (critical): prepend_to now defers patches to finish_internal(), avoiding "Generated file was already added" crash when __init__.py patching happens before source files are collected
- Fix #2 (high): Python patches are now inserted after 'from __future__' imports via find_python_insertion_point() to avoid SyntaxError
- Fix #3 (high): add_new_section_with_names now errors on non-Authenticode overlays instead of silently truncating them
- Fix #4 (medium): When PE section headers are full, shift sections down by FileAlignment bytes (matching delvewheel) instead of bailing
- Fix #5 (medium): is_python_dll now precisely matches CPython (python[0-9]+t?(_d)?.dll) and PyPy (libpypy*-c.dll) patterns
- Expand KNOWN_SYSTEM_DLLS from ~60 to ~90 entries covering crypto, networking, UI, and storage subsystems commonly linked by native extensions

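The two name patterns quoted in Fix #5 translate directly into regular expressions. The sketch below is a Python rendering of those patterns for illustration; case-insensitive matching and anchoring at both ends are assumptions about how the Rust is_python_dll behaves.

```python
import re

# CPython: python<digits>, optional free-threaded "t", optional debug "_d".
_CPYTHON_DLL = re.compile(r"^python[0-9]+t?(_d)?\.dll$", re.IGNORECASE)
# PyPy: libpypy<anything>-c.dll
_PYPY_DLL = re.compile(r"^libpypy.*-c\.dll$", re.IGNORECASE)


def is_python_dll(name: str) -> bool:
    """True if `name` looks like an interpreter DLL that must not be bundled."""
    return bool(_CPYTHON_DLL.match(name) or _PYPY_DLL.match(name))
```

Anchoring matters here: a loose prefix check on `python` would also swallow unrelated libraries such as `pythoncom312.dll`, which do need to be treated as ordinary dependencies.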
- Fix rva_to_offset to handle BSS sections where virtual_size > raw_data_size
- Fix remove_authenticode to use file offsets correctly with saturating arithmetic
- Remove non-system DLLs (ole32, winmm, avrt, ucrtbase) from KNOWN_SYSTEM_DLLS
- Skip __init__.py patch for root-level artifacts with no package directory
- Add warning when prepend_to targets an untracked file
- Handle multi-line from __future__ imports (parenthesized and backslash-continued)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
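The rva_to_offset issue is easier to see with a concrete model. The Python sketch below shows one plausible reading of the fix: a section's address range is taken from the larger of its virtual and raw sizes, and an RVA that falls in the zero-filled tail (beyond the raw data) has no file offset. The `Section` tuple and the return-None convention are assumptions, not the PR's actual types.

```python
from typing import NamedTuple, Optional, Sequence


class Section(NamedTuple):
    virtual_address: int  # RVA where the section is mapped
    virtual_size: int     # size in memory
    raw_offset: int       # PointerToRawData in the file
    raw_size: int         # SizeOfRawData in the file


def rva_to_offset(sections: Sequence[Section], rva: int) -> Optional[int]:
    """Translate an RVA to a file offset, or None if it has no backing bytes."""
    for s in sections:
        span = max(s.virtual_size, s.raw_size)
        if s.virtual_address <= rva < s.virtual_address + span:
            delta = rva - s.virtual_address
            # Bytes past raw_size exist only in memory (BSS-style zero fill).
            if delta >= s.raw_size:
                return None
            return s.raw_offset + delta
    return None
```

Returning None rather than an out-of-range offset lets callers turn a malformed or unexpected layout into an error instead of reading or patching the wrong bytes.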
- Skip libffi*.dll when repairing Windows wheels for PyPy, since PyPy ships libffi-8.dll as part of its distribution
- Gate the exclusion to only apply when building for PyPy interpreters
- Fix clippy iter_skip_next warnings in virtual_writer.rs

- Skip prepend for untracked targets instead of creating new files, avoiding unexpected __init__.py in namespace packages or bin wheels
- Fix misleading 'AST-based' doc comment on find_python_insertion_point
- Gate __init__.py DLL patch behind !is_bin() for bin bridge wheels
- Use checked indexing in PE parsing helpers to return errors instead of panicking on malformed/truncated PE files

The double .next() on the reversed iterator could false-detect a trailing backslash by checking the second-to-last non-whitespace byte. Use .find() instead for a single correct check.
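The intended check is small enough to show in full. This Python sketch mirrors the idea of looking only at the last non-whitespace byte; `ends_with_continuation` is a hypothetical name, and the Rust code uses an iterator with .find() rather than a strip.

```python
def ends_with_continuation(line: bytes) -> bool:
    """True if the last non-whitespace byte of `line` is a backslash."""
    return line.rstrip(b" \t\r\n").endswith(b"\\")
```

For example, `b"from __future__ import annotations, \\\n"` is reported as continued, while a line such as `b"s = '\\\\'\n"`, where the backslash is only the second-to-last non-whitespace byte, is not.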
…PE patch
- Use data.get(offset..) in read_cstring to avoid panic on malformed PE
- Use saturating_sub for header space calculation to prevent underflow

- Narrow WINDIR system DLL detection to only System32/SysWOW64/WinSxS/SysArm32
- Improve find_python_insertion_point to preserve BOM, shebang, encoding, docstring
- Add same-name DLL collision detection with actionable error message

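To make the insertion-point rule concrete, here is a Python sketch that places a patch after the module docstring and any `from __future__` imports, and otherwise after leading shebang or encoding comment lines. Note the swapped-in technique: it uses Python's `ast` module for brevity, whereas the PR's find_python_insertion_point is a hand-written scanner (an earlier commit explicitly removes an 'AST-based' claim from its doc comment). The function names are hypothetical, and BOM handling is not covered.

```python
import ast


def find_insertion_line(source: str) -> int:
    """Return how many leading lines must stay above the injected patch."""
    body = ast.parse(source).body
    line = 0
    start = 0
    # A leading string expression is the module docstring.
    if (
        body
        and isinstance(body[0], ast.Expr)
        and isinstance(body[0].value, ast.Constant)
        and isinstance(body[0].value.value, str)
    ):
        line = body[0].end_lineno
        start = 1
    # __future__ imports must precede all other statements.
    for node in body[start:]:
        if isinstance(node, ast.ImportFrom) and node.module == "__future__":
            line = node.end_lineno  # covers multi-line imports
        else:
            break
    if line == 0:
        # Keep a shebang and/or encoding comment in the first two lines.
        for raw in source.splitlines()[:2]:
            if raw.lstrip().startswith("#"):
                line += 1
            else:
                break
    return line


def inject_patch(source: str, patch: str) -> str:
    """Insert `patch` (which should end with a newline) at the safe position."""
    lines = source.splitlines(keepends=True)
    n = find_insertion_line(source)
    head = "".join(lines[:n])
    if head and not head.endswith(("\n", "\r")):
        head += "\n"
    return head + patch + "".join(lines[n:])
```

Placing the patch any earlier than this would either turn the docstring into an ordinary expression or raise a SyntaxError because a `from __future__` import would no longer be at the top of the module.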
The PEP 263 encoding declaration check was matching non-comment lines like 'x = "coding"' because it only looked for 'coding' + ':' or '=' anywhere in the line. Now it correctly requires the line to start with '#' before checking for the encoding pattern.
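PEP 263 itself specifies the regular expression an encoding declaration must match, and it already requires the `#`. The Python sketch below applies that expression to a single line; `encoding_cookie` is a hypothetical helper name, and the Rust implementation in this PR may express the same check without a regex.

```python
import re
from typing import Optional

# Regular expression given in PEP 263 for the encoding declaration.
_CODING_RE = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def encoding_cookie(line: str) -> Optional[str]:
    """Return the declared encoding if `line` is a PEP 263 cookie, else None."""
    match = _CODING_RE.match(line)
    return match.group(1) if match else None
```

With this check, `# -*- coding: utf-8 -*-` is recognized while an ordinary statement such as `x = "coding: utf-8"` is not, because the pattern only allows whitespace before the `#`.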
Replace manual PE header parsing with goblin::pe::PE::parse():
- Use goblin's SectionTable instead of custom SectionInfo
- Use goblin's constants (IMAGE_SCN_*, SIZEOF_*, etc.)
- Use goblin's header structs for alignment/field access
- Bundle SectionTable + header_offset into SectionEntry wrapper
- Extract shared parse_import_descriptors() helper for regular and delay-load imports
- Keep manual byte-level code only for write operations (patching, checksum, section addition) since goblin is read-only

- virtual_writer: scan up to two lines for PEP 263 encoding cookie, not just the line after shebang
- virtual_writer: detect u/U-prefixed triple-quoted docstrings alongside existing r/R prefix handling
- repair: pass interpreter to make_repairer in editable and grafting paths so Windows+PyPy correctly excludes libffi
- pe_patch: clarify doc that checksum is only updated when non-zero

…slots
- Recalculate SizeOfInitializedData as sum of initialized data sections after modifying virtual sizes (matches delvewheel behavior)
- Update SizeOfImage based on new virtual sizes
- Add sys.version_info check to __init__.py patch for clarity
- Document Conda CPython 3.8-3.9 limitation in comments