feat: re-implement delvewheel for repairing Windows wheels #3116

Merged: messense merged 15 commits into PyO3:main from messense:delvewheel on Apr 8, 2026
@messense messense commented Apr 6, 2026

No description provided.

Copilot AI left a comment

Pull request overview

Re-implements delvewheel-style repair of Windows wheels: it detects external DLL dependencies, bundles them into the wheel, patches PE import tables, and injects a runtime __init__.py shim for DLL discovery.

Changes:

  • Added a Windows wheel repairer that audits PE dependencies and patches import tables to reference bundled, renamed DLLs.
  • Added PE patching utilities to rewrite PE import tables and clear load flags/signatures as needed.
  • Added deferred “prepend to file” support in VirtualWriter to inject code after Python from __future__ imports.
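
The insertion-point idea behind the last bullet can be sketched as follows. This is a simplified illustration, not the PR's `find_python_insertion_point`: the function name `insertion_point` is hypothetical, and the sketch assumes the file has no module docstring and only single-line `from __future__` imports. The PR's later commits describe handling for docstrings, BOMs, and multi-line imports that this sketch omits.

```rust
/// Simplified sketch: returns the byte offset at which a snippet could be
/// inserted into Python source. Assumes no module docstring and only
/// single-line `from __future__` imports (both are assumptions of this sketch).
fn insertion_point(src: &str) -> usize {
    let mut offset = 0; // byte offset of the start of the current line
    let mut insert_at = 0; // best insertion offset found so far
    let mut in_header = true; // still inside the leading run of `#` lines

    for line in src.split_inclusive('\n') {
        let end = offset + line.len();
        let trimmed = line.trim();
        if trimmed.starts_with("from __future__ import") {
            // Code must not precede a __future__ import, so insert after it.
            insert_at = end;
            in_header = false;
        } else if trimmed.starts_with('#') {
            // Keep a leading shebang / encoding comment above the snippet.
            if in_header {
                insert_at = end;
            }
        } else if trimmed.is_empty() {
            in_header = false;
        } else {
            // First ordinary statement: nothing after this can be a
            // __future__ import in valid Python.
            break;
        }
        offset = end;
    }
    insert_at
}
```

The snippet would then be spliced in at the returned offset, so that a leading shebang and any `from __future__` imports stay above it.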

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/module_writer/virtual_writer.rs | Adds deferred prepend_to support and Python insertion-point logic used for __init__.py patching. |
| src/build_context/repair.rs | Instantiates the Windows repairer and injects the __init__.py DLL-directory patch during repair. |
| src/auditwheel/windows.rs | Introduces WindowsRepairer to find external DLLs, patch imports, and generate the __init__.py patch snippet. |
| src/auditwheel/repair.rs | Extends WheelRepairer with a default init_py_patch hook and updates .libs documentation. |
| src/auditwheel/pe_patch.rs | New PE parser/patcher to rewrite imported DLL names and adjust headers/checksums/signatures. |
| src/auditwheel/mod.rs | Wires in the Windows repairer + PE patch module behind the auditwheel feature. |

Comment thread src/module_writer/virtual_writer.rs Outdated
Comment thread src/module_writer/virtual_writer.rs Outdated
Comment thread src/build_context/repair.rs
Comment thread src/auditwheel/pe_patch.rs Outdated
messense and others added 6 commits April 7, 2026 19:35
Implement pe_patch.rs with pure-byte PE manipulation:
- Parse PE32/PE32+ layout, sections, and import tables
- Replace imported DLL name RVAs (both regular and delay-load)
- Two strategies: reuse section padding (Next Fit) or append new section
- Clear DependentLoadFlags in Load Config Directory
- Remove Authenticode signatures and fix PE checksum
- Unit tests with synthetic PE32+ binaries

This is the low-level PE surgery needed by the Windows wheel
repairer (delvewheel equivalent), matching delvewheel's pefile-based
approach.
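
For readers unfamiliar with the "fix PE checksum" step, here is a minimal sketch of the commonly documented PE checksum algorithm: a 16-bit word sum with carry folding, skipping the stored checksum field, plus the file length. The function name `pe_checksum` and its parameters are hypothetical and not taken from pe_patch.rs; the sketch assumes `checksum_offset` is even.

```rust
/// Sketch of the commonly documented PE checksum algorithm.
/// `checksum_offset` is the file offset of the 4-byte CheckSum field in the
/// optional header; it is assumed to be even (an assumption of this sketch).
fn pe_checksum(data: &[u8], checksum_offset: usize) -> u32 {
    let mut sum: u32 = 0;
    for (i, chunk) in data.chunks(2).enumerate() {
        let pos = i * 2;
        // Skip the two 16-bit words that hold the stored checksum itself.
        if pos == checksum_offset || pos == checksum_offset + 2 {
            continue;
        }
        let lo = chunk[0] as u32;
        // A trailing odd byte is treated as if padded with zero.
        let hi = chunk.get(1).copied().unwrap_or(0) as u32;
        sum += lo | (hi << 8);
        // Fold the carry back into the low 16 bits.
        sum = (sum & 0xFFFF) + (sum >> 16);
    }
    sum = (sum & 0xFFFF) + (sum >> 16);
    sum.wrapping_add(data.len() as u32)
}
```

A patcher would recompute this value after rewriting import names or adding a section and store it back into the CheckSum field.
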
Implement WindowsRepairer with WheelRepairer trait:
- System DLL filtering with layered detection (API sets, Python DLLs,
  VC runtime, path-based Windows dir check, curated name list)
- find_external_libs() using lddtree for PE dependency tree walking
- patch() rewrites PE import tables via pe_patch for both artifacts
  and cross-references between grafted DLLs
- init_py_patch() generates os.add_dll_directory() snippet for
  __init__.py runtime DLL discovery

Also extends WheelRepairer trait with init_py_patch() method
(default None, overridden by Windows).
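
Two of the filtering layers named above (API sets and a curated name list) might look roughly like this. It is a sketch only: `SAMPLE_SYSTEM_DLLS` is a hypothetical three-entry stand-in for the PR's much longer `KNOWN_SYSTEM_DLLS`, and the case-insensitive comparison is an assumption. The `api-ms-win-` and `ext-ms-win-` prefixes are the standard Windows API set naming scheme.

```rust
/// Hypothetical, deliberately tiny sample; the PR's list is far longer.
const SAMPLE_SYSTEM_DLLS: &[&str] = &["kernel32.dll", "user32.dll", "advapi32.dll"];

/// Windows API set names are virtual DLLs resolved by the loader,
/// so they should never be bundled into a wheel.
fn is_api_set(name: &str) -> bool {
    let lower = name.to_ascii_lowercase();
    lower.starts_with("api-ms-win-") || lower.starts_with("ext-ms-win-")
}

/// Layered check: API sets first, then the curated name list.
/// DLL names are compared case-insensitively (an assumption of this sketch).
fn is_system_dll(name: &str) -> bool {
    let lower = name.to_ascii_lowercase();
    is_api_set(&lower) || SAMPLE_SYSTEM_DLLS.contains(&lower.as_str())
}
```

Anything this kind of filter rejects would be treated as an external dependency and become a candidate for grafting into the wheel.
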
- Add WindowsRepairer to make_repairer() for Windows targets
- Add VirtualWriter::prepend_to() for injecting code into tracked files
- After grafting DLLs, call init_py_patch() and prepend the
  os.add_dll_directory() snippet to __init__.py
- Feature-gated behind 'auditwheel' feature like macOS repairer
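
To make the `os.add_dll_directory()` snippet concrete, here is a hedged sketch of a generator for such a patch. The function body, the Python helper name `_add_bundled_dll_dir`, and the layout (a libs directory next to the package directory) are all assumptions for illustration; the snippet actually emitted by the PR may differ.

```rust
/// Sketch: builds a Python snippet that registers a bundled-DLL directory.
/// `libs_dir` is the directory name relative to the package's parent; both
/// the layout and the generated helper name are assumptions of this sketch.
fn init_py_patch(libs_dir: &str) -> String {
    [
        "def _add_bundled_dll_dir():".to_string(),
        "    import os".to_string(),
        "    import sys".to_string(),
        // os.add_dll_directory exists on Windows from Python 3.8 onwards.
        "    if sys.platform == \"win32\" and sys.version_info >= (3, 8):".to_string(),
        "        here = os.path.dirname(os.path.abspath(__file__))".to_string(),
        // {:?} quotes the name; fine for simple ASCII directory names.
        format!("        libs = os.path.join(here, os.pardir, {:?})", libs_dir),
        "        if os.path.isdir(libs):".to_string(),
        "            os.add_dll_directory(os.path.abspath(libs))".to_string(),
        "_add_bundled_dll_dir()".to_string(),
        "del _add_bundled_dll_dir".to_string(),
        String::new(),
    ]
    .join("\n")
}
```

Wrapping the logic in a function and deleting it afterwards keeps the patch from leaking names into the package namespace.
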
- Fix #1 (critical): prepend_to now defers patches to finish_internal(),
  avoiding "Generated file was already added" crash when __init__.py
  patching happens before source files are collected

- Fix #2 (high): Python patches are now inserted after 'from __future__'
  imports via find_python_insertion_point() to avoid SyntaxError

- Fix #3 (high): add_new_section_with_names now errors on non-Authenticode
  overlays instead of silently truncating them

- Fix #4 (medium): When PE section headers are full, shift sections down
  by FileAlignment bytes (matching delvewheel) instead of bailing

- Fix #5 (medium): is_python_dll now precisely matches CPython
  (python[0-9]+t?(_d)?.dll) and PyPy (libpypy*-c.dll) patterns
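
The two patterns in Fix #5 can be matched without a regex crate. The following is a minimal sketch, assuming case-insensitive matching and treating `libpypy*-c.dll` as a simple prefix/suffix glob; it is not the PR's actual `is_python_dll`.

```rust
/// Sketch of the patterns described in Fix #5:
/// CPython `python[0-9]+t?(_d)?.dll` and PyPy `libpypy*-c.dll`.
/// Case-insensitive matching is an assumption of this sketch.
fn is_python_dll(name: &str) -> bool {
    let lower = name.to_ascii_lowercase();
    let stem = match lower.strip_suffix(".dll") {
        Some(s) => s,
        None => return false,
    };
    if let Some(rest) = stem.strip_prefix("python") {
        // Require at least one version digit directly after "python".
        let digits = rest.chars().take_while(|c| c.is_ascii_digit()).count();
        if digits == 0 {
            return false;
        }
        // Optional free-threaded "t" suffix, then optional debug "_d" suffix.
        return matches!(&rest[digits..], "" | "t" | "_d" | "t_d");
    }
    stem.starts_with("libpypy") && stem.ends_with("-c")
}
```

Requiring digits right after `python` is what keeps unrelated names such as `pythoncom312.dll` from being classified as the interpreter DLL.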

- Expand KNOWN_SYSTEM_DLLS from ~60 to ~90 entries covering crypto,
  networking, UI, and storage subsystems commonly linked by native
  extensions
- Fix rva_to_offset to handle BSS sections where virtual_size > raw_data_size
- Fix remove_authenticode to use file offsets correctly with saturating arithmetic
- Remove non-system DLLs (ole32, winmm, avrt, ucrtbase) from KNOWN_SYSTEM_DLLS
- Skip __init__.py patch for root-level artifacts with no package directory
- Add warning when prepend_to targets an untracked file
- Handle multi-line from __future__ imports (parenthesized and backslash-continued)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
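
One plausible reading of the `rva_to_offset` fix is sketched below: an RVA that falls inside a section's virtual range but beyond its raw data has no backing bytes in the file, so the lookup returns `None` instead of a bogus offset. The `SectionRange` struct and this interpretation are assumptions, not code from pe_patch.rs.

```rust
/// Hypothetical minimal view of a PE section header.
struct SectionRange {
    virtual_address: u32,
    virtual_size: u32,
    raw_offset: u32,
    raw_size: u32,
}

/// Sketch: maps an RVA to a file offset, returning None for addresses in
/// the zero-filled tail of a section (virtual_size > raw_size).
fn rva_to_offset(sections: &[SectionRange], rva: u32) -> Option<u32> {
    for s in sections {
        let span = s.virtual_size.max(s.raw_size);
        if rva >= s.virtual_address && rva - s.virtual_address < span {
            let delta = rva - s.virtual_address;
            if delta >= s.raw_size {
                // Inside the section in memory, but not present on disk.
                return None;
            }
            return s.raw_offset.checked_add(delta);
        }
    }
    None
}
```
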
- Skip libffi*.dll when repairing Windows wheels for PyPy, since PyPy
  ships libffi-8.dll as part of its distribution
- Gate the exclusion to only apply when building for PyPy interpreters
- Fix clippy iter_skip_next warnings in virtual_writer.rs
- Skip prepend for untracked targets instead of creating new files,
  avoiding unexpected __init__.py in namespace packages or bin wheels
- Fix misleading 'AST-based' doc comment on find_python_insertion_point
- Gate __init__.py DLL patch behind !is_bin() for bin bridge wheels
- Use checked indexing in PE parsing helpers to return errors instead
  of panicking on malformed/truncated PE files

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

Comment thread src/module_writer/virtual_writer.rs Outdated
Comment thread src/module_writer/virtual_writer.rs Outdated
Comment thread src/module_writer/virtual_writer.rs Outdated
Comment thread src/auditwheel/windows.rs Outdated
Calling .next() twice on the reversed iterator could falsely detect a
trailing backslash because it checked the second-to-last non-whitespace
byte. Use .find() instead for a single, correct check.
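
A minimal sketch of the `.find()`-based check follows. The function name `ends_with_backslash` is hypothetical; only the technique (a single `find` over the reversed bytes) comes from the commit message.

```rust
/// Sketch: true when the last non-whitespace byte of `line` is a backslash,
/// i.e. the line is continued onto the next one.
fn ends_with_backslash(line: &[u8]) -> bool {
    // `find` consumes the reversed iterator once and stops at the first
    // non-whitespace byte, so exactly one byte is inspected.
    line.iter().rev().find(|b| !b.is_ascii_whitespace()) == Some(&b'\\')
}
```

For example, a line such as `from __future__ import a, \` is detected as continued, while a backslash in the middle of a line is not.
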

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Comment thread src/auditwheel/windows.rs
Comment thread src/auditwheel/pe_patch.rs Outdated
Comment thread src/auditwheel/pe_patch.rs Outdated
Comment thread src/auditwheel/pe_patch.rs
Comment thread src/module_writer/virtual_writer.rs
messense added 2 commits April 7, 2026 20:49
…PE patch

- Use data.get(offset..) in read_cstring to avoid panic on malformed PE
- Use saturating_sub for header space calculation to prevent underflow
- Narrow WINDIR system DLL detection to only System32/SysWOW64/WinSxS/SysArm32
- Improve find_python_insertion_point to preserve BOM, shebang, encoding, docstring
- Add same-name DLL collision detection with actionable error message
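
The first item can be illustrated with a panic-free C-string reader. This is a sketch under assumptions: the signature and the choice to return `Option<&str>` are hypothetical, and the PR's `read_cstring` may return a different type or error.

```rust
/// Sketch: reads a NUL-terminated string at `offset` without panicking.
/// Returns None if the offset is out of bounds, the terminator is missing,
/// or the bytes are not valid UTF-8.
fn read_cstring(data: &[u8], offset: usize) -> Option<&str> {
    // `get(offset..)` yields None instead of panicking when offset > len.
    let tail = data.get(offset..)?;
    let end = tail.iter().position(|&b| b == 0)?;
    std::str::from_utf8(&tail[..end]).ok()
}
```

The same defensive style applies to the header-space calculation: `a.saturating_sub(b)` clamps at zero where a plain `a - b` on unsigned integers would underflow.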

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Comment thread src/module_writer/virtual_writer.rs Outdated
messense added 2 commits April 7, 2026 21:44
The PEP 263 encoding declaration check was matching non-comment lines
like 'x = "coding"' because it only looked for 'coding' + ':' or '='
anywhere in the line. Now it correctly requires the line to start with
'#' before checking for the encoding pattern.
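
The corrected check could be sketched like this. The function name `is_encoding_cookie` is hypothetical, and the sketch is looser than the full PEP 263 pattern, which also requires an encoding name after the `:` or `=`.

```rust
/// Sketch: true when `line` looks like a PEP 263 encoding declaration.
/// The line must be a comment, and "coding" must be directly followed by
/// ':' or '='. The encoding name itself is not validated here.
fn is_encoding_cookie(line: &str) -> bool {
    let trimmed = line.trim_start();
    if !trimmed.starts_with('#') {
        // Ordinary code such as `x = "coding: utf-8"` is never a cookie.
        return false;
    }
    trimmed.match_indices("coding").any(|(pos, m)| {
        let rest = &trimmed[pos + m.len()..];
        rest.starts_with(':') || rest.starts_with('=')
    })
}
```

Checking for the leading `#` first is the part that addresses the false positive described above.
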
Replace manual PE header parsing with goblin::pe::PE::parse():
- Use goblin's SectionTable instead of custom SectionInfo
- Use goblin's constants (IMAGE_SCN_*, SIZEOF_*, etc.)
- Use goblin's header structs for alignment/field access
- Bundle SectionTable + header_offset into SectionEntry wrapper
- Extract shared parse_import_descriptors() helper for regular
  and delay-load imports
- Keep manual byte-level code only for write operations (patching,
  checksum, section addition) since goblin is read-only
@messense messense requested a review from Copilot April 7, 2026 23:57

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Comment thread src/auditwheel/windows.rs
Comment thread src/module_writer/virtual_writer.rs Outdated
Comment thread src/module_writer/virtual_writer.rs
Comment thread src/build_context/repair.rs Outdated
Comment thread src/auditwheel/pe_patch.rs
- virtual_writer: scan up to two lines for PEP 263 encoding cookie,
  not just the line after shebang
- virtual_writer: detect u/U-prefixed triple-quoted docstrings
  alongside existing r/R prefix handling
- repair: pass interpreter to make_repairer in editable and grafting
  paths so Windows+PyPy correctly excludes libffi
- pe_patch: clarify doc that checksum is only updated when non-zero
messense added 2 commits April 8, 2026 21:32
…slots

- Recalculate SizeOfInitializedData as sum of initialized data sections
  after modifying virtual sizes (matches delvewheel behavior)
- Update SizeOfImage based on new virtual sizes
- Add sys.version_info check to __init__.py patch for clarity
- Document Conda CPython 3.8-3.9 limitation in comments
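
The SizeOfImage update can be sketched as rounding the end of the highest section up to SectionAlignment, which is how the field is defined for PE images. The `SectionExtent` struct and function names are hypothetical; the sketch assumes a non-empty section list and a power-of-two alignment, and it ignores the headers-only case.

```rust
/// Hypothetical minimal view of a section's in-memory placement.
struct SectionExtent {
    virtual_address: u32,
    virtual_size: u32,
}

/// Rounds `value` up to a multiple of `alignment`
/// (assumed to be a non-zero power of two).
fn align_up(value: u32, alignment: u32) -> u32 {
    (value + alignment - 1) & !(alignment - 1)
}

/// Sketch: SizeOfImage as the aligned end of the highest section.
/// Returns 0 for an empty list, which a real patcher would not accept.
fn size_of_image(sections: &[SectionExtent], section_alignment: u32) -> u32 {
    sections
        .iter()
        .map(|s| s.virtual_address + s.virtual_size)
        .max()
        .map_or(0, |end| align_up(end, section_alignment))
}
```
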
@messense messense marked this pull request as ready for review April 8, 2026 23:47
@messense messense merged commit 70ea112 into PyO3:main Apr 8, 2026
45 checks passed
@messense messense deleted the delvewheel branch April 8, 2026 23:47
@messense messense linked an issue Apr 8, 2026 that may be closed by this pull request
2 tasks
Development

Successfully merging this pull request may close these issues.

Audit and repair Windows wheels support

2 participants