Skip to content

fix: copy bin artifacts before auditwheel repair to avoid rerun failures#2969

Merged
messense merged 3 commits into
PyO3:mainfrom
messense:fix-bin-wheel-rerun-external-libs
Feb 8, 2026
Merged

fix: copy bin artifacts before auditwheel repair to avoid rerun failures#2969
messense merged 3 commits into
PyO3:mainfrom
messense:fix-bin-wheel-rerun-external-libs

Conversation

@messense

@messense messense commented Feb 8, 2026

Copy link
Copy Markdown
Member

build_bin_wheel was passing the original cargo build artifact to add_external_libs, which modifies DT_NEEDED entries in-place via patchelf (e.g. libFoo.so -> libFoo-abcd1234.so). On subsequent builds where cargo doesn't recompile, the binary still references the hashed library name which lddtree cannot locate on the system.

Apply the same copy-before-patch pattern already used by compile_cdylib: copy the artifact to target/maturin/ so the original cargo output is never modified.

Fixes #2680

build_bin_wheel was passing the original cargo build artifact to
add_external_libs, which modifies DT_NEEDED entries in-place via
patchelf (e.g. libFoo.so -> libFoo-abcd1234.so). On subsequent builds
where cargo doesn't recompile, the binary still references the hashed
library name which lddtree cannot locate on the system.

Apply the same copy-before-patch pattern already used by
compile_cdylib: copy the artifact to target/maturin/ so the original
cargo output is never modified.

Fixes PyO3#2680
@messense messense requested a review from Copilot February 8, 2026 10:15
Deduplicate the copy-before-patch logic shared by compile_cdylib and
build_bin_wheel into a single method, so future callers get the
protection automatically.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR prevents rerun failures on Linux wheel builds by ensuring auditwheel/patchelf modifications are applied to a copied/staged binary artifact rather than the original Cargo build output.

Changes:

  • Refactors artifact staging into a new copy_artifact_for_repair helper used by compile_cdylib.
  • Ensures binary artifacts built in build_bin_wheel are copied to the staging directory before any in-place repair steps can occur.

Comment thread src/build_context.rs
Move the editable/mode guard into copy_artifact_for_repair itself so
the copy is only performed when the mode is Repair (the only mode that
modifies artifacts in-place). This avoids unnecessary work in Check
and Skip modes and simplifies the call sites.
@messense messense merged commit 151b5fa into PyO3:main Feb 8, 2026
79 of 84 checks passed
@messense messense deleted the fix-bin-wheel-rerun-external-libs branch February 8, 2026 11:16
pratyush618 added a commit to pratyush618/maturin that referenced this pull request Apr 25, 2026
Reverts 6c862c4, 4069caf, and e9f7c48.

Hard-linking aliases inodes between the staged artifact and the cargo
output path. patchelf, patch_macho, and pe_patch all mutate the staged
file in place (truncate-and-write), so the cargo output path sees the
patched bytes too. This re-introduces the bug fixed by PyO3#2969 / PyO3#2680.

Reverting cleanly so the follow-up commit can implement the alternative
proposed by @messense: defer the copy-back and only restore the cargo
output path via rename when no patching was needed.
pratyush618 added a commit to pratyush618/maturin that referenced this pull request Apr 25, 2026
Reverts 6c862c4, 4069caf, and e9f7c48.

Hard-linking aliases inodes between the staged artifact and the cargo
output path. patchelf, patch_macho, and pe_patch all mutate the staged
file in place (truncate-and-write), so the cargo output path sees the
patched bytes too. This re-introduces the bug fixed by PyO3#2969 / PyO3#2680.

Reverting cleanly so the follow-up commit can implement the alternative
proposed by @messense: defer the copy-back and only restore the cargo
output path via rename when no patching was needed.
messense pushed a commit to pratyush618/maturin that referenced this pull request Apr 26, 2026
Reverts 6c862c4, 4069caf, and e9f7c48.

Hard-linking aliases inodes between the staged artifact and the cargo
output path. patchelf, patch_macho, and pe_patch all mutate the staged
file in place (truncate-and-write), so the cargo output path sees the
patched bytes too. This re-introduces the bug fixed by PyO3#2969 / PyO3#2680.

Reverting cleanly so the follow-up commit can implement the alternative
proposed by @messense: defer the copy-back and only restore the cargo
output path via rename when no patching was needed.
messense added a commit to messense/maturin that referenced this pull request Apr 26, 2026
Tightens the per-platform `patch` impls so artifacts whose `external_libs`
is empty are left alone, and loosens the default
`WheelRepairer::patch_required` to `!aa.external_libs.is_empty()` for
`Repair`. Together this saves one reflink in `copy_back_cargo_outputs`
plus one `patchelf::set_rpath` / `patch_macho` + `ad_hoc_sign` /
`pe_patch::replace_needed` invocation per "audited but actually
unaffected" artifact in mixed multi-bin wheels.

Before this change every artifact in the audited slice was rewritten:
- ELF: the final RPATH-set loop ran unconditionally on every artifact.
- Mach-O: `install_name_changes` was built from the full grafted name
  map for every artifact, then `patch_macho` + `ad_hoc_sign` were
  invoked even when the artifact's load commands didn't reference any
  grafted install name.
- PE: `pe_patch::replace_needed` was invoked with the full grafted
  replacement list for every artifact, regardless of whether its
  import table actually referenced any of those names.

The new behaviour, scoped per-artifact:
- ELF: skip the RPATH-set when `external_libs.is_empty()`. DT_NEEDED
  rewriting was already gated on per-artifact deps.
- Mach-O: filter `install_name_changes` to entries the artifact
  actually imports (mirroring the existing ELF DT_NEEDED logic), and
  only call `patch_macho` + `ad_hoc_sign` when the filtered list is
  non-empty.
- PE: filter `replacements` per-artifact case-insensitively against
  `aa.external_libs[*].name` and skip `pe_patch::replace_needed` when
  empty.

The new default `patch_required` for `Repair` relies on the invariant
that `prepare_grafted_libs` builds `grafted` from
`audited.iter().flat_map(|a| &a.external_libs)`, so an artifact with
empty `external_libs` is guaranteed to share no install names with the
grafted set and is therefore left untouched by every current `patch`
implementation. `ElfRepairer::patch_required` keeps the override
(duplicating the default Repair arm because Rust traits can't override
a single match arm) only because its `Editable` arm still needs the
`linked_paths`-based prediction.

The two changes ship together: tightening only the default would
regress PyO3#2969 (finalize would rename patched bytes back to the cargo
output path); tightening only the impls without the prediction would
just be dead-code elimination with `copy_back_cargo_outputs` still
paying an unnecessary reflink for the now-unaffected artifacts.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
messense added a commit that referenced this pull request Apr 26, 2026
…#3159)

`fs::rename` in `finalize_staged_artifacts` to skip the full byte copy
on ext4. The trade-off was that whenever auditwheel / delocate /
delvewheel actually rewrote bytes (DT_NEEDED, Mach-O load commands, PE
imports, RPATH), the cargo output path was left empty (same-fs) or
holding the unpatched original (cross-device), and the staged copy was
never restored — see #2969 for why patched bytes must not flow back to
cargo's incremental cache. Tools and CI scripts that read
`target/<target>/release/<bin>` after a patched build saw a missing or
stale file.

Restore the invariant that the cargo output path always holds the clean
unpatched bytes after a successful build, while keeping the rename-back
fast-path for the unpatched case.

The repair pipeline now splits into a detect step and a patch step:

- New `PatchKind<'a>` enum carries the per-call context (`Repair {
grafted, libs_dir, artifact_dir }` or `Editable`) and is used by both
prediction and execution.
- New `WheelRepairer::patch_required(audited, kind) -> Vec<bool>`
predicts, before any bytes are rewritten, which audited artifacts the
upcoming `patch` call will modify in place. Conservative defaults:
all-true on `Repair`, all-false on `Editable`. `ElfRepairer` overrides
the editable arm to track `linked_paths.is_empty()` per-artifact so
multi-bin editable builds where only some bins carry cargo-target search
paths still get the cheap rename-back for the rest.
- `WheelRepairer::patch_editable` is folded into `patch` via
`PatchKind::Editable`, removing the parallel API surface.
- New `copy_back_cargo_outputs` helper in `build_context/repair.rs`
reflinks/copies the still-clean staged artifact back to its cargo output
path for every `will_patch[i] == true` entry, then `take()`s
`cargo_output_path` so finalize won't overwrite the unpatched bytes with
the patched ones. Cross-device staging is a no-op (the original is
already at the cargo path); same-filesystem staging pays one reflink per
genuinely-patched artifact (O(1) on apfs/btrfs/xfs/refs, a real copy on
ext4) — same cost as the pre-#3155 worst case, but scoped only to
artifacts that actually need patching.
- `add_external_libs` calls `patch_required` then
`copy_back_cargo_outputs` then `patch` on both the editable and
non-editable branches. The unconditional `cargo_output_path = None`
clearing loop is gone — clearing is now per-artifact and happens in the
helper.

Net effect:

- Unpatched artifacts: identical to today — single rename-back in
`finalize_staged_artifacts`.
- Patched artifacts (same-fs): one reflink/copy at patch time instead of
the rename-back; cargo output ends up with unpatched bytes.
- Patched artifacts (cross-device): zero extra I/O — `stage_file`
already left the unpatched original at cargo output.

Public auditwheel types (`AuditedArtifact`, `GraftedLib`, `AuditResult`,
`PatchKind`) now derive `Debug` for consistency with the rest of the
public surface.

The follow-up commit tightens each platform's `patch` impl to skip
artifacts whose `external_libs` is empty (ELF: skip `set_rpath`; Mach-O:
filter `install_name_changes` per-artifact and skip `patch_macho` +
`ad_hoc_sign` when the filter is empty; PE: filter `replacements`
per-artifact case-insensitively against `aa.external_libs[*].name` and
skip `pe_patch::replace_needed` when empty), and loosens the default
`patch_required` for `Repair` to `!aa.external_libs.is_empty()`. The two
changes ship together — tightening only the default would regress #2969,
while tightening only the impls would just be dead-code elimination with
`copy_back_cargo_outputs` still paying an unnecessary reflink for the
now-unaffected artifacts.

### Known limitation: dlopen-only dependencies

`external_libs` is populated by static dependency analysis (lddtree on
the artifact's LC_LOAD_DYLIB / DT_NEEDED / PE import table). A library
reached only via runtime loading — `dlopen("libfoo.so", ...)`, `dlsym`,
`LoadLibrary` — won't appear in `external_libs`, so its host artifact
gets `will_patch = false` and is skipped by the tightened `patch` impls.

For Mach-O install-name rewriting and PE import-table rewriting this is
benign: a `dlopen` doesn't go through the load command / import table,
so rewriting them was a no-op anyway. ELF DT_NEEDED rewriting was
already gated on `aa.external_libs` pre-PR for the same reason.

The new behavior in this PR widens the gating to **ELF RPATH append**:
pre-PR, every audited artifact got `$ORIGIN/<dist>.libs` appended
unconditionally; post-PR, only those with non-empty `external_libs` do.
Practically: a Linux extension that statically depends on nothing but
`dlopen`s a library bundled by some other artifact in the same wheel
will now lack the RPATH and fail at runtime. Workaround: declare a
static dependency on at least one bundled lib (so the artifact gets the
auto-RPATH), or set the RPATH manually via
`cargo:rustc-link-arg=-Wl,-rpath,$ORIGIN/<dist>.libs`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Cannot maturin build twice without making modifications, with external .so

2 participants