fix(sdist): handle parent workspaces and refactor sdist generation#3055
Conversation
b9c7efd to
94711d4
Compare
There was a problem hiding this comment.
Pull request overview
Fixes a panic during maturin sdist when a path dependency’s workspace manifest lives outside the computed sdist root (e.g., excluded crate depending on a sibling workspace member), by avoiding an unconditional strip_prefix(...).unwrap() and adding logic to inline workspace-inherited Cargo.toml fields when the parent workspace manifest can’t be included.
Changes:
- Skip adding a dependency workspace
Cargo.tomlwhen it’s outside the sdist root (avoidStripPrefixErrorpanic). - Add workspace-inheritance inlining using
cargo metadatafor path dependencies whose workspace manifest is outside the sdist. - Add a regression test and a new fixture workspace to reproduce the parent-workspace path dependency scenario.
Reviewed changes
Copilot reviewed 9 out of 10 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
src/source_distribution.rs |
Avoid panic when workspace manifest is outside sdist root; adds workspace-inheritance inlining and refactors Cargo.toml rewriting helpers. |
tests/run.rs |
Adds regression test for parent-workspace path dependency sdist scenario. |
test-crates/parent_workspace_sdist/** |
New test workspace + crates to reproduce the issue. |
Comments suppressed due to low confidence (3)
src/source_distribution.rs:98
- PR description says sdists with
*.workspace = truewill still fail when the workspace manifest is outside the sdist, but the code now attempts to inline workspace-inherited fields viaresolved_package. Either update the PR description/release notes to reflect the new behavior, or scope the implementation back to “avoid panic only”.
/// Resolved package metadata from `cargo metadata`, used to inline
/// workspace-inherited fields when the workspace manifest is outside the sdist root.
resolved_package: Option<cargo_metadata::Package>,
src/source_distribution.rs:275
resolve_workspace_inheritanceintroduces fairly intricate rewriting of workspace-inherited package fields andworkspace = truedependencies, but there are no unit tests covering the rewritten TOML output (especially around dependency renames and non-registry workspace deps). Adding targeted tests here would help prevent regressions and catch cases like alias/package-name handling.
/// Inlines workspace-inherited fields in a path dependency's `Cargo.toml`
/// using resolved values from `cargo metadata`.
///
/// This is needed when the dependency's workspace manifest falls outside the
/// sdist root (e.g. a crate excluded from a parent workspace that depends on
/// sibling workspace members). Without this, `field.workspace = true` entries
/// would fail to resolve when building from the sdist.
fn resolve_workspace_inheritance(document: &mut DocumentMut, resolved: &cargo_metadata::Package) {
// Resolve `[package]` fields that support `field.workspace = true`
if let Some(package) = document.get_mut("package").and_then(|p| p.as_table_mut()) {
let version_str = resolved.version.to_string();
let edition_str = resolved.edition.to_string();
let rust_version_str = resolved.rust_version.as_ref().map(|v| v.to_string());
let string_fields: &[(&str, Option<&str>)] = &[
("version", Some(&version_str)),
("edition", Some(&edition_str)),
("description", resolved.description.as_deref()),
("license", resolved.license.as_deref()),
("repository", resolved.repository.as_deref()),
("homepage", resolved.homepage.as_deref()),
("documentation", resolved.documentation.as_deref()),
("rust-version", rust_version_str.as_deref()),
];
tests/run.rs:913
- This regression test only asserts the sdist file list; it doesn’t validate that the generated
shared_crate/Cargo.tomlis actually buildable (e.g.,edition.workspace = trueshould be inlined toedition = "2021"when the parent workspace manifest isn’t included). Consider using theexpected_cargo_tomlhook to assert the rewritten manifest contents for at least the inherited fields.
/// Test sdist with path dependency from parent workspace.
/// Reproduces the scenario where a crate is excluded from a parent workspace
/// but depends on sibling crates that ARE in the parent workspace.
/// Before the fix, this would panic with StripPrefixError.
#[test]
fn lib_with_parent_workspace_path_dep_sdist() {
handle_result(other::test_source_distribution(
"test-crates/parent_workspace_sdist/crates/pysof",
SdistGenerator::Cargo,
expect![[r#"
{
"pysof-0.1.0/PKG-INFO",
"pysof-0.1.0/pyproject.toml",
"pysof-0.1.0/pysof/.gitignore",
"pysof-0.1.0/pysof/Cargo.lock",
"pysof-0.1.0/pysof/Cargo.toml",
"pysof-0.1.0/pysof/src/lib.rs",
"pysof-0.1.0/shared_crate/Cargo.toml",
"pysof-0.1.0/shared_crate/src/lib.rs",
}
"#]],
None,
"sdist-lib-with-parent-workspace-path-dep",
))
}
52d74c6 to
5891d00
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 9 out of 10 changed files in this pull request and generated no new comments.
Comments suppressed due to low confidence (4)
src/source_distribution.rs:434
- When
targetisNone, this can still match a target-specific dependency entry fromcargo metadata(i.e. whered.target.is_some()), depending on iteration order. That can cause the base[dependencies]workspace entries to be resolved using the wrong target cfg. Consider adding an explicit branch: iftarget.is_none(), required.target.is_none()(and similarly keep the existing equality check whentarget.is_some()).
fn find_resolved_dep(
resolved: &cargo_metadata::Package,
name: &str,
target: Option<&str>,
) -> Option<ResolvedDep> {
resolved.dependencies.iter().find_map(|d| {
// Match by name, considering renames
let matches_name = d.rename.as_deref() == Some(name) || d.name == name;
if !matches_name {
return None;
}
// If a target is specified, match against it
if let Some(target_str) = target
&& d.target.as_ref().map(|t| t.to_string()).as_deref() != Some(target_str)
{
return None;
}
src/source_distribution.rs:1787
- On Windows (or any platform with path prefixes), if
common_path_prefix(from, to)returnsNone(e.g. different drive letters / prefixes),to_restcan remain absolute;result.push(to_rest)then produces an absolute path. Emitting an absolutepath = \"...\"into rewrittenCargo.tomlis usually undesirable for sdists and can lead to non-reproducible builds. Consider detecting the 'no common prefix' case and returning an error (change return type toResult<PathBuf>) or otherwise handling differing prefixes explicitly (e.g. refuse to rewrite to a relative path when impossible).
fn relative_path(from: &Path, to: &Path) -> PathBuf {
let common = common_path_prefix(from, to).unwrap_or_default();
let from_rest = from.strip_prefix(&common).unwrap_or(Path::new(""));
let to_rest = to.strip_prefix(&common).unwrap_or(to);
let mut result = PathBuf::new();
for _ in from_rest.components() {
result.push("..");
}
result.push(to_rest);
result
}
src/source_distribution.rs:1860
- This unit test hard-codes Unix-style absolute paths (e.g.
/a/b/c) and expectations using/separators. If CI runs tests on Windows, these cases are likely to fail or behave differently due to prefixes/drive letters and separator handling. Consider making the test platform-aware (e.g.#[cfg(unix)]+ a separate Windows test), or construct paths usingPathBufcomponents and compare usingPathBufvalues appropriate for the platform.
#[test]
fn test_relative_path() {
let test_cases = vec![
("/a/b/c", "/a/b/d", "../d"),
("/a/b", "/a/b/c/d", "c/d"),
("/a/b/c", "/a/b", ".."),
("/a/b/c", "/a/x/y", "../../x/y"),
("/a/b", "/a/b", ""),
];
for (from, to, expected) in test_cases {
assert_eq!(
relative_path(Path::new(from), Path::new(to)),
PathBuf::from(expected),
"relative_path({from:?}, {to:?}) should equal {expected:?}",
);
}
}
src/source_distribution.rs:260
- The new
resolve_workspace_inheritancebehavior is central to the fix, but there are no direct unit tests asserting thatfield.workspace = trueentries (e.g.edition.workspace = true) are actually rewritten to concrete values, and thatdep = { workspace = true }entries are replaced with the expected TOML. Adding a focused test that feeds a smallDocumentMutcontaining workspace-inherited fields and verifies the rewritten output would make regressions much easier to catch.
/// Inlines workspace-inherited fields in a path dependency's `Cargo.toml`
/// using resolved values from `cargo metadata`.
///
/// This is needed when the dependency's workspace manifest falls outside the
/// sdist root (e.g. a crate excluded from a parent workspace that depends on
/// sibling workspace members). Without this, `field.workspace = true` entries
/// would fail to resolve when building from the sdist.
fn resolve_workspace_inheritance(document: &mut DocumentMut, resolved: &cargo_metadata::Package) {
d727f4c to
2b45143
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 10 out of 11 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (2)
src/source_distribution.rs:1846
test_relative_pathhard-codes POSIX-style absolute paths and expected strings with/separators. On Windows,PathBufcomparisons are separator-sensitive andrelative_pathwill likely produce..\\d-style paths, so this test can fail depending on platform. Prefer constructing inputs/expected values usingPathBuf/Path::join(or compare normalized slash strings) to make the test platform-independent.
fn test_relative_path() {
let test_cases = vec![
("/a/b/c", "/a/b/d", "../d"),
("/a/b", "/a/b/c/d", "c/d"),
("/a/b/c", "/a/b", ".."),
("/a/b/c", "/a/x/y", "../../x/y"),
("/a/b", "/a/b", ""),
];
tests/run.rs:903
- The comment says "
readmeis removed" but the expectation below assertsreadme = "README.md"is present. It looks like the intent is that theworkspace = truemarker is removed and the path rewritten, not that thereadmefield is removed entirely.
// shared_crate uses `edition.workspace = true` and `readme.workspace = true`
// from the parent workspace. Since the parent workspace Cargo.toml is NOT
// included in the sdist, these must be inlined to their resolved values.
// `readme` is removed (the file is copied separately and the path rewritten),
// and `edition` is inlined to "2021".
When building a source distribution for a crate that is excluded from its parent workspace but depends on sibling crates in the parent workspace, maturin would panic with `StripPrefixError`. This occurred because the code assumed that a path dependency's workspace manifest would always be inside the sdist root, using `.unwrap()` on the result of `strip_prefix()`. Replace `.unwrap()` with `if let Ok()` to gracefully skip workspace manifests outside the sdist boundary. Note that path dependencies using workspace inheritance (e.g. `edition.workspace = true`) will still fail to build from the resulting sdist since the parent workspace manifest is not included. Closes PyO3#2766
- Merge `rewrite_cargo_toml_readme` and `rewrite_cargo_toml_license_file` into a single `rewrite_cargo_toml_package_field` function - Remove `resolve_and_add_readme` wrapper, use `resolve_and_add_file` directly - Simplify target dep handling in `resolve_workspace_inheritance` by using `get_mut` once instead of double traversal (read then mutate) - Use `.map().transpose()` and hoist shared `target_dir` computation for readme/license-file resolution in both main crate and path deps - Fix `!...is_ok()` to `.is_err()` in `add_path_dep` - Use `.with_context()` instead of `.context(format!())` in `parse_toml_file`
- Fix `package` field: when a dependency is renamed, emit `package = "<real_crate_name>"` (from `d.name`) instead of incorrectly emitting `package = "<alias>"` (from `d.rename`) - Preserve `path` for workspace-inherited path dependencies instead of emitting only `version`, which would lose source information and change resolution. Compute relative path from the dependent crate to the dependency directory using a new `relative_path` helper. - Add unit tests for `relative_path`, path dep TOML output, and renamed dep TOML output
When `resolve_workspace_inheritance` ran after `rewrite_cargo_toml_package_field`, it would overwrite the rewritten filename-only path with the full path from cargo metadata. Now `resolve_workspace_inheritance` just removes the `workspace = true` marker for `readme`/`license-file`, deferring to the caller which copies the file and rewrites the path. Add `readme.workspace = true` to the `parent_workspace_sdist` test fixture and verify that `shared_crate/Cargo.toml` in the sdist has both `edition` inlined and `readme` rewritten to just the filename.
Move source_distribution.rs to source_distribution/mod.rs to enable splitting into focused submodules in subsequent commits. No code changes.
Move is_compiled_artifact, normalize_path, common_path_prefix, and relative_path into source_distribution/utils.rs with their tests. These are pure functions with no sdist-specific coupling.
Move unpack_sdist() into its own submodule. This function is unrelated to building sdists (used only for unpacking before a build-from-sdist) and benefits from isolation.
…ions Move all Cargo.toml and pyproject.toml rewriting functions into a dedicated submodule (~580 lines): - parse_toml_file - rewrite_cargo_toml (workspace.members pruning) - rewrite_cargo_toml_targets (strip missing build targets) - rewrite_cargo_toml_package_field (readme/license-file path fix) - rewrite_pyproject_toml (manifest-path/python-source fix) - resolve_workspace_inheritance and helpers (inline workspace=true) - ResolvedDep, resolved_dep_to_toml Tests for these functions move with them.
Move PathDependency struct and find_path_deps() into a dedicated submodule. This isolates the cargo metadata graph traversal from the sdist assembly logic, making both easier to reason about. The test for workspace license-file resolution moves with it.
…e ManifestAsset Major restructuring of the sdist assembly logic: - Introduce ManifestAsset struct to cleanly separate resolution of readme/license-file paths from writing them to the sdist. This replaces the old resolve_and_add_file pattern that tangled validation with I/O. - Replace CrateRole enum with AddCrateOptions struct — flat flags instead of two-variant dispatch. Clearer what each call site needs. - Extract cargo_package_file_list() helper — isolates the subprocess call to `cargo package --list` for testability and reuse. - Break the 340-line add_cargo_package_files_to_sdist into focused phase helpers: * compute_sdist_root() — common ancestor calculation * compute_project_root() — outermost project root * add_main_crate() — root crate files + readme/license * add_cargo_lock() — Cargo.lock handling * add_workspace_manifest() — workspace Cargo.toml rewriting * add_pyproject_toml() — pyproject.toml rewriting * add_python_sources() — python source file collection The orchestrator is now a clear 6-step sequence.
- Fix stale comment referencing deleted resolve_and_add_file - Make ResolvedDep and resolved_dep_to_toml private (only used within cargo_toml_rewrite.rs) - Remove redundant outer .with_context on add_path_dep calls - Hoist path_dep_workspace_manifest to avoid computing it twice - Remove redundant skip_prefixes.is_empty() guard (.any() on empty iterator already returns false) - Remove duplicate test that exercised same code path as the pure resolve_manifest_asset test; simplify remaining test setup
…dule Move three pyproject.toml-related functions from mod.rs into a new source_distribution::pyproject submodule: - add_pyproject_toml: rewriting and adding pyproject.toml to the sdist - add_python_sources: adding python source packages - add_pyproject_metadata: adding readme, license, and include patterns This brings mod.rs from 1069 to 896 lines and gives pyproject.toml concerns a dedicated home alongside the existing submodules.
…rite_pyproject_toml
Three abstraction improvements from review:
1. SdistContext now holds a reference to BuildContext and provides a
new() constructor that encapsulates all path computation (sdist_root,
abs_manifest_dir, project_root, etc.). This eliminates build_context
as a separate parameter from every phase function — add_main_crate,
add_cargo_lock, add_workspace_manifest, add_pyproject_toml, and
add_python_sources all now take just (writer, ctx).
The orchestrator becomes a clean 6-phase sequence:
let ctx = SdistContext::new(build_context, pyproject_toml_path, root_dir)?;
add_path_dep / add_main_crate / add_cargo_lock / ...
2. Moved rewrite_pyproject_toml from cargo_toml_rewrite.rs to pyproject.rs.
It rewrites pyproject.toml, not Cargo.toml — now the module name matches
the content. cargo_toml_rewrite.rs is exclusively about Cargo.toml.
3. pyproject.rs is now the single home for all pyproject.toml concerns:
reading, rewriting paths, adding metadata (readme/license/includes),
and adding python source files.
2b45143 to
58919a5
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 16 out of 17 changed files in this pull request and generated no new comments.
Comments suppressed due to low confidence (3)
tests/run.rs:903
- The comment says "
readmeis removed", but this test’s expected Cargo.toml output includesreadme = "README.md". Please update the comment to match the actual behavior (i.e.,readmeis rewritten to a concrete path/filename rather than removed).
// shared_crate uses `edition.workspace = true` and `readme.workspace = true`
// from the parent workspace. Since the parent workspace Cargo.toml is NOT
// included in the sdist, these must be inlined to their resolved values.
// `readme` is removed (the file is copied separately and the path rewritten),
// and `edition` is inlined to "2021".
src/source_distribution/utils.rs:76
- If
common_path_prefix(from, to)returnsNone(e.g., Windows paths on different drive letters),commonbecomes empty, andto_restcan remain absolute. In that caseresult.push(to_rest)will discard the accumulated..components and produce an absolute path, which is not a valid relative path. Handle theNonecase explicitly (e.g., returnto.to_path_buf()or otherwise signal “not computable”), and add a test that covers the “no common prefix” scenario.
pub(super) fn relative_path(from: &Path, to: &Path) -> PathBuf {
let common = common_path_prefix(from, to).unwrap_or_default();
let from_rest = from.strip_prefix(&common).unwrap_or(Path::new(""));
let to_rest = to.strip_prefix(&common).unwrap_or(to);
let mut result = PathBuf::new();
for _ in from_rest.components() {
result.push("..");
}
result.push(to_rest);
result
}
src/source_distribution/path_deps.rs:127
resolved_packageis cloned and stored for every discovered path dependency, but it’s only used later when the dependency’s workspace manifest falls outside the sdist root. Consider avoiding the unconditional clone by storing a lightweight reference key (e.g.,PackageId) and looking up the package incargo_metadata.packageswhen/if inlining is required, or only cloning when it’s definitively needed.
// The root cargo metadata already resolves all package fields
// (including workspace-inherited ones) for every package in the
// dependency graph, regardless of which workspace they belong to.
path_deps.insert(
dep_name.clone(),
PathDependency {
manifest_path: dep_manifest_path.into_std_path_buf(),
workspace_root: dep_workspace_root,
readme: dep_pkg
.readme
.as_ref()
.map(|r| r.clone().into_std_path_buf()),
license_file: dep_pkg
.license_file
.as_ref()
.map(|l| l.clone().into_std_path_buf()),
resolved_package: Some((*dep_pkg).clone()),
},
);
find_path_deps previously spawned a separate `cargo metadata` subprocess for every path dependency to discover its workspace root and resolved package data. For the common case where path deps are in the same workspace as the root crate, this is unnecessary — the root `cargo metadata` already contains: - workspace_root (known: it's the same workspace) - fully resolved Package fields (version, deps, license, etc.) - readme and license-file paths Now only cross-workspace path deps (rare) trigger a separate `cargo metadata` call, and only to discover their workspace root. Also removes the intermediate pkg_readmes/pkg_license_files index maps — these just duplicated data already available on the Package struct we look up via packages_by_id.
…ing wrong spec find_resolved_dep previously matched only by name/target, ignoring DependencyKind. When a crate declares the same dep in multiple tables (e.g. [dependencies] and [dev-dependencies]) with different settings, the wrong entry could be returned. Add a kind parameter to find_resolved_dep and pass the appropriate DependencyKind (derived from the dep_kind table name) from both resolve_workspace_deps and the target-specific loop. Also add a workspace dev-dependency (log) to the test crate to cover the dev-dependencies path, and update pyo3 to 0.27.0 to match other test crates.
58919a5 to
a2ddf59
Compare
1. tests/run.rs: Fix misleading comment — readme is rewritten to the filename (not removed) when workspace inheritance is inlined. 2. utils.rs: Handle no-common-prefix case in relative_path explicitly (e.g. different Windows drive letters) by returning the absolute target path. Add test coverage for this edge case. 3. path_deps.rs: Only clone resolved_package for cross-workspace deps. Same-workspace deps never need it (their workspace manifest is always inside the sdist root), avoiding an unnecessary deep clone of cargo_metadata::Package per dep.
- Match path dependencies using NodeDep dep_kinds/targets plus rename key to avoid ambiguous name-only lookups - Deduplicate by PackageId traversal (not dependency alias) so distinct crates that reuse the same alias are both included and traversed - Keep map keys stable for common cases, and disambiguate collisions with a name+package-id suffix when aliases collide - Add regression test for two distinct transitive path deps sharing alias under different parent crates
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 16 out of 18 changed files in this pull request and generated no new comments.
Comments suppressed due to low confidence (4)
tests/run.rs:896
- The PR description notes that path deps using workspace inheritance will still fail to build from the sdist, but this test (and the new inlining logic) asserts that
*.workspace = truefields are successfully inlined when the parent workspace manifest is outside the sdist root. Please align the PR description with the implemented behavior (or clarify the remaining unsupported cases).
/// Also verifies that workspace-inherited fields (`edition.workspace = true`,
/// `readme.workspace = true`) are correctly inlined in the path dependency's
/// Cargo.toml when the parent workspace manifest is outside the sdist root.
src/source_distribution/mod.rs:9
ModuleWriteris imported but never used in this module, which will trigger an unused-import warning (and likely fail CI if warnings are denied). Remove the import here (and keep the usage viacrate::module_writer::...where needed).
use crate::{BuildContext, ModuleWriter, PyProjectToml, SDistWriter, VirtualWriter};
use anyhow::{Context, Result, bail};
src/source_distribution/pyproject.rs:3
ModuleWriteris imported but never used in this file, which will cause an unused-import warning. Remove it from theuse crate::{...}list.
use crate::pyproject_toml::Format;
use crate::{ModuleWriter, PyProjectToml, SDistWriter, VirtualWriter};
use anyhow::{Context, Result};
src/source_distribution/utils.rs:71
- The doc comment for
relative_pathsays both inputs “should be absolute”, but the function is also used/tests cover relative inputs (e.g."foo"). Update the documentation to reflect the actual supported inputs/behavior (or enforce absoluteness in code/tests).
/// Compute a relative path from `from` directory to `to` path.
///
/// Both paths should be absolute. Returns a relative path that, when joined
/// with `from`, resolves to `to`. If the paths share no common prefix
/// (e.g. different Windows drive letters), the absolute `to` path is returned
/// as-is since no relative traversal is possible.
pub(super) fn relative_path(from: &Path, to: &Path) -> PathBuf {
let Some(common) = common_path_prefix(from, to) else {
// No common prefix — fall back to the absolute target path.
return to.to_path_buf();
sdist
This PR fixes and refactors
sdistgeneration, with a focus on path dependencies that come from parent workspaces.Fixes
When building a source distribution for a crate that is excluded from its parent workspace but depends on sibling crates in the parent workspace, maturin could panic or produce an sdist that failed to build.
This PR fixes those cases by:
[lints] workspace = true)package.build = ...entries when the referenced build script was excluded from the sdistRefactoring / simplification
This branch also includes a broader cleanup of the
sdistimplementation:source_distribution.rsinto focused submodulescargo metadatafor target discovery and source paths instead of reimplementing Cargo's target inference rulespublishvalues)Result
The generated sdist is now more robust for excluded crates that depend on crates from a parent workspace, and the implementation is smaller and closer to Cargo's own model.
Closes #2766