build: bump zig (LLVM IR attribute improvements)#29753
Conversation
|
Updated 9:54 AM PT - May 4th, 2026
❌ @Jarred-Sumner, your commit f3a7b87 has some failures in 🧪 To try this PR locally: bunx bun-pr 29753That installs a local version of the PR into your bun-29753 --bun |
There was a problem hiding this comment.
The diff is a one-line hash bump, but since this changes LLVM codegen attributes (notably dereferenceable and trunc nuw/nsw) across the entire Zig build and the Bun CI box is still unchecked, I'll leave this for a human to confirm once CI is green.
Extended reasoning...
Overview
This PR updates a single constant, ZIG_COMMIT in scripts/build/zig.ts, to pin a newer build of the oven-sh/zig fork. No other code is touched; the build script logic, download URL format, and stamp handling are unchanged.
Security risks
None from the diff itself — it swaps one commit hash for another in a release-download URL pointing at oven-sh's own zig fork. No new external sources, no shell-injection surface, no auth/crypto involvement.
Level of scrutiny
The textual change is trivial, but the semantic blast radius is the entire compiled binary: the new compiler emits dereferenceable(N) on by-ref params, nuw/nsw on @intCast truncations, and dead_on_unwind/writable on sret. These attributes license LLVM to perform speculative loads and more aggressive poison propagation, which can surface latent UB in Bun's Zig code that previously happened to work. This is exactly the class of change where the test suite — not code review of a hash — is the real gate.
Other factors
The PR's own test plan lists Bun CI as unchecked. Zig's behavior suite and cross-target builds passing is good signal for compiler correctness, but not for Bun-specific UB exposure. Given that, I'd rather a human glance at the Bun CI results before this lands than rubber-stamp a toolchain bump with codegen-semantics implications.
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughPins a new Zig toolchain commit and adds non-inserting Changes
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
|
This PR may be a duplicate of:
🤖 Generated with Claude Code |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/allocators.zig`:
- Around line 597-599: The doc comment for the peek read-only lookup
(function/method named peek) incorrectly claims the cache-hit path is lock-free
while the implementation acquires self.mutex; change the wording to say peek
"avoids the caller’s coarse lock" and does not take the caller's coarse lock,
rather than asserting it is lock-free, so callers won't assume no mutex is
taken; update the comment near the peek definition in allocators.zig to reflect
this and remove the phrase "lock-free" while keeping the note about relying on
the map's fine-grained mutex.
In `@src/resolver/resolver.zig`:
- Around line 2727-2738: The fast-path returns a cached DirInfo via
r.dir_cache.atIndex(hit.index) without holding r.mutex while entries are
published before full initialization (DirInfo is put empty then populated
later), allowing other threads to observe a partially initialized DirInfo; fix
by removing the unsynchronized early return: when
r.dir_cache.peek(path_without_trailing_slash) yields a hit you must acquire
r.mutex (r.mutex.lock()) and re-check the cache/status under the lock (or
wait/loop until hit.status != .unknown) before returning the DirInfo, ensuring
the entry published by r.dir_cache.put(...) is fully initialized before being
returned.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: e25b767e-9305-49a3-9959-3a41234cf6d0
📒 Files selected for processing (2)
src/allocators.zigsrc/resolver/resolver.zig
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/resolver/resolver.zig (1)
2312-2327:⚠️ Potential issue | 🟠 MajorSerialize this
reserve()/publish()slow path too.Unlike
dirInfoCachedMaybeLog(), this path never takesr.mutex. Two threads hitting the same uncacheddir_pathcan both observe.unknown, eachreserve()a different slot, and then race topublish(). The hash map ends up pointing at whichever publish ran last, while the other reservedDirInfoslot is leaked and can still be referenced by parent/child indices built during that thread’s walk.🔒 Suggested fix
const dir_path = strings.withoutTrailingSlashWindowsPath(dir_path_maybe_trail_slash); assertValidCacheKey(dir_path); + r.mutex.lock(); + defer r.mutex.unlock(); + var dir_cache_info_result = bun.handleOom(r.dir_cache.getOrPut(dir_path)); if (dir_cache_info_result.status == .exists) { // we've already looked up this package before return r.dir_cache.atIndex(dir_cache_info_result.index).?; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/resolver/resolver.zig` around lines 2312 - 2327, The uncached slow path calls r.dir_cache.reserve(...) and r.dir_cache.publish(...) without taking r.mutex, allowing two threads to reserve different DirInfo slots for the same dir_path and race on publish; serialize this path by acquiring the same lock used in dirInfoCachedMaybeLog (r.mutex) around the reserve/publish sequence (or re-check cache under the lock and only reserve if still unknown), ensure dirInfoUncached is still called with the reserved slot (dir_info_ptr) while holding or after re-verifying under the lock, and only call r.dir_cache.publish(dir_cache_info_result) while holding r.mutex to prevent leaks and races.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@src/resolver/resolver.zig`:
- Around line 2312-2327: The uncached slow path calls r.dir_cache.reserve(...)
and r.dir_cache.publish(...) without taking r.mutex, allowing two threads to
reserve different DirInfo slots for the same dir_path and race on publish;
serialize this path by acquiring the same lock used in dirInfoCachedMaybeLog
(r.mutex) around the reserve/publish sequence (or re-check cache under the lock
and only reserve if still unknown), ensure dirInfoUncached is still called with
the reserved slot (dir_info_ptr) while holding or after re-verifying under the
lock, and only call r.dir_cache.publish(dir_cache_info_result) while holding
r.mutex to prevent leaks and races.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 2e720b6f-7e08-41ff-b5c4-60bb2739d424
📒 Files selected for processing (2)
src/allocators.zigsrc/resolver/resolver.zig
|
Re: |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/resolver/resolver.zig`:
- Around line 2312-2314: The reservation for DirInfo is being published on
error, exposing partially initialized state; change the error-path cleanup so
the reservation is aborted/rolled back instead of published: replace the
errdefer r.dir_cache.publish(dir_cache_info_result) with an errdefer
r.dir_cache.abort(dir_cache_info_result) (or remove errdefer and call
r.dir_cache.publish(dir_cache_info_result) only after dirInfoUncached and all
fallible work complete), ensuring r.dir_cache.reserve/dir_cache_info_result and
dirInfoUncached only publish on successful completion (also apply the same
change for the analogous reserve/publish at lines 3025-3026).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 75cf24f9-933a-415d-8aa2-005ab17fa2cd
📒 Files selected for processing (2)
src/allocators.zigsrc/resolver/resolver.zig
| const dir_info_ptr = r.dir_cache.reserve(&dir_cache_info_result, .{}) catch unreachable; | ||
| errdefer r.dir_cache.publish(dir_cache_info_result); | ||
|
|
There was a problem hiding this comment.
Abort failed reservations instead of publishing them.
dirInfoUncached() writes into DirInfo before later fallible work (notably the tsconfig extends walk). With errdefer r.dir_cache.publish(...), a failure after those mutations makes the slot visible as .exists, so later peek() / fast-path reads can return a partially initialized DirInfo and never retry the failed population. The error path needs a rollback/discard, not publish().
🛠️ Suggested direction
--- a/src/allocators.zig
+++ b/src/allocators.zig
@@
+ /// Drop a previously reserved entry so a later lookup can retry.
+ pub fn discard(self: *Self, result: Result) void {
+ self.mutex.lock();
+ defer self.mutex.unlock();
+ _ = self.index.remove(result.hash);
+ }- errdefer r.dir_cache.publish(dir_cache_info_result);
+ errdefer r.dir_cache.discard(dir_cache_info_result);
@@
- errdefer r.dir_cache.publish(queue_top.result);
+ errdefer r.dir_cache.discard(queue_top.result);Also applies to: 3025-3026
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/resolver/resolver.zig` around lines 2312 - 2314, The reservation for
DirInfo is being published on error, exposing partially initialized state;
change the error-path cleanup so the reservation is aborted/rolled back instead
of published: replace the errdefer r.dir_cache.publish(dir_cache_info_result)
with an errdefer r.dir_cache.abort(dir_cache_info_result) (or remove errdefer
and call r.dir_cache.publish(dir_cache_info_result) only after dirInfoUncached
and all fallible work complete), ensuring
r.dir_cache.reserve/dir_cache_info_result and dirInfoUncached only publish on
successful completion (also apply the same change for the analogous
reserve/publish at lines 3025-3026).
There was a problem hiding this comment.
Thanks — 1123c29 addresses both the errdefer publish leak and the double-lock in put(). I'm not auto-approving because this combines a toolchain-wide compiler bump with a new concurrency primitive (reserve/publish) on the resolver hot path, and the full test suite is still pending; worth a human sign-off.
Extended reasoning...
Overview
The PR bundles three things: (1) a ZIG_COMMIT bump in scripts/build/zig.ts that changes LLVM IR attribute emission for all Zig code in Bun, (2) new peek()/reserve()/reserveLocked()/publish() APIs on BSSMap in src/allocators.zig, and (3) a fast-path in dirInfoCachedMaybeLog (src/resolver/resolver.zig) that moves the r.mutex acquisition below a read-only peek(), plus migration of two write sites from put() to reserve→populate→publish.
Security risks
None identified. No auth, crypto, network, or untrusted-input parsing is touched. The concurrency change is a correctness/perf concern, not a security boundary.
Level of scrutiny
High. The compiler bump alone affects every Zig function's codegen and the description documents non-trivial behavior shifts (inlining thresholds, branch→cmov). The resolver change reorders lock acquisition in a multi-threaded hot path and introduces a publish-after-populate protocol that already required two follow-up fixes during review (562a29c for the half-init race, 1123c29 for the errdefer leak and double-lock). The happens-before edge now looks correct (writer's DirInfo stores → publish() mutex release → reader's peek() mutex acquire → atIndex() load), and the loop-scoped errdefer at resolver.zig:3026 correctly restores pre-PR semantics on the narrow tsconfig-overflow error path. But this is exactly the kind of change where a maintainer should confirm the threading model and weigh whether the resolver optimization belongs in the same PR as the compiler bump.
Other factors
- The PR description's test-plan checkbox for the full Bun test suite (build #48153) is still unchecked.
- The branch went through three substantive revisions in response to bot review; all flagged issues now appear addressed in the diff, and the bug-hunting pass on 1123c29 found nothing new.
src/resolver/resolver.zigis core tobun build/bun installand is not a low-risk file.- The author noted a pre-existing unsynchronized double-allocate in
dirInfoForResolutionas out-of-scope follow-up; that's reasonable but worth a maintainer ack.
Picks up oven-sh/zig#21 which adds: - dereferenceable(N) on *T pointer params and by-ref aggregate params - writable/dead_on_unwind/noundef/align on sret - trunc nuw/nsw and zext nneg for unchecked @intcast (replaces llvm.assume so functions stay memory(none)) - fneg fast under @setFloatMode(.optimized) - optsize for @branchHint(.cold) functions - index-counted @Memset loop for non-byte elements - 16-byte over-align for large anonymous constants
With dereferenceable(N) on *Printer, LLVM speculates the early-return guards in these helpers and drops them below the inline threshold, inlining ~42 copies into printStmt. Each copy drags in the ensureCapacity + memcpy slow path that is dead under --minify but bloats icache on the serial print path. three.js x10 --minify: 134.5ms -> ~129ms (back to baseline) while keeping the -6-9% user-CPU win from the compiler change.
This reverts commit 5c21d35.
The coarse `resolver_Mutex` is held for the entirety of `dirInfoCachedMaybeLog`, including the common case where the directory is already in `dir_cache`. With faster parallel parse workers this became the dominant contention point (futex wait 0.83s -> 2.04s on three.js x10). `dir_cache` (BSSMap) already has its own fine-grained mutex, so a read-only `peek()` is safe without the coarse lock. Check for a resolved entry first and only acquire `r.mutex` for the slow path that walks the directory tree and populates the cache. The existing `getOrPut()` re-check after acquiring the lock handles the race where another thread populated the entry in between. The slow path keeps the coarse lock per the existing comment about React-admin performance.
The fast path could observe a half-initialized DirInfo: dir_cache.put() made the hash->index mapping visible before dirInfoUncached() filled in the entry, so a concurrent peek() saw status=.exists and returned the empty struct. Split BSSMap.put into reserve (allocate index + storage, leave hash map at Unassigned) and publish (update hash map). Call sites now reserve, populate via dirInfoUncached, then publish. peek() sees .unknown until publish, so it falls through to the slow path which blocks on r.mutex.
- Add errdefer publish() at both reserve sites so an error from dirInfoUncached doesn't leak the reserved slot in an Unassigned state (which would make every retry leak another slot). - Keep BSSMap.put() to a single critical section by sharing reserveLocked() with reserve(), avoiding the extra lock cycle for callers that don't need the split.
f0fb207 to
f3a7b87
Compare
Bumps the Zig compiler to pick up oven-sh/zig#21, plus a resolver lock-contention fix to absorb the speedup.
What changed in the compiler
Zig's LLVM backend now emits several attributes/flags that Clang emits for equivalent C/C++ but Zig was omitting:
dereferenceable(N)on*Tparams and by-ref aggregate params → enables speculative loads, branch-to-select, loop vectorizationwritable+dead_on_unwind+noundef+alignon sret → call-slot copy eliminationtrunc nuw/nsw/zext nnegfor@intCastinstead ofllvm.assume→ functions containing@intCaststaymemory(none)fastflag onfnegunder@setFloatMode(.optimized)optsizefor@branchHint(.cold)functions@memsetloop for non-byte elements (SCEV-friendly)What changed in Bun
The compiler bump alone made parallel parse workers ~8% faster, which exposed contention on
resolver_Mutex(futex wait 0.83s → 2.04s) and turned into a +5% wall-clock regression onbun build. Fixed by adding a lock-free fast path todirInfoCached: a newBSSMap.peek()returns cache hits without taking the resolver's coarse lock, withreserve()/publish()ensuring entries are only visible once fully populated.Benchmarks (hyperfine, 25 runs, default MT)
--minifyrequire.resolve×180kbun outdatedbun install(cached)bun install(fresh)Run-to-run variance on three.js dropped 8.49% → 0.56% — the resolver mutex was a significant source of timing jitter.
Assembly comparison (linux-x64-baseline-profile)
cmov*call memcpyPer-function:
install.FolderResolution.getOrPut−52.3%,JSTranspiler.scanImports−25.7%,bundler.ParseTask.getAST−13.6%,MultiArrayList(InputFile).get−40.5% (stack frame 248→56 B).Test plan
build-zigjobs green across all targets