Problem
The libp2p rust bridge runs on its own thread (rustBridgeThread). onReqRespResponse (handling blocks received from peers) fires on that thread and calls directly into processBlockByRootChunk → chain.onBlock() → forkChoice.onBlock() (exclusive mutex.lock()) + forkChoice.updateHead() (exclusive lock).

Meanwhile the main libxev event loop calls forkChoice.onInterval() (also an exclusive mutex). During block sync, when many blocks are incoming, these exclusive locks contend and can stall both sync and the main tick.

Serving side (onReqRespRequest) is mostly fine:
- blocks_by_root: DB-only, no forkchoice lock ✅
- status: two brief lockShared() calls — OK, but blocked if a writer holds the exclusive lock

Fix

Response handling: onReqRespResponse should not call chain.onBlock() directly on the libp2p thread. Received blocks should be queued into a channel/ring buffer and consumed by the main event loop. This is exactly the architecture described in "Follow-up: Replace state_mutex with xev.Async lock-free dispatch for Rust→Zig network callbacks" #700 (xev.Async dispatch).

Serving status: Cache an atomically-updated Status snapshot (updated in onBlockFollowup whenever head/finalized changes). The status req/resp handler then serves it lock-free.

General: Any read from forkchoice for serving req/resp should use a snapshot — take the lock briefly, copy the needed fields, release, then do the work.
Related
- xev.Async lock-free dispatch (eliminates the need for any lock here)
- state_mutex as interim fix for the same class of problem