Xutex: High-Performance Hybrid Mutex

Xutex is a high-performance mutex that seamlessly bridges synchronous and asynchronous Rust code with a single type and unified internal representation. Designed for extremely low-latency lock acquisition under minimal contention, it achieves near-zero overhead on the fast path while remaining runtime-agnostic.

Key Features

  • ⚡ Blazing-fast async performance: Up to 50× faster than standard sync mutexes in single-threaded async runtimes, and 3–5× faster in multi-threaded async runtimes under extreme contention
  • 🔄 Hybrid API: Use the same lock in both sync and async contexts
  • ⚡ 8-byte lock state: Single AtomicPtr on 64-bit platforms (guarded data stored separately)
  • 🚀 Zero-allocation fast path: Lock acquisition requires no heap allocation when uncontended
  • ♻️ Smart allocation reuse: Object pooling minimizes allocations under contention
  • 🎯 Runtime-agnostic: Works with Tokio, async-std, monoio, or any executor using std::task::Waker
  • 🔒 Lock-free fast path: Single CAS operation for uncontended acquisition
  • 📦 Minimal footprint: Compact state representation with lazy queue allocation
  • 🛡️ No-std compatible: Fully compatible with no_std environments, relying only on core and alloc

Installation

[dependencies]
xutex = "0.2"

Or via cargo:

# with std
cargo add xutex
# for no-std environments
cargo add xutex --no-default-features

Quick Start

Synchronous Usage

#[cfg(feature = "std")]
fn example() {
  use xutex::Mutex;

  let mutex = Mutex::new(0);
  {
    let mut guard = mutex.lock();
    *guard += 1;
  } // automatically unlocked on drop
  assert_eq!(*mutex.lock(), 1);
}

Asynchronous Usage

use xutex::AsyncMutex;

async fn increment(mutex: &AsyncMutex<i32>) {
    let mut guard = mutex.lock().await;
    *guard += 1;
}

Hybrid Usage

Convert seamlessly between sync and async:

#[cfg(feature = "std")]
use xutex::Mutex;

// Sync → Async: view a Mutex as an AsyncMutex and await the lock
#[cfg(feature = "std")]
async fn async_example(mutex: &Mutex<i32>) {
  let async_ref = mutex.as_async();
  let guard = async_ref.lock().await;
  drop(guard);
}

#[cfg(feature = "std")]
fn sync_example() {
  use xutex::{Mutex, AsyncMutex};
  // Async → Sync: view an AsyncMutex as a Mutex and block on the lock
  let async_mutex = AsyncMutex::new(5);
  let sync_ref: &Mutex<_> = async_mutex.as_sync();
  let guard = sync_ref.lock();
  drop(guard);
  // Block on the async mutex from a sync context
  let guard = async_mutex.lock_sync();
  drop(guard);
}
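
The README does not describe how lock_sync() blocks the calling thread. As a rough illustration of the general technique (not Xutex's implementation), a sync caller can drive any future with nothing but std::task::Waker and thread parking. The names ThreadWaker and block_on are hypothetical:

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Hypothetical waker that unparks the thread blocked in `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Drive a future to completion on the current thread (illustrative only).
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            // Sleep until the waker unparks us; a spurious wakeup just re-polls.
            Poll::Pending => thread::park(),
        }
    }
}
```

Because the sketch depends only on the standard Waker contract, it works with any future that wakes its task correctly, which is the same property that lets a lock stay runtime-agnostic.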

Performance Characteristics

Why It's Fast

  1. Atomic state machine: Four states encoded in a single pointer:

    • UNLOCKED (null): Lock is free
    • LOCKED (sentinel): Lock held, no waiters
    • UPDATING: Queue initialization in progress
    • QUEUE_PTR: Lock held with waiting tasks/threads
  2. Lock-free fast path: Uncontended acquisition uses a single compare_exchange

  3. Lazy queue allocation: Wait queue created only when contention occurs

  4. Pointer tagging: Least-significant-bit (LSB) tagging prevents race conditions during queue modifications

  5. Stack-allocated waiters: Signal nodes live on the stack, forming an intrusive linked list

  6. Optimized memory ordering: Careful use of Acquire/Release semantics

  7. Adaptive backoff: Exponential backoff reduces cache thrashing under contention

  8. Minimal heap allocation: At most one allocation per contended lock via pooled queue reuse; additional waiters require zero allocations
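
To make the fast path and the backoff idea concrete, here is a minimal sketch built on std only. It is not Xutex's code: RawLock, the sentinel value 1, and the backoff cap are assumptions, and the slow path simply spins instead of building a wait queue.

```rust
use std::hint;
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};

/// Hypothetical sentinel meaning "locked, no waiters". Address 1 is never a
/// valid aligned heap pointer, so it cannot collide with a real queue pointer.
const LOCKED: *mut u8 = 1 as *mut u8;

pub struct RawLock {
    state: AtomicPtr<u8>,
}

impl RawLock {
    pub const fn new() -> Self {
        Self { state: AtomicPtr::new(ptr::null_mut()) }
    }

    /// Fast path: one compare-exchange from UNLOCKED (null) to LOCKED.
    pub fn try_lock(&self) -> bool {
        self.state
            .compare_exchange(ptr::null_mut(), LOCKED, Ordering::Acquire, Ordering::Relaxed)
            .is_ok()
    }

    /// Slow-path stand-in: spin with exponential backoff instead of queueing.
    pub fn lock(&self) {
        let mut spins = 1u32;
        while !self.try_lock() {
            for _ in 0..spins {
                hint::spin_loop();
            }
            // Double the wait each round, capped to keep latency bounded.
            spins = (spins * 2).min(1 << 10);
        }
    }

    /// Release: publish the critical section's writes and mark the lock free.
    pub fn unlock(&self) {
        self.state.store(ptr::null_mut(), Ordering::Release);
    }
}
```

The Acquire on success pairs with the Release in unlock, so writes made under the lock are visible to the next holder. A failed attempt needs no ordering, hence Relaxed.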

Benchmarks

Run benchmarks on your machine:

cargo bench

Expected Performance (varies by hardware):

  • Uncontended: ~1-3ns per lock/unlock cycle (single CAS operation)
  • High contention: 2-3× faster than tokio::sync::Mutex in async contexts
  • Sync contexts: Performance comparable to std::sync::Mutex with minimal overhead from queue pointer checks under high contention; matches parking_lot performance in low-contention scenarios
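
For a quick baseline outside the crate's benchmark suite, a rough timing loop like the one below can show what an uncontended lock/unlock cycle costs on your hardware. It measures std::sync::Mutex only, the function name is hypothetical, and a simple loop like this is far less rigorous than cargo bench:

```rust
use std::hint::black_box;
use std::sync::Mutex;
use std::time::Instant;

/// Average nanoseconds per uncontended lock/unlock cycle of `std::sync::Mutex`.
pub fn std_mutex_ns_per_cycle(iters: u32) -> f64 {
    let mutex = Mutex::new(0u64);
    let start = Instant::now();
    for _ in 0..iters {
        // The temporary guard is dropped at the end of the statement,
        // so each iteration is one full lock/unlock cycle.
        *black_box(&mutex).lock().unwrap() += 1;
    }
    let elapsed = start.elapsed();
    // Sanity check: every iteration actually ran under the lock.
    assert_eq!(*mutex.lock().unwrap(), u64::from(iters));
    elapsed.as_nanos() as f64 / f64::from(iters)
}
```

Build with --release before reading anything into the number; debug builds inflate it considerably.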

Design Deep Dive

Architecture

┌─────────────────────────────────────────────┐
│  Mutex<T> / AsyncMutex<T>                   │
│  ┌───────────────────────────────────────┐  │
│  │ MutexInternal<T>                      │  │
│  │  • queue: AtomicPtr<QueueStructure>   │  │
│  │  • inner: UnsafeCell<T>               │  │
│  └───────────────────────────────────────┘  │
└─────────────────────────────────────────────┘
         │
         ├─ UNLOCKED (null) ──────────────► Lock available
         │
         ├─ LOCKED (sentinel) ─────────────► Lock held, no waiters
         │
         └─ Queue pointer ─────────────────► Lock held, waiters queued
                   │
                   ▼
            ┌─────────────────┐
            │  SignalQueue    │
            │  (linked list)  │
            └─────────────────┘
                   │
                   ▼
            ┌─────────────────┐     ┌─────────────────┐
            │    Signal       │────►│    Signal       │────► ...
            │  • waker        │     │  • waker        │
            │  • value        │     │  • value        │
            └─────────────────┘     └─────────────────┘
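
The layout in the diagram can be approximated in a few lines of Rust. This is a sketch under assumptions, not the crate's source: the two field names come from the diagram, while the QueueStructure contents, the sentinel value, LockState, and mark_locked are placeholders invented for illustration.

```rust
use std::cell::UnsafeCell;
use std::mem::size_of;
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};

/// Placeholder for the wait queue; its real fields are not shown in the README.
pub struct QueueStructure {
    _private: usize,
}

/// Mirrors the two fields named in the diagram.
pub struct MutexInternal<T> {
    queue: AtomicPtr<QueueStructure>,
    inner: UnsafeCell<T>,
}

/// Decoded view of the pointer word, following the diagram's three arrows.
#[derive(Debug, PartialEq)]
pub enum LockState {
    Unlocked,
    LockedNoWaiters,
    LockedWithQueue,
}

/// Hypothetical sentinel value; the crate's actual sentinel may differ.
const LOCKED_SENTINEL: usize = 1;

impl<T> MutexInternal<T> {
    pub fn new(value: T) -> Self {
        Self { queue: AtomicPtr::new(ptr::null_mut()), inner: UnsafeCell::new(value) }
    }

    /// Classify the current pointer word.
    pub fn state(&self) -> LockState {
        let raw = self.queue.load(Ordering::Acquire);
        if raw.is_null() {
            LockState::Unlocked
        } else if raw as usize == LOCKED_SENTINEL {
            LockState::LockedNoWaiters
        } else {
            LockState::LockedWithQueue
        }
    }

    /// Illustration helper: store the sentinel as if the lock were taken.
    pub fn mark_locked(&self) {
        self.queue.store(LOCKED_SENTINEL as *mut QueueStructure, Ordering::Release);
    }

    pub fn into_inner(self) -> T {
        self.inner.into_inner()
    }
}

/// Size of the lock word alone: one pointer, i.e. 8 bytes on 64-bit targets.
pub fn lock_word_size() -> usize {
    size_of::<AtomicPtr<QueueStructure>>()
}
```

The guarded value sits next to the pointer word inside the same struct, so the total size of the mutex is the pointer plus T plus any padding.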

Signal States

Each waiter tracks its state through atomic transitions:

  1. SIGNAL_UNINIT (0): Initial state
  2. SIGNAL_INIT_WAITING (1): Enqueued and waiting
  3. SIGNAL_SIGNALED (2): Lock granted
  4. SIGNAL_RETURNED (!0): Guard has been returned
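
One way to picture these transitions is a small wrapper around an atomic integer where each step is a compare-exchange from the expected previous state. The constant values are the ones listed above; the usize width, the SignalState type, its method names, and the strict linear ordering of transitions are assumptions made for this sketch.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Values as listed in the README; the integer width is an assumption.
pub const SIGNAL_UNINIT: usize = 0;
pub const SIGNAL_INIT_WAITING: usize = 1;
pub const SIGNAL_SIGNALED: usize = 2;
pub const SIGNAL_RETURNED: usize = !0;

/// Hypothetical per-waiter state cell.
pub struct SignalState(AtomicUsize);

impl SignalState {
    pub const fn new() -> Self {
        Self(AtomicUsize::new(SIGNAL_UNINIT))
    }

    /// Move `from -> to` only if the current state is exactly `from`.
    fn transition(&self, from: usize, to: usize) -> bool {
        self.0
            .compare_exchange(from, to, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
    }

    /// Waiter has been linked into the queue.
    pub fn enqueue(&self) -> bool {
        self.transition(SIGNAL_UNINIT, SIGNAL_INIT_WAITING)
    }

    /// Previous holder hands the lock to this waiter.
    pub fn signal(&self) -> bool {
        self.transition(SIGNAL_INIT_WAITING, SIGNAL_SIGNALED)
    }

    /// Waiter observes the grant and takes its guard.
    pub fn take_guard(&self) -> bool {
        self.transition(SIGNAL_SIGNALED, SIGNAL_RETURNED)
    }

    pub fn get(&self) -> usize {
        self.0.load(Ordering::Acquire)
    }
}
```

Using compare-exchange for every step means an out-of-order or duplicate transition fails visibly instead of silently overwriting the state.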

Thread Safety

  • Public API: 100% safe Rust
  • Internal implementation: Carefully controlled unsafe blocks for:
    • Queue manipulation (pointer tagging prevents use-after-free)
    • Guard creation (guaranteed by state machine)
    • Memory ordering (documented and audited)
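
The mechanics of LSB pointer tagging are easy to show in isolation. The sketch below only demonstrates setting, testing, and clearing the low bit of an aligned pointer; how Xutex interprets a tagged pointer is not documented here, and Node, tag, untag, and is_tagged are hypothetical names.

```rust
/// Hypothetical node type. Any type with alignment >= 2 leaves bit 0 of its
/// address unused, so that bit can carry a flag.
#[repr(align(8))]
pub struct Node(pub u64);

const TAG: usize = 1;

/// Set the low bit, marking the pointer (for example as "being modified").
pub fn tag(ptr: *mut Node) -> *mut Node {
    ((ptr as usize) | TAG) as *mut Node
}

/// Clear the low bit to recover the real address before dereferencing.
pub fn untag(ptr: *mut Node) -> *mut Node {
    ((ptr as usize) & !TAG) as *mut Node
}

/// Check whether the low bit is set.
pub fn is_tagged(ptr: *mut Node) -> bool {
    ((ptr as usize) & TAG) != 0
}
```

A tagged pointer must never be dereferenced directly; callers always go through untag first.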

API Reference

Mutex<T>

| Method | Description |
| --- | --- |
| new(data: T) | Create a new synchronous mutex |
| lock() | Acquire the lock (blocks current thread) |
| try_lock() | Attempt non-blocking acquisition |
| lock_async() | Acquire asynchronously (returns Future) |
| as_async() | View as &AsyncMutex<T> |
| to_async() | Convert to AsyncMutex<T> |
| to_async_arc() | Convert Arc<Mutex<T>> to Arc<AsyncMutex<T>> |

AsyncMutex<T>

| Method | Description |
| --- | --- |
| new(data: T) | Create a new asynchronous mutex |
| lock() | Acquire the lock (returns Future) |
| try_lock() | Attempt non-blocking acquisition |
| lock_sync() | Acquire synchronously (blocks current thread) |
| as_sync() | View as &Mutex<T> |
| to_sync() | Convert to Mutex<T> |
| to_sync_arc() | Convert Arc<AsyncMutex<T>> to Arc<Mutex<T>> |

MutexGuard<'a, T>

Implements Deref<Target = T> and DerefMut for transparent access to the protected data. Automatically releases the lock on drop.
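
The guard pattern itself can be sketched with a toy spin lock. This is not Xutex's guard: SpinLock and SpinGuard are hypothetical types that only show how Deref, DerefMut, and Drop combine to give transparent access and automatic release.

```rust
use std::cell::UnsafeCell;
use std::hint;
use std::marker::PhantomData;
use std::ops::{Deref, DerefMut};
use std::sync::atomic::{AtomicBool, Ordering};

/// Toy spin lock used only to illustrate the guard pattern.
pub struct SpinLock<T> {
    locked: AtomicBool,
    data: UnsafeCell<T>,
}

// SAFETY: access to `data` is serialized by `locked`.
unsafe impl<T: Send> Sync for SpinLock<T> {}

pub struct SpinGuard<'a, T> {
    lock: &'a SpinLock<T>,
    // Raw-pointer marker keeps the guard on the thread that acquired it.
    _not_send: PhantomData<*mut ()>,
}

impl<T> SpinLock<T> {
    pub const fn new(data: T) -> Self {
        Self { locked: AtomicBool::new(false), data: UnsafeCell::new(data) }
    }

    pub fn lock(&self) -> SpinGuard<'_, T> {
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            hint::spin_loop();
        }
        SpinGuard { lock: self, _not_send: PhantomData }
    }

    pub fn is_locked(&self) -> bool {
        self.locked.load(Ordering::Relaxed)
    }
}

impl<T> Deref for SpinGuard<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        // SAFETY: the guard exists, so this thread holds the lock.
        unsafe { &*self.lock.data.get() }
    }
}

impl<T> DerefMut for SpinGuard<'_, T> {
    fn deref_mut(&mut self) -> &mut T {
        // SAFETY: as above; `&mut self` rules out aliasing through this guard.
        unsafe { &mut *self.lock.data.get() }
    }
}

impl<T> Drop for SpinGuard<'_, T> {
    fn drop(&mut self) {
        // Releasing in Drop is what makes "unlock on scope exit" automatic.
        self.lock.locked.store(false, Ordering::Release);
    }
}
```

Tying the guard's lifetime to a borrow of the lock means the compiler rejects any attempt to keep using the data after the guard is dropped.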

Use Cases

✅ Ideal For

  • High-frequency, low-contention async locks
  • Hybrid applications mixing sync and async code
  • Performance-critical sections with short critical regions
  • Runtime-agnostic async libraries
  • Situations requiring zero-allocation fast paths

⚠️ Not Ideal For

  • Predominantly synchronous workloads: In pure sync environments without async interaction, std::sync::Mutex may offer slightly better performance due to lower abstraction overhead
  • Read-heavy workloads: If your use case involves frequent reads with infrequent writes, consider using RwLock implementations (e.g., std::sync::RwLock or tokio::sync::RwLock) that allow multiple concurrent readers
  • Mutex poison state: Cases where std::sync::Mutex poisoning semantics are required
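
For the read-heavy case, this small std-only sketch shows the shape of the alternative: several threads hold read locks on a std::sync::RwLock concurrently after a single exclusive write. The function name and data are made up for illustration.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Sum a shared vector from `readers` threads that may hold read locks concurrently.
pub fn parallel_read_sum(readers: usize) -> u64 {
    let data = Arc::new(RwLock::new(vec![1u64, 2, 3]));
    // One infrequent write, taken exclusively.
    data.write().unwrap().push(4);

    let handles: Vec<_> = (0..readers)
        .map(|_| {
            let data = Arc::clone(&data);
            thread::spawn(move || {
                // Read guards do not exclude each other.
                let guard = data.read().unwrap();
                guard.iter().sum::<u64>()
            })
        })
        .collect();

    // Each reader sees all four elements, so each partial sum is 10.
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}
```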

Caveats

  • 8-byte claim: Refers only to the lock metadata, and only on 64-bit platforms; the guarded data T is stored separately
  • No poisoning: Unlike std::sync::Mutex, panics don't poison the lock
  • Sync overhead: Slight performance cost vs std::sync::Mutex in pure-sync scenarios (~1-5%)
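
For contrast with the "No poisoning" caveat, this is the std::sync::Mutex behaviour that Xutex does not replicate: once a thread panics while holding the guard, later lock() calls return an error. The function name is hypothetical and the example uses std only.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Returns true if `std::sync::Mutex` reports poisoning after a holder panics.
pub fn std_mutex_poisons_on_panic() -> bool {
    let mutex = Arc::new(Mutex::new(0));
    let worker = Arc::clone(&mutex);

    // The spawned thread panics while holding the guard.
    let result = thread::spawn(move || {
        let _guard = worker.lock().unwrap();
        panic!("simulated failure inside the critical section");
    })
    .join();
    assert!(result.is_err());

    // std marks the mutex as poisoned; subsequent lock() calls return Err.
    mutex.is_poisoned() && mutex.lock().is_err()
}
```

Code that relies on this error to detect a half-updated value needs its own consistency check when the lock does not poison.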

Testing

Run the test suite:

# Standard tests
cargo test

# With Miri (undefined behavior detection)
cargo +nightly miri test

# Benchmarks
cargo bench

TODO

  • Implement RwLock variant with shared/exclusive locking
  • Explore lock-free linked list implementation for improved wait queue performance

Contributing

Contributions are welcome! Please:

  1. Run cargo +nightly fmt and cargo clippy before submitting
  2. Add tests for new functionality
  3. Update documentation as needed
  4. Verify cargo +nightly miri test passes
  5. Note: This library is no-std compatible; use core and alloc instead of std. Ensure cargo test and cargo test --no-default-features run without warnings.

License

Licensed under the MIT License.


Author: Khashayar Fereidani
Repository: github.com/fereidani/xutex
