-
Notifications
You must be signed in to change notification settings - Fork 4
Description
The current StableHasher
implementation unconditionally uses SipHash13, which is the algorithm of choice for incremental compilation fingerprinting in the Rust compiler. However, for other use cases different hash functions with stronger guarantees, such as BLAKE3, are often preferable.
StableHasher
could therefore be generic of the hash algorithm used, e.g.:
/// Trait for retrieving the result of the stable hashing operation.
pub trait StableHasherResult<const W: usize>: Sized {
fn finish(hash: [u64; W]) -> Self;
}
/// The hash algorithm used
pub trait HashImpl<const W: usize>: Hasher + Debug + Default {
fn short_write<const LEN: usize>(&mut self, bytes: [u8; LEN]);
fn finish(self) -> [u64; W];
}
/// When hashing something that ends up affecting properties like symbol names,
/// we want these symbol names to be calculated independently of other factors
/// like what architecture you're compiling *from*.
///
/// To that end we always convert integers to little-endian format before
/// hashing and the architecture dependent `isize` and `usize` types are
/// extended to 64 bits if needed.
#[derive(Debug)]
pub struct StableHasher<const W: usize, H: HashImpl<W>> {
state: H,
}
See here for a quick and dirty POC impl: https://github.com/michaelwoerister/rustc-stable-hash/tree/generic_hash
I've been using hash: [u64; W]
just for simplicity. It might make more sense to make that hash: [u8; W]
to get rid of any endianess ambiguities.
Now that StableHasher
is available outside the compiler, it makes sense to provide different hash algorithms and to make it clear that SipHash13
is not a generically good choice.