Make reads thread-safe by not reusing hasher by default#43
Merged
tylertreat merged 2 commits intotylertreat:masterfrom Nov 13, 2025
Merged
Make reads thread-safe by not reusing hasher by default#43tylertreat merged 2 commits intotylertreat:masterfrom
tylertreat merged 2 commits intotylertreat:masterfrom
Conversation
Contributor
dimitarvdimitrov
left a comment
There was a problem hiding this comment.
Alternately we can have a sync pool with hashers instead of allocating one on each call. Will that simplify the code?
Contributor
Author
|
I added some benchmarks to the pull request description.
@dimitarvdimitrov In this PR we're not allocating anything, just a local (stack) uint64 to hold the hash. A pool is pure overhead in this case for sharing objects that shouldn't exist to begin with. I think avoiding this overhead is worth the extra code of copying the FNV functions over. |
Owner
|
Thanks! |
This was referenced Nov 14, 2025
Merged
tcard
added a commit
to grafana/mimir
that referenced
this pull request
Nov 18, 2025
<!-- Thanks for sending a pull request! Before submitting: 1. Read our CONTRIBUTING.md guide 2. Rebase your PR if it gets out of sync with main --> #### What this PR does Upgrades BoomFilters to apply this fix: * tylertreat/BoomFilters#43 #### Which issue(s) this PR fixes or relates to — #### Checklist - [ ] Tests updated. - [ ] Documentation added. - [ ] `CHANGELOG.md` updated - the order of entries should be `[CHANGE]`, `[FEATURE]`, `[ENHANCEMENT]`, `[BUGFIX]`. If changelog entry is not needed, please add the `changelog-not-needed` label to the PR. - [ ] [`about-versioning.md`](https://github.com/grafana/mimir/blob/main/docs/sources/mimir/configure/about-versioning.md) updated with experimental features. <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Bumps `github.com/tylertreat/BoomFilters` and updates vendored code to use internal FNV helper hashing across filters, removing legacy FNV usages and minor vendor files. > > - **Dependencies**: > - Upgrade `github.com/tylertreat/BoomFilters` to `v0.0.0-20251117164519-53813c36cc1b` in `go.mod`/`go.sum` and `vendor/modules.txt`. > - **Vendored library (`vendor/github.com/tylertreat/BoomFilters`)**: > - Add `fnv.go` with internal FNV helper functions and refactor hashing to use these in `boom.go`, `cuckoo.go`, `hyperloglog.go`, `inverse.go`, `partitioned.go`, `stable.go`, `countmin.go`, `classic.go`, `counting.go`. > - Adjust constructors/decoders to stop defaulting to `fnv` in several filters and use new helpers; switch `InverseBloomFilter` indexing to optional pooled hash or default helper. > - Remove `.travis.yml`; simplify `README.md` badges. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 33bc6a1. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY -->
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
We're seeing data race conditions when calling
CountMinSketch.Count:The race is on the
*sum64receiver of thehash/fnv.Writemethod, and happens because the library stores the*sum64, wrapped in ahash.Hash64, instead of callingfnv.New64(andfnv.New32) afresh every time it needs to hash something. Concurrent hash operations will thus read-write from the same*sum64. This isn't thread-safe: hash operations will step on each other.To avoid this, initially I changed the code to call
fnv.New64for each hash operation. This, however, has a noticeable impact on performance, as a heap allocation needs to happen each time.But really, this allocation doesn't need to exist to begin with:
sum64is just auint64, and it can be stack-allocated and passed around as one would with any word-sized value. It's really an unfortunate consequence of the design ofhashthat this isn't available out of the box, and we're forced to go through interfaces just to move uint64s around.So instead, the (small) bits of code needed to implement FNV have been copied into plain, interface-less functions, and called by default, unless a custom
hash.Hash32/64is provided.Benchmarks:
With direct calls (pull request)
With indirect calls through new hasher per operation (discarded option)