Skip to content

Improve bucketing of native histograms with positive schema #1916

@wyfo

Description

@wyfo

Following #1915, I've also started thinking about implementing native histograms in the Rust client. Looking at the current Go implementation:

var (
key int
schema = atomic.LoadInt32(&hc.nativeHistogramSchema)
zeroThreshold = math.Float64frombits(atomic.LoadUint64(&hc.nativeHistogramZeroThresholdBits))
bucketCreated, isInf bool
)
if math.IsInf(v, 0) {
// Pretend v is MaxFloat64 but later increment key by one.
if math.IsInf(v, +1) {
v = math.MaxFloat64
} else {
v = -math.MaxFloat64
}
isInf = true
}
frac, exp := math.Frexp(math.Abs(v))
if schema > 0 {
bounds := nativeHistogramBounds[schema]
key = sort.SearchFloat64s(bounds, frac) + (exp-1)*len(bounds)
} else {
key = exp
if frac == 0.5 {
key--
}
offset := (1 << -schema) - 1
key = (key + offset) >> -schema
}
if isInf {
key++
}
switch {
case v > zeroThreshold:
bucketCreated = addToBucket(&hc.nativeHistogramBucketsPositive, key, 1)
case v < -zeroThreshold:
bucketCreated = addToBucket(&hc.nativeHistogramBucketsNegative, key, 1)
default:
atomic.AddUint64(&hc.nativeHistogramZeroBucket, 1)
}

I've spotted several things to improve:

  • Inf bucket can be computed directly, avoiding an extra if isInf
  • bucket for positive schema can be computed with:
    _, exp2 := math.Frexp(math.Pow(frac, float64(int(1)<<schema))); exp2 + (exp << schema)
  • if frac == 0.5 can be avoided by using math.Frexp(math.Nextafter(math.Abs(v), math.Inf(-1))) (zero threshold must be checked first)
  • a single math.Abs(v) > zeroThreshold can replace the two tests (since math.Abs(v) is already computed); then nativeHistogramBucketsPositive, nativeHistogramBucketsNegative could become [2]sync.Map, indexed by math.Signbit(v)
  • math.Frexp, as well as math.Nextafter can be replaced by a simplified version that skips special cases (NaN, Inf, etc.), since they’re already filtered out

The most important optimization is bucket computation — avoiding binary search, which gets slow as arrays grow (especially schema 8).
I’ve done experiments in Rust: https://github.com/wyfo/native-histogram (benchmark results can be found here)

I will try to submit a PR — hopefully next week.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions