Skip to content

Conversation

@mstoeckl
Copy link
Contributor

@mstoeckl mstoeckl commented Dec 11, 2025

  • Added a CHANGELOG.md entry

Summary

Gamma::sample could return NaN due to multiplying 0.0 with something that had overflowed; this PR makes the sampler return 0.0 in such cases instead. This changes the order of some multiplications, which can slightly change the output of the sampler on typical inputs.

Motivation

This is another edge case that I found found with a parameter fuzzer (https://github.com/mstoeckl/rand_distr/commits/fuzz-params/), that turns out to be avoidable with only slight additional complexity.

Details

The specific NaN that I noticed came from multiplying u.powf(self.inv_shape), which for small inv_shape could be zero, with the output of GammaLargeShape::sample, which could overflow to +inf if scale was large. The general method to avoid this is to delay applying the distribution scale factor until the end of the calculation, and before that point try to operate on values which are close in magnitude to 1.

I also added a special case to Gamma::new to handle shape, scale = +inf in particular.

The parameter combinations which would previously produce Nan (shape very close to zero, scale very close to the max float value) will continue to produce somewhat inaccurate output (producing more zero output values than expected, because u.powf(self.inv_shape) is likely to underflow when shape is small) Fixing this particular edge case would likely require rewriting the GammaSmallShape sampler to use a different and possibly more expensive sampling method.

@mstoeckl mstoeckl force-pushed the gamma-avoid-nan branch 2 times, most recently from 82ec7a8 to 4aca53d Compare December 11, 2025 21:48
@benjamin-lieser
Copy link
Member

I think our general approach to deal with extreme parameters should be to just return an Error from new if we cannot accurately sample from it. These extreme parameters will often be the result of bugs on the userside anyway.
If these parameters are needed and it is feasible to sample from them we can add this later.

This changes the order of multiplications used to compute the result
to avoid multiplying zero with an expression that can overflow to +inf.

Note that the parameter combinations which could lead to this (shape
very close to zero, scale very close to the max float value) continue to
be inaccurately handled; the Gamma distribution sampler will now tend to
return zero instead of NaN for them. The limit (shape 0, scale inf) is
not well defined.
@mstoeckl
Copy link
Contributor Author

mstoeckl commented Dec 16, 2025

I think our general approach to deal with extreme parameters should be to just return an Error from new if we cannot accurately sample from it.

I'm not certain what a good threshold / measurement of accuracy would be for Gamma in particular, and for now have added a generic warning about inaccuracy to the documentation.

  • The issue depends on the precise floating point type -- underflow in u.powf(self.inv_shape) starts occurring as soon as shape < 1/7 or so for f32, while for f64 the threshold is, if my math is right, around 1/20. A majority of values underflowing to zero requires about 1/150 for f32, and 1/1000 for f64. The Kolmogorov-Smirnov test in distr_test , using f64, starts failing with (shape = 1/200, scale=1.0), around 3% underflow probability, although I am not sure why exactly.
  • What is "accurate" for floating point depends on what you use the distribution for -- some uses may work fine if on extreme inputs the mean and stddev are within additive epsilon, others if multiplicative, others may depend on tail behavior, or just that the value is nonnegative

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants