Draft: Add distribution tests to look for specific error modes #47
CHANGELOG.md entry
Summary
This introduces a file in distr_test to add tests focused on detecting cases where a distribution sampler is biased on a particular event.
Motivation
Some of the distribution sampling issues that I have found are hard to discover with a Kolmogorov-Smirnov or similar general distribution test, because the issues only occur with large input parameters or low probabilities, and would require a lot of time and memory to detect with a general test.
This is marked as draft for now, as the changes are still a bit rough.
Details
The last-bit tests systematically confirm that the floating-point precision issue for Binomial indeed only occurs at n > 2^53, and that the Geometric sampler has no equivalent issue.
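As a rough illustration of why 2^53 is the threshold, the sketch below round-trips integers through `f64` and checks whether the lowest bit survives. It assumes the Binomial issue comes from counts passing through `f64` arithmetic, which this description does not spell out. The helper names are hypothetical and are not part of rand_distr or of the tests in this PR.

```rust
/// Round-trips `k` through f64, standing in for a sampler that does its
/// arithmetic in floating point (hypothetical helper, not rand_distr code).
fn roundtrip_f64(k: u64) -> u64 {
    (k as f64) as u64
}

/// Returns true if the lowest bit of `k` is unchanged by the f64 round trip.
/// f64 has a 53-bit significand, so every integer up to 2^53 is exact;
/// above that, odd integers are rounded to a neighbouring even value.
fn last_bit_survives(k: u64) -> bool {
    roundtrip_f64(k) & 1 == k & 1
}
```

Under that assumption, `last_bit_survives((1 << 53) - 1)` holds while `last_bit_survives((1 << 53) + 1)` does not, so a sampler limited by `f64` precision could only ever produce even results in that range, which is the kind of bias a last-bit test looks for.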
`test_binomial_endpoints` can be used to confirm that the change in endpoint rounding for the BTPE method in #43 is reasonable: currently, the probability of 0 or 20 being sampled is ~50% higher than expected on Binomial(20, 0.5), while with #43 the event does not occur at a rate clearly distinguishable from the ideal probability.

Since rand_distr reexports Bernoulli, it may as well have a distribution test for it.
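To put the Binomial(20, 0.5) endpoint figures in numbers, here is a minimal sketch of the arithmetic behind a single-event test. It uses a plain normal-approximation z-score, which is an assumption and not necessarily the statistic used in `test_binomial_endpoints`. All function names are hypothetical.

```rust
/// Probability that Binomial(n, 0.5) lands on either endpoint, 0 or n.
fn endpoint_probability_half(n: u32) -> f64 {
    2.0 * 0.5f64.powi(n as i32)
}

/// z-score of `observed` hits in `trials` draws against an event of
/// probability `p`, using the normal approximation to the hit count.
fn event_z_score(observed: u64, trials: u64, p: f64) -> f64 {
    let expected = trials as f64 * p;
    let std_dev = (trials as f64 * p * (1.0 - p)).sqrt();
    (observed as f64 - expected) / std_dev
}

/// Number of draws needed for a relative excess `rel_bias` in the event
/// rate (0.5 means 50% too frequent) to reach about `z` standard deviations.
/// Derived from z = rel_bias * sqrt(trials * p / (1 - p)).
fn trials_needed(p: f64, rel_bias: f64, z: f64) -> u64 {
    ((z / rel_bias).powi(2) * (1.0 - p) / p).ceil() as u64
}
```

The ideal endpoint probability is 2 · 0.5^20 = 2^-19, about 1.9 × 10^-6. Under the normal approximation, `trials_needed(endpoint_probability_half(20), 0.5, 5.0)` comes out at roughly 5 × 10^7 draws for a 50% excess to stand out at five standard deviations. Counting hits on one event keeps memory use constant, whereas a general distribution test would need far more work to resolve the same tail.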