Conversation

penelopeysm
Member

Fixes #108

The problem was in this function split:

AdvancedPS.jl/src/rng.jl

Lines 33 to 41 in 379336b

"""
split(key::Integer, n::Integer=1)
Split `key` into `n` new keys
"""
function split(key::Integer, n::Integer=1)
T = typeof(key) # Make sure the type of `key` is consistent on W32 and W64 systems.
return T[hash(key, i) for i in UInt(1):UInt(n)]
end

Here, UInt(1) gives UInt64 on 64-bit systems and UInt32 on 32-bit systems. Thus, the result of hash differed between architectures, even though we converted back to the same type at the end.

The 'simple' solution of replacing UInt with UInt64 doesn't work, because on 32-bit systems, hash(::UInt64, ::UInt64) is not defined.

This PR therefore uses a pseudo-random number generator to deterministically generate new values of type T from the original value.
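In outline, the new approach looks something like the sketch below. This is illustrative only, not the exact code merged into rng.jl; in particular, the choice of Xoshiro as the generator here is just a stand-in:

# Sketch: seed a PRNG from the key in a way that is identical on 32-bit and
# 64-bit systems, then draw `n` values of the key's own type from it.
using Random

function split(key::Integer, n::Integer=1)
    T = typeof(key)
    rng = Xoshiro(key % UInt64)  # 64-bit seed regardless of the platform word size
    return T[rand(rng, T) for _ in 1:n]
end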

@penelopeysm penelopeysm force-pushed the py/random branch 2 times, most recently from 0de34e8 to 8dc147c Compare January 21, 2025 16:08
@coveralls

coveralls commented Jan 21, 2025

Pull Request Test Coverage Report for Build 12891599012

Details

  • 3 of 3 (100.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.5%) to 96.098%

Totals (Coverage Status)
  • Change from base Build 9928521956: 0.5%
  • Covered Lines: 394
  • Relevant Lines: 410

💛 - Coveralls

- N_SAMPLES = 50
+ N_SAMPLES = 200
Member Author

I had to bump this to get the test to pass.

@penelopeysm
Member Author

It's not possible to add a test that checks the result is the same across architectures (we could compare against a fixed value, but that would be very brittle). However, this run at least demonstrates that sampling a Turing model with PG() now gives the same results across architectures: https://github.com/penelopeysm/Shaymin.jl/actions/runs/12891706831
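For illustration, the within-machine determinism that the CI run relies on looks roughly like this (a sketch with a made-up toy model and seed, not the actual test):

using Turing, Random

@model function demo()
    p ~ Beta(1, 1)
    x ~ Bernoulli(p)
end

# With the same seed, two PG runs produce identical draws on one machine;
# the linked CI run additionally compares the results between x64 and x86.
chain1 = sample(Xoshiro(468), demo(), PG(10), 100)
chain2 = sample(Xoshiro(468), demo(), PG(10), 100)
@assert chain1[:p] == chain2[:p]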

@penelopeysm penelopeysm requested review from yebai and mhauru January 21, 2025 16:49
Member

@mhauru mhauru left a comment


Great work figuring this out! This has been bugging me, too, for a while now.

@yebai
Member

yebai commented Jan 22, 2025

@FredericWantiez implemented this splittable pseudo-random number generator (PRNG). The splitting step requires some attention, since not all splitting schemes are valid. See footnote 1.

There is a package for splittable PRNGs, which we could depend on: https://github.com/Julia-Tempering/SplittableRandoms.jl

See also: https://github.com/janestreet/splittable_random

Footnotes

  1. Steele, G. L., Lea, D., & Flood, C. H. (2014). Fast splittable pseudorandom number generators. Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications, 453–472.

@penelopeysm
Member Author

penelopeysm commented Jan 22, 2025

Thank you, @yebai, I was reading this over lunch :)

For context, the RNG stream being used right now is Philox which is described here: https://www.thesalmons.org/john/random123/papers/random123sc11.pdf

I'm not really certain that our original implementation of split was correct either, though - is it? It seems to me that, unlike e.g. NumPy (https://numpy.org/doc/2.1/reference/random/bit_generators/philox.html), the Random123.jl package doesn't provide the functionality needed to split a Philox RNG, and our split function looks quite ad hoc, since its algorithm isn't mentioned in the paper.

So it seems to me that this PR is at least no worse than before, and perhaps we could rework the RNG system in another PR? Let me know what you think.

@yebai
Member

yebai commented Jan 22, 2025

Does SplittableRandoms.jl provide enough functionality for our purpose? If so, it's okay to depend on it -- it is a very lightweight package without any dependencies other than the Julia standard library.
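For reference, the interface we would need looks roughly like this (based on the package README; treat the exact names as an assumption):

using SplittableRandoms

rng = SplittableRandom(1)                 # seed the root generator
child = SplittableRandoms.split(rng)      # derive an independent child generator
rand(rng), rand(child)                    # both act as ordinary AbstractRNGs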

@FredericWantiez
Member

Thanks @penelopeysm for the fix. I don't think there was an implementation of split at the time; probably best to switch to SplittableRandoms.jl now.

@penelopeysm
Member Author

I'm in favour of switching over, and I think it does have exactly the right functionality, but looking at AdvancedPS right now, that would require a fairly substantial reworking of the whole rng.jl file.

I'd prefer to merge this and open another issue for that (unless there are concerns over this PR itself, of course).

@FredericWantiez
Member

We could also tweak split here to be somewhat closer to the JAX implementation, for example, but that would probably abuse the Random123 interface.
https://github.com/jax-ml/jax/blob/main/jax/_src/prng.py#L1114

@yebai
Member

yebai commented Jan 23, 2025

@penelopeysm feel free to merge and open an issue for it.

@penelopeysm penelopeysm merged commit d0d180f into master Jan 23, 2025
18 checks passed
@penelopeysm penelopeysm deleted the py/random branch January 23, 2025 10:49
Successfully merging this pull request may close these issues.

Particle Gibbs reproducibly gives different results on x64 vs x86