SplittableRandom
SplittableRandom is a special random number generation algorithm that allows multiple instances
with a short period to work togther to give the statistical properties of a single generator with a larger period.
It is designed in particular for multithreaded algorithms that consume a huge number of random numbers. Each thread
need only update a single 64-bit word of state per 64 random bits generated. Normally, a random number generator with a longer period
would need to update more words of state, with an associated performance cost, and coordinating such a generator between multiple
threads would involve more complex synchronization.
Background
There are various applications such as simulations or Monte Carlo tree search algorithms that need to generaate
a large number of random numbers on each trial. They may also be run for a large number of trials, so that overall, a huge number of random
numbers need to be generated: so huge, that the 264 period of a generator such as ThreadLocalRandom
becomes an issue. At the same time, we would also like to parallelise such algorithms to exploit today's multicore/multinode architecture.
Normally, we could turn to a random number generator with a longer period. For example, the Numerical Recipes authors suggest
a combined generator with a period of ~2191. But a generator with a longer period implies
that more state must be updated each time a number is generated. If multiple threads share the same instance, this must be
updated in a thread-safe manner. And even if each thread has its own instance, we would still prefer random number generation to be as lightweight
as possible in an application that per se is going to consume a huge number of random numbers.
At the same time, the type of algorithm we are
talking about is generally a "divide and conquer" algorithm that splits itself naturally into mutliple subtasks. Any individual subtask,
such as running a simulation on one single node in a search tree, only needs to consume a relatively small number of random numbers:
small enough that a 264 period would suffice for that specific node. But if we literally took multiple instances of the
same generator, they you ultimately be consumer random numbers from different sections of the same underlying sequence. Given enough
instances and/or random numbers generated, there would be a change that some of the sequences could actually coincide— a phenomenon
sometimes referred to as seed collision.
What we would ideally like is a way for each individual node
to be able to use a generator with a 264 period (and hence, which only involved updating 64 bits of state for each 64-bit
number generated). But at the same time, we would like those generators to have a combined effect no different to having a single
generator with a larger period, so that the different sequences used by different subtasks do not collide.
This is where SplittableRandom comes in. It allows us to take a "base" instance, kick off a divide-and-conquer algorithm,
and have that base instance "split" hierarchically into child instances whenever new subtasks are spawned.
Each child instance has just 64 bits of state to update on each generation
and can generate nubmers more efficiently. But each is consumer numbers from its own effectively unique sequence, with
a combined effect that is not1 measurably different from having
a single random number generator with a period closer to 2127.
SplittableRandom: how it works
To understand how SplittableRandom works, we must first consider how ThreadLocalRandom works.
ThreadLocalRandom takes a seed sequence and applies a hash to each successive seed to generate a sequence of random
numbers; to form the sequence, the seed is incremented by a fixed gamma value on each iteration. By default, an "ideal" gamme
is chosen that is an irrational number, i.e. one with no particular pattern to its bits.
SplittableRandom works by assigning a different gamma to each "split". The gamma is essentially random (but for a given split, it is
fixed once assigned). We cannot therefore enforce our ideal condition of having a gamma with a "pattern free" bit pattern. Instead, the
SplittableRandom algorithm— dubbed SplitMix— introduces a compromise. Whenever an instance is split, a new
random gamma is allocated, but a simple bitwise function is applied that attempts to iron out "worst case" instances of gamme values consisting
of a large number of repeated 1s. (A large number of repeating 0s is tolerated, on the other hand.)
1. Just how closely this goal is achieved is a matter of ongoing research.
If you enjoy this Java programming article, please share with friends and colleagues. Follow the author on Twitter for the latest news and rants.
Editorial page content written by Neil Coffey. Copyright © Javamex UK 2021. All rights reserved.