How does java.util.Random work and how good is it?
The java.util.Random class was once the standard Java random nubmer
generator. It still provides the blueprint for various different random number generation methods such as nextInt()
, nextDouble()
.
But in terms of the algorithm it uses, it has now been largely surpassed by ThreadLocalRandom for reasons that
we will explore below and on the pages referred to below.
The algorithm: a 48-bit LCG generator
The java.util.Random class implements what is generally called a
linear congruential generator (LCG). An LCG is essentially a
formula of the following form:
numberi+1 = (a * numberi + c) mod m
In other words, we begin with some start or "seed" number which ideally is "genuinely
unpredictable", and which in practice is "unpredictable enough".
For example, the number of milliseconds— or even nanoseconds—
since the computer was switched on is available on most systems. Then, each time we want a random
number, we multiply the current seed by some fixed number, a, add another fixed number,
c, then take the result modulo another fixed number, m. The number a is
generally large. This method of random number
generation goes back pretty much to the dawn of computing1. Pretty much every
"casual" random number generator you can think of—
from those of scientific calculators to 1980s home computers to currentday C and Visual Basic
library functions— uses some variant of the above formula to generate its random numbers.
Depending on the values chosen for a, c and and m, the quality of
random numbers produced by this method varies between "unbelievably disastrous" and
"OK for casual applications". For practical reasons, it is generally common to do one of the following:
- make a
close to a power of 2 (so that the multiplication can be performed by shifting and adding/subtracting2),
or;
- make m a power of 2, often the register size of the machine (such as 232 for
a 32-bit machine) so that the modulo is carried out either "for free", or at worst via an AND
operation rather than an expensive division3.
With or without these constraints, values for the parameters are then generally sought so that:
- every possibly value between 0 and m-1 inclusive is generated before the pattern
repeats itself;
- the numbers generated are as "statistically random" as we can get them (see below).
Since for a given "current seed" value, the "next seed" will always be completely
predictable based on that value, the series of numbers must repeat after at most m
generations. This is called the period of the random number generator. In
the case of java.util.Random, m is 248 and the other values
have indeed been chosen so that the generator has its maximum period. Therefore:
LCG parameters used by java.util.Random
The actual parameters used by java.util.Random are
essentially taken from the UNIX rand48 generator
(though with a slightly different seeding function). For
reasons discussed later, only the top 32 bits of each 48 bits
generated are used.
With these parameters, the resulting random number generator appears to
be about as "good as it gets" for an LCG.
The period of the java.util.Random generator is 248.
In decimal, 248 is a few hundred million million. That might sound like enough—
and it is for certain applications— but it does mean some quite severe limitations in
other cases. For example, consider an application where you pull out a number of 2-integer
pairs (and where you use the full range of the integer). One integer has 232
possible values. So the number of possible combinations of 2-integer pairs is 232 * 232,
or 264. In other words, java.util.Random will not be able to
produce every possible combination. Of course, even a generator that produced "perfect" random
numbers with a 248 period would have this limitation.
For some testing or scientific applications, that would be bad enough. But it turns out that
with LCGs, things are actually worse:
- When taking combinations of values, e.g. for coordinates, the resulting
pairs, triples etc always have a particular mathematical relationship, sometimes
described as "falling in the planes".
- Not all bits are produced with equal randomness: the lower the bit
in the number generated, the less "random" it actually is. This can introduce artefects
in some cases. See the section
on the randomness of bits with LCGs for an illustration
and guidelines on minimising this problem.
Improving on java.util.Random
Because of the limitations of the java.util.Random algorithm, various random number generators
have been added to the Java platform. The standard random number generators included in the Java platform
include:
- ThreadLocalRandom, a faster, higher quality random number generator that
is often the default choice nowadays;
- SplittableRandom, which allows multiple instances to work together to achieve the
effect of a single generator with a larger period;
- SecureRandom, a cryptographically secure random nubmer generator.
The
Random class is also designed to be pluggable: in other words, is also possible to
subclass
java.util.Random with a desired implementation and gain the benefits of the various
Random methos
such as
nextDouble() etc without having to implement these from scratch unless there is a reason to do so.
What to read next
The following pages contain more information about java.util.Random and other random nubmer generators in Java:
1. It is generally attributed to Dick Lehmer, who appears to have intoduced it
formally in a 1948 conference paper.
2. For example, 65539 is 216+3. So 65539x can be calculated as x<<16+x+x+x. Shift and addition/subtraction instructions are generally much faster operations than mutliplication.
3. For example, to calculate a number modulo 248— the modulus chosen in
java.util.Random, we AND it with (248-1).
If you enjoy this Java programming article, please share with friends and colleagues. Follow the author on Twitter for the latest news and rants.
Editorial page content written by Neil Coffey. Copyright © Javamex UK 2021. All rights reserved.