Seeding random number generators
In our discussion of random number generators so far, we've seen the generator's
period as a major limiting factor in "how much randomness" it can generate.
But an equally important, and sometimes more important, issue is
how we seed, or initialise, a random number generator.
Let's take, for example, Java's standard generator java.util.Random,
which has a period of 2^48. We can imagine this as a huge wheel with 2^48
randomly-numbered notches. Whenever we create an instance of java.util.Random,
the wheel starts in a "random" place, and moves round by one notch every time we generate a
number from that instance. Wherever we start from, we'll end up back at the same place after
generating 2^48 numbers. If the place where we start is "truly random", then
a sequence length of 2^48 may be sufficient for our application.
But how do we pick a random place on the wheel to start from?
The random place that we start from is in effect the seed or
initial state of the generator. For this, we ideally want to pick a number (or some
sequence of bits) that is "truly unpredictable". Or put another way, we want to
find some source of entropy (or "true unpredictability") available
to the program.
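To make the "starting place on the wheel" idea concrete, here is a minimal sketch showing that the seed alone decides which sequence we get. The class name SeedDemo and the helper firstValues are made up for this illustration, and the seed 12345 is an arbitrary example value.

```java
import java.util.Arrays;
import java.util.Random;

class SeedDemo {
    // Returns the first n ints produced by a java.util.Random
    // that was created with the given seed.
    static int[] firstValues(long seed, int n) {
        Random r = new Random(seed);
        int[] out = new int[n];
        for (int i = 0; i < n; i++) {
            out[i] = r.nextInt();
        }
        return out;
    }

    public static void main(String[] args) {
        long seed = 12345L; // arbitrary example seed (illustrative only)

        // Same seed: both lines print the same five numbers.
        System.out.println(Arrays.toString(firstValues(seed, 5)));
        System.out.println(Arrays.toString(firstValues(seed, 5)));

        // Different seed: a different starting place, so different numbers.
        System.out.println(Arrays.toString(firstValues(seed + 1, 5)));
    }
}
```

In other words, anybody who knows or can guess the seed can reproduce every number that follows, which is why the choice of seed matters so much.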
System clocks and System.nanoTime()
The traditional solution used by java.util.Random
is to use one of the system clocks that the computer makes available. In recent
versions of the JVM, the measure taken is that reported by System.nanoTime(),
which typically gives the number of nanoseconds since the computer was switched on
(strictly speaking, since some arbitrary fixed point in time).
(In older versions of the JVM, or possibly as a fallback position on some platforms,
System.currentTimeMillis() is used, which reports the current "wall clock" time
in milliseconds since 1 Jan 1970.)
So how good is System.nanoTime()? Well, on the face of it, it's a 64-bit
value, so ample for generating starting points for all 2^48 possible sequences.
But now consider that there are "only" 10^9 nanoseconds every second. So in the first
five minutes that the computer is switched on, "only" 3x10^11 (or about
2^38) possible values could be returned by System.nanoTime().
In other words, in the first 5 minutes of the computer being on, only about one thousandth of
the possible series of java.util.Random will ever be generated. (Of course, if
Windows takes three of those five minutes to boot up, and/or if the system cannot actually report timings
with nanosecond granularity, that reduces the number of possibilities
even further...) For many casual applications, 2^38 possibilities or something in that order
is still plenty.
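The figures above can be checked with a few lines of Java. This is only a sketch of the arithmetic: the class name NanoTimeSeedSpace and its helper methods are invented for the example, and it assumes the ideal case where the clock really does tick once per nanosecond.

```java
class NanoTimeSeedSpace {
    static final long NANOS_PER_SECOND = 1_000_000_000L; // 10^9

    // Number of distinct nanosecond readings possible in the given number
    // of seconds, assuming true nanosecond granularity.
    static long distinctNanoValues(long seconds) {
        return seconds * NANOS_PER_SECOND;
    }

    // Base-2 logarithm, i.e. "how many bits" a count corresponds to.
    static double log2(double x) {
        return Math.log(x) / Math.log(2);
    }

    public static void main(String[] args) {
        long values = distinctNanoValues(5 * 60);   // first five minutes
        double bits = log2(values);                 // roughly 38 bits
        double fraction = values / Math.pow(2, 48); // share of the 2^48 states

        System.out.printf("%d values, about 2^%.1f, fraction of 2^48: %.5f%n",
                values, bits, fraction);
    }
}
```

The result comes out at a little over 38 bits, and the fraction at roughly one thousandth, in line with the estimate in the text.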
But we certainly wouldn't want to use System.nanoTime() to seed, say, a
random encryption key, or a "serious" game or gambling application where a user's ability to
guess the random sequence could result in financial loss.
Of course, for such
applications, we also shouldn't be using java.util.Random. But the problem of
seed selection still applies. Even if we have a high-quality random number generator
with a period of, say, 2^160, that period doesn't really buy us "extra randomness"
if we are only able to generate, say, 2^38 distinct seeds and/or an adversary can make
further predictions about the seed.
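As a rough illustration of why a guessable seed is a problem, the sketch below plays the part of an adversary who sees a few outputs and knows the seed was a clock reading within a small window. It is a toy under stated assumptions, not a description of any real attack: the class SeedSearch, the methods firstValues and recoverSeed, and the one-millisecond window are all invented for the example, and it assumes the adversary sees raw nextInt() values from a java.util.Random seeded directly with the clock value.

```java
import java.util.Arrays;
import java.util.OptionalLong;
import java.util.Random;

class SeedSearch {
    // First n ints from a java.util.Random created with the given seed.
    static int[] firstValues(long seed, int n) {
        Random r = new Random(seed);
        int[] out = new int[n];
        for (int i = 0; i < n; i++) {
            out[i] = r.nextInt();
        }
        return out;
    }

    // Tries every candidate seed in [lo, hi] and returns the first one that
    // reproduces the observed outputs, or an empty result if none does.
    static OptionalLong recoverSeed(int[] observed, long lo, long hi) {
        for (long candidate = lo; candidate <= hi; candidate++) {
            if (Arrays.equals(firstValues(candidate, observed.length), observed)) {
                return OptionalLong.of(candidate);
            }
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) {
        long secret = System.nanoTime();         // stands in for a clock-based seed
        int[] observed = firstValues(secret, 3); // what the adversary gets to see
        long window = 1_000_000L;                // assumed: time known to within 1 ms

        OptionalLong found = recoverSeed(observed, secret - window, secret + window);
        System.out.println("Seed recovered: "
                + (found.isPresent() && found.getAsLong() == secret));
    }
}
```

The search covers only a couple of million candidates, which takes a moment on an ordinary machine. The generator's period plays no part in the search: what limits the adversary is the number of seeds that are plausible, not the length of the sequence.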
Looking for other sources of entropy
To get more randomness in the choice of the generator's "initial position", we need
to look for more sources of entropy on the local machine. On the next
page, we discuss approaches to finding entropy for
seeding a random number generator.
Editorial page content written by Neil Coffey. Copyright © Javamex UK 2021. All rights reserved.