Seeding random number generators

In our discussion of random number generators so far, we've seen the generator's period as a major limiting factor in "how much randomness" it can generate. But an equally important— and sometimes more important— issue is how we seed or initialise a random number generator.

Let's take, for example, Java's standard generator java.util.Random, which has a period of 248. We can imagine this as a huge wheel with 248 randomly-numbered notches. Whenever we create an instance of java.util.Random, the wheel starts in a "random" place, and moves round by one notch every time we generate a number from that instance. Wherever we start from, we'll end up back at the same place after generating 248 numbers. If the place where we start is "truly random", then a sequence length of 248 may be sufficient for our application. But how do we pick a random place on the wheel to start from?

The random place that we start from is in effect the seed or initial state of the generator. For this, we ideally want to pick a number (or some sequence of bits) that is "truly unpredictable". Or put another way, we want to find some source of entropy (or "true unpredictability") available to the program.

System clocks and System.nanoTime()

The traditional solution used by java.lang.Random is to use one of the system clocks that the computer makes available. In recent versions of the JVM, the measure taken is that reported by System.nanoTime(), which typically gives the number of nanoseconds since the computer was switched on. (In older versions of the JVM, or possibly as a fallback position on some platforms, System.currentTimeMillis() is used, which reports the current "wall clock" time in milliseconds since 1 Jan 1970.)

So how good is System.nanoTime()? Well, on the face of it, it's a 64-bit value, so ample for generating starting points for all 248 possible sequences. But now consider that there are "only" 109 nanoseconds every second. So in the first five minutes that the computer is switched on, "only" 3x1011— or about 238— possible values could be returned by System.nanoTime(). In other words, in the first 5 minutes of the computer being on, only about one thousandth of the possible series of java.lang.Random will ever be generated. (Of course, if Windows takes three of those five minutes to boot up, and/or if the system cannot actually report timings with nanosecond granularity, that reduces the number of possibilities even further...) For many casual applications, 238 possibilities or something in that order is still plenty. But we certainly wouldn't want to use System.nanoTime() to seed, say, a random encryption key, or a "serious" game or gambling application where a user's ability to guess the random sequence could result in financial loss.

Of course, for such applications, we also shouldn't be using java.lang.Random. But the problem of seed selection still applies. Even if we have a high-quality random number generator with a period of, say, 2160, that period doesn't really buy us "extra randomness" if we are only able to generate, say, 238 distinct seeds and/or an adversary can make further predictions about the seed.

Looking for other sources of entropy

To get more randomness in the choice of the generator's "initial position", we need to look for more sources of entropy on the local machine. On the next page, we discuss approaches to finding entropy for seeding a random number generator.

If you enjoy this Java programming article, please share with friends and colleagues. Follow the author on Twitter for the latest news and rants.

Editorial page content written by Neil Coffey. Copyright © Javamex UK 2021. All rights reserved.