XORShift random number generators

The aim of strong encryption or cryptographic hash functions is to take some input and turn it into an essentially random-looking output, with no discernible correlation between the input and resulting output. Indeed, cryptographically-strong random number generators (such as the Java SecureRandom class) build on this technique to produce good-quality random numbers1. The essential reason that such generators aren't always used is performance: generating random numbers using a cryptographic function is much slower than a simple technique such as the java.lang.Random linear congruential generator.

Encryption and hashing algorithms actually use very simple bitwise operations, such as XOR, addition, bit shifts and rotations. It's just they perform a very large number of such operations each time2. In many cases, we can use far fewer operations, but still achieve an acceptable degree of randomness. The problem is finding some small combination of bitwise operations that we can show satisfies our "randomness" requirements, such as having a given period and generating numbers that pass certain statistical tests.

Luckily, a rather elegant solution was found in 2003 by George Marsaglia in the form of XORShift Random Number Generators. Marsaglia showed that a generator involving three shifts and three XOR operations generates a "full" period (see below) and that the resulting values satisfy various statistical tests of randomness (including ones that poorer generators such as LCGs fail). Starting with some non-zero seed value of x— e.g. from System.nanoTime()— the algorithm looks as follows:

public long randomLong() {
  x ^= (x << 21);
  x ^= (x >>> 35);
  x ^= (x << 4);
  return x;

The method produces what we might describe as "medium quality" random numbers: certainly better than the LCG algorithm used by java.util.Random. (In particular, any bits of the resulting numbers can be assumed to be equally random. Contrast this with the randomness of LCG generators such as java.lang.Random, where how random a bit is depends on its position.) And what is powerful is that it does so using low-cost operations: shifts and XORs, with only a single word of state (x in the code above). In the light of this method, it's probably fair to say that LCGs used in isolation are a little bit of a waste of space. If the XORShift method had been discovered when java.lang.Random was first developed, the Java library authors may well have chosen it over LCG3.

The "magic" values of 21, 35 and 4 have been found to produce good results. With these values, the generator has a full period of 264-1, and the resulting values pass Marsaglia's "Diehard battery" of statistical tests for randomness4. L'Ecuyer & Simard (2007)5 also found that values of 13, 7 and 17 fail on only 7 of the 160 tests of their BigCrush battery (for comparison, java.util.Random fails on 21, while crypographic generators such as Java's SecureRandom— as indeed we might expect— generally pass all tests).

The technique can also be used with smaller seeds. In general, for an n-bit seed, we can find appropriate shift values to produce the full sequence with period 2n-1. In particular, this means that we can use the technique with 32-bit integers, and Marsaglia lists appropriate shift values. However, L'Ecuyer & Simard found that the 32-bit version gives poor results in their statistical tests. It's also probably not worth using a generator with such a small period unless either (a) performance is really tight, (b) you deliberately want a short period for some reason, or (c) you are combining the generator.

(The XORShift generator can be ported to a number of languages, and another motivation for using the 32-bit variant might be if you don't have 64-bit unsigned arithmetic conveniently available. This on-line solitaire game uses an XORShift generator (actually a combination of 2 differently seeded generators with different shift constants) in Javascript.

Subclassing Random with an XORShift generator

One saving grace of the java.lang.Random class is that you can easily subclass it and substitute your own source of random bits. Although the XORShift algorithm is simple enough that you could "inline" it manually in many cases, we can also create a subclass of Random that uses a XORShift generator.

Problems and advantages with XORShift generators

The XORShift algorithm is a vast improvement on LCG generators. However, it does still have at least a couple of weaknesses:

But we've said from the outset that XORShift is a technique for generating medium-quality fast and simply (it takes a few instructions and its only state is the seed, meaning, for example, that it's simple to inline). Sometimes that's the tradeoff that you want, and the numbers generated still have some good properties:

Needless to say, XORShift on its own is not cryptographically secure: it is relatively trivial for somebody observing a couple of numbers generated to deduce (at worst by trial and error) the shift values used, and hence predict future numbers.

1. As a very simplistic example, imagine that we take successive integers 0, 1, 2 etc, and compute a 160-bit cryptographic hash of those integers. We would expect the resulting sequence of hashes to look essentially random— otherwise, our hash function would have a security flaw!
2. In fact, they generally also use substitution: taking some partial value and using it as an index into an array of "random-looking" values in order to add additional unpredictability.
3. As a matter of interest, an LCG generator has found its way into the JDK libraries: Doug Lea's implementation of ConcurrentSkipListMap (introduced in Java 6) uses one.
4. In fact, various other combinations of shift amounts also pass these tests, and the shifts can even be reversed. The values used here are ones that the Numerical Recipes authors suggest "may be slightly superior" to other recommended values (3rd ed, p. 346).
5. L'Ecuyer, P. & Simard, R. (2007) TestU01: A C Library for Empirical Testing of Random Number Generators, ACM Transactions in Mathematical Software 33(4).