Introduction to Random Numbers

Introduction to the Random Numbers

By Herbert J. Bernstein
© 2003 Herbert J. Bernstein

What are Random Numbers

When a process results in events each of which has no connection with the others, we call that a random process. That does not mean that the process need have no structure. If there is a number associated with each event, even though there is no apparent connection between any two numbers, the overall distribution may fit some pattern. When we take more and more numbers, they might form a bell-shaped curve (a Gaussian distribution) or fill some interval fairly evenly (a uniform distribution) or have some other shape. We call the average expected value of the distribution the mean and we measure how widely the values are scattered by the variance (the expected value of the squares of the deviations from the mean). The square root of the variance is called the standard deviation.

When dealing with random processes, we usually have to estimate its properties experimentally. We may not know the actual mean value (μ) of the distribution, but we can estimate it as the average value of some set of sample values drawn from the distribution to get an estimated mean. Similarly we can compute an estimated standard deviation to use in place of the actual standard deviation (σ):

Estimated Mean m = (Σx_i)/N
Estimated Standard Deviation s = ((Σ(x_i-M)²)/N)^.5

There are many tests we can apply to see of the numbers we obtain from a process are truly random. For more information see D. E. Knuth, "The Art of Computer Programming, Volume 2/ Seminumerical Algorithms," Addison Wesley, Reading, MA, 1969, and "Random Number Testing and Generation" at http://csrc.nist.gov/rng/

Why are Random Numbers Needed in Computer Programs

Sampling
Simulation
Testing
Solving problems to within a given probability of correctness
- Monte Carlo methods (see Random Number Generators Copyright (C) 1991, 1992, 1993, 1994, 1995 by the Computational Science Education Project )
- Prime testing (see Hal Abelson, Jerry Sussman, and Julie Sussman "Structure and Interpretation of Computer Programs" (MIT Press, 1984; ISBN 0-262-01077-1))
BUT truly random numbers mean irreproducible results

Pseudorandom Numbers

Number generated by a deterministic process that seems random
Often from linear congruence (D. H. Lehmer 1948, Proc. 2nd Symposium on Large Scale Digital Computing Machinery, Harvard U. Press, Cambridge 1951, pp 141-146)
X_n+1 = (A*X_n +B) mod M
Good generators have a very long period
Can show strong correlations (non-randomness)
Most programming languages have a built-in generator (e.g. java Math.random)

Timing Software

Timing can be exhaustive search or random sampling
Random test cases usually disclose anomalies, but not always (e.g. quicksort).
Timing of random case generation needs to be timed
Java does not have good timing hooks

Prepared by Herbert J. Bernstein yaya@bernstein-plus-sons.com, 6 November 2003