Chapter 4. Random Number Generation and Encryption




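A linear congruential random number generator is parametrized with four integers, as follows:

    $\mu$       the modulus                    $\mu > 0$
    $\alpha$    the multiplier                 $0 \le \alpha < \mu$
    $\gamma$    the increment                  $0 \le \gamma < \mu$
    $x_0$       the starting value, or seed    $0 \le x_0 < \mu$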

If $x_n$ is the current value of the “random seed,” then a call to the random number generator first computes

(4.0.28)    $x_{n+1} = (\alpha x_n + \gamma) \bmod \mu$

as the seed for the next call, and then returns $x_{n+1}/\mu$ as an independent observation of a pseudo-random number which is uniformly distributed in (0, 1).

a mod b is the remainder in the integer division of a by b. For instance 13 mod 10 =

3, 16 mod 8 = 0, etc.
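The recursion (4.0.28) can be sketched in a few lines of R. This is only a minimal illustration: the function names lcg.next and lcg.draw and the toy parameters are assumptions made for this example, not part of any package, and the parameters are far too small for a usable generator.

```r
# One step of a linear congruential generator, following (4.0.28).
# Values are stored as R doubles, which represent integers exactly only
# up to 2^53, so alpha * x + gamma must stay below that bound.
lcg.next <- function(x, alpha, gamma, mu) (alpha * x + gamma) %% mu

# Draw n pseudo-random numbers, starting from the seed x0.
lcg.draw <- function(n, x0, alpha, gamma, mu) {
  u <- numeric(n)
  x <- x0
  for (i in seq_len(n)) {
    x <- lcg.next(x, alpha, gamma, mu)  # seed for the next call
    u[i] <- x / mu                      # returned observation
  }
  u
}

# Toy parameters, chosen only for illustration.
# The successive seeds are 8, 11, 10, 5, 12, each divided by 16.
lcg.draw(5, x0 = 1, alpha = 5, gamma = 3, mu = 16)
```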

The selection of α, γ, and µ is critical here. We need the following criteria:

• The random generator should have a full period, i.e., it should produce all numbers $0 \le x < \mu$ before repeating. (Once one number is repeated, the whole cycle is repeated.)

• The function should “appear random.”






• The function should be efficient to implement with 32-bit arithmetic.

If $\mu$ is prime and $\gamma = 0$, then for certain values of $\alpha$ the period is $\mu - 1$, with only the value 0 missing. For 32-bit arithmetic, a convenient value of $\mu$ is $2^{31} - 1$, which is a prime number. Of the more than 2 billion possible choices for $\alpha$, only a handful pass all 3 tests.

Problem 75. Convince yourself by some examples that the following holds for all $a$, $b$, and $\mu$:

(4.0.29)    $a \cdot b \bmod \mu = a \cdot (b \bmod \mu) \bmod \mu$
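One way to try out examples for (4.0.29) is a short R check. The helper names lhs and rhs and the ranges of the grid are assumptions made for this sketch. A numerical check over a finite grid is of course not a proof.

```r
# Both sides of (4.0.29), written with explicit parentheses.
lhs <- function(a, b, mu) (a * b) %% mu
rhs <- function(a, b, mu) (a * (b %% mu)) %% mu

# One case by hand: 7 * 13 = 91 and 91 mod 10 = 1;
# 13 mod 10 = 3, 7 * 3 = 21, and 21 mod 10 = 1.
lhs(7, 13, 10)
rhs(7, 13, 10)

# Exhaustive comparison over a small grid of a, b, and mu.
grid <- expand.grid(a = 0:30, b = 0:30, mu = 1:12)
all(lhs(grid$a, grid$b, grid$mu) == rhs(grid$a, grid$b, grid$mu))
```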



In view of Problem 75, the multiplicative congruential random generator is based on the following procedure: generate two sequences of integers $a_i$ and $b_i$ as follows: $a_i = x_1 \cdot \alpha^i$ and $b_i = i \cdot \mu$. $a_i$ is multiplicative and $b_i$ additive, and $\mu$ is a prime number which is not a factor of $\alpha$. In other words, $a_i$ and $b_i$ have very little to do with each other. Then for each $i$ find $a_i - b_{s(i)}$, where $b_{s(i)}$ is the largest $b$ which is smaller than or equal to $a_i$, and then form $\frac{a_i - b_{s(i)}}{\mu}$ to get a number between 0 and 1. This is a measure of the relationship between two processes which have very little to do with each other, and therefore we should not be surprised if this interaction turns out to look “random.” Knuth writes [Knu81, p. 10]: “taking the remainder mod $\mu$ is somewhat like determining where a ball will land in a spinning roulette wheel.” Of course, this is a heuristic argument. There is a lot of mathematical theory behind the fact that linear congruential random number generators are good generators.
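The construction with the two sequences $a_i$ and $b_i$ can be compared numerically with the step-by-step recursion. The sketch below uses toy values for $\mu$, $\alpha$, and $x_1$ that are assumptions chosen so that all powers stay small enough for exact arithmetic in R.

```r
# Toy parameters (assumptions): a small prime modulus mu, a multiplier
# alpha that is not a multiple of mu, and a starting value x1.
mu <- 31; alpha <- 3; x1 <- 5
i <- 0:10

a <- x1 * alpha^i            # multiplicative sequence a_i
s <- floor(a / mu)           # index of the largest multiple of mu not above a_i
b <- s * mu                  # b_{s(i)}
u.direct <- (a - b) / mu     # numbers between 0 and 1

# The same numbers from the recursion x -> (alpha * x) mod mu.
x <- numeric(length(i))
x[1] <- x1 %% mu
for (k in seq_along(i)[-1]) x[k] <- (alpha * x[k - 1]) %% mu
u.recursive <- x / mu

all.equal(u.direct, u.recursive)
```

By Problem 75, reducing mod $\mu$ at every step gives the same remainders as reducing the full product once, which is why the recursion never needs the large numbers $a_i$.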

If γ = 0 then the period is shorter: then the maximum period is µ − 1 because

any sequence which contains 0 has 0 everywhere. But not having to add γ at every

step makes computation easier.
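The effect of the multiplier on the period when $\gamma = 0$ can be explored with a brute-force count. The function name mcg.period and the toy modulus 31 are assumptions for this sketch; the loop only terminates if $\mu$ is prime, $\alpha$ is not a multiple of $\mu$, and $0 < x_0 < \mu$.

```r
# Length of the cycle of x -> (alpha * x) mod mu that contains x0.
# Assumes mu is prime, alpha is not a multiple of mu, and 0 < x0 < mu,
# so that the sequence eventually returns to x0.
mcg.period <- function(alpha, mu, x0 = 1) {
  x <- (alpha * x0) %% mu
  n <- 1
  while (x != x0) {
    x <- (alpha * x) %% mu
    n <- n + 1
  }
  n
}

mcg.period(3, 31)   # 30 = mu - 1: all values 1, ..., 30 are visited
mcg.period(2, 31)   # 5: the cycle is 2, 4, 8, 16, 1
(3 * 0) %% 31       # 0: a seed of 0 never leaves 0
```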

Not all pairs α and µ give good random number generators, and one should only

use random number generators which have been thoroughly tested. There are some

examples of bad random number generators used in certain hardware or software

programs.

Problem 76. The dataset located at www.econ.utah.edu/ehrbar/data/randu.txt (which is available as dataset randu in the R-base distribution) has 3 columns and 400 rows. Each row is a consecutive triple of numbers generated by the old VAX FORTRAN function RANDU running under VMS 1.5. This random generator, which is discussed in [Knu98, pp. 106/7], starts with an odd seed $x_0$; the $(n+1)$st seed is $x_{n+1} = (65539 x_n) \bmod 2^{31}$, and the data displayed are $x_n/2^{31}$ rounded to 6 digits. Load the data into xgobi and use the Rotation view to check whether you can see something suspicious.

Answer. All data are concentrated in 15 parallel planes. All triplets of observations of randu

fall into these planes; [Knu98, pp. ??] has a mathematical proof. VMS versions 2.0 and higher use

a different random generator.
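The planes can also be seen without graphics. Since $65539 = 2^{16} + 3$, squaring gives $65539^2 \equiv 6 \cdot 65539 - 9 \pmod{2^{31}}$, hence $x_{n+2} \equiv 6x_{n+1} - 9x_n \pmod{2^{31}}$. The following sketch regenerates a RANDU sequence instead of reading the dataset; the function name randu.seq, the seed 1, and the sequence length are assumptions for this example.

```r
# RANDU recursion from Problem 76; x0 = 1 is an arbitrary odd seed.
# The products stay below 2^53, so double arithmetic is exact here.
randu.seq <- function(n, x0 = 1) {
  x <- numeric(n)
  x[1] <- (65539 * x0) %% 2^31
  for (i in seq_len(n - 1)) x[i + 1] <- (65539 * x[i]) %% 2^31
  x
}

x <- randu.seq(1200)
u <- x / 2^31
n <- length(u)

# For every consecutive triple, 9 u_n - 6 u_{n+1} + u_{n+2} is an integer.
k <- 9 * u[1:(n - 2)] - 6 * u[2:(n - 1)] + u[3:n]
all(k == round(k))

# Each distinct integer value of k is one plane; since the u are between
# 0 and 1, k lies strictly between -6 and 10, so at most 15 planes exist.
sort(unique(k))
```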






4.1. Alternatives to the Linear Congruential Random Generator

One of the common fallacies encountered in connection with random number

generation is the idea that we can take a good generator and modify it a little in

order to get an “even more random” sequence. This is often false.

Making the value dependent on the two preceding values increases the maximum possible period to $\mu^2$. The simplest such generator, the Fibonacci sequence

(4.1.1)    $x_{n+1} = (x_n + x_{n-1}) \bmod \mu$



is definitely not satisfactorily random. But specific other combinations are good:

(4.1.2)    $x_n = (x_{n-100} - x_{n-37}) \bmod 2^{30}$



is one of the state of the art random generators used in R.

Using more work to get from one number to the next, not mere addition or multiplication:

(4.1.3)    $x_{n+1} = (\alpha x_n^{-1} + \gamma) \bmod \mu$

Here $x_n^{-1}$ denotes the inverse of $x_n$ modulo $\mu$.



Efficient algorithms exist but are not in the repertoire of most computers. This

generator is completely free of the lattice structure of multiplicative congruential

generators.
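A minimal sketch of (4.1.3) for a small prime modulus follows. It computes the modular inverse through Fermat's little theorem ($x^{\mu-2} \bmod \mu$ for prime $\mu$), which is one simple option and not necessarily the efficient algorithm alluded to above. The function names, the toy parameters, and the convention that the inverse of 0 is taken to be 0 are assumptions of this sketch.

```r
# Modular exponentiation by repeated squaring. Intermediate products stay
# below mu^2, so mu must be small enough that mu^2 < 2^53.
pow.mod <- function(base, exponent, mu) {
  result <- 1
  base <- base %% mu
  while (exponent > 0) {
    if (exponent %% 2 == 1) result <- (result * base) %% mu
    base <- (base * base) %% mu
    exponent <- exponent %/% 2
  }
  result
}

# Inverse of x modulo a prime mu via Fermat's little theorem.
# Assumed convention for this sketch: the inverse of 0 is set to 0.
inv.mod <- function(x, mu) if (x == 0) 0 else pow.mod(x, mu - 2, mu)

# One step of (4.1.3).
icg.next <- function(x, alpha, gamma, mu) (alpha * inv.mod(x, mu) + gamma) %% mu

# Toy parameters, chosen only for illustration.
mu <- 31; alpha <- 7; gamma <- 5
x <- 1
for (i in 1:5) {
  x <- icg.next(x, alpha, gamma, mu)
  print(x / mu)
}
```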






Combine several random generators: If you have two random generators $x_n$ and $y_n$ with modulus $\mu$, use

(4.1.4)    $(x_n - y_n) \bmod \mu$



The Wichmann-Hill portable random generator uses this trick.

Randomizing by shuffling: If you have $x_n$ and $y_n$, put the first $k$ observations of $x_n$ into a buffer, call them $v_1, \ldots, v_k$ ($k = 100$ or so). Then construct $x_{n+1}$ and $y_{n+1}$. Use $y_{n+1}$ to generate a random integer $j$ between 1 and $k$, use $v_j$ as your next random observation, and put $x_{n+1}$ in the buffer at place $j$. This still gives the same values as $x_n$ but in a different order.
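The shuffling scheme can be sketched as follows. The function shuffle.draw, the helper functions next.x and next.y, and the two toy multiplicative generators are hypothetical names and values introduced only for this illustration.

```r
# Shuffle the output of a generator x with the help of a second generator y.
# next.x and next.y map a seed to the next seed; mu.x and mu.y are the moduli.
shuffle.draw <- function(n, k, x, y, next.x, next.y, mu.x, mu.y) {
  v <- numeric(k)
  for (i in seq_len(k)) {          # fill the buffer with the first k values of x
    x <- next.x(x)
    v[i] <- x
  }
  out <- numeric(n)
  for (i in seq_len(n)) {
    x <- next.x(x)
    y <- next.y(y)
    j <- floor(k * y / mu.y) + 1   # random position between 1 and k
    out[i] <- v[j] / mu.x          # hand out the buffered value
    v[j] <- x                      # and refill its place with the new x
  }
  out
}

# Two toy multiplicative generators, chosen only for illustration.
next.x <- function(x) (3 * x) %% 31
next.y <- function(y) (5 * y) %% 37
shuffle.draw(10, k = 8, x = 1, y = 1, next.x, next.y, mu.x = 31, mu.y = 37)
```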

4.2. How to test random generators

Chi-Square test: partition the outcomes into finitely many categories and test whether the relative frequencies are compatible with the probabilities.

Kolmogorov-Smirnov test for continuous distributions: uses the maximum distance between the empirical distribution function and the theoretical distribution function.

Now there are 11 kinds of empirical tests, either on the original $x_n$ which are supposedly uniform between 0 and 1, or on integer-valued $y_n$ between 0 and $d - 1$.






Equidistribution: either a Chi-Square test on the counts of outcomes falling into $d$ intervals, or a Kolmogorov-Smirnov test.
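Both variants of the equidistribution test are available in base R through chisq.test() and ks.test(). The sketch below applies them to the output of runif(), which merely stands in for the generator under test; the seed, the sample size, and the number of intervals are assumptions.

```r
set.seed(1)        # arbitrary seed, only to make the run repeatable
n <- 1000          # sample size (assumption)
d <- 10            # number of intervals (assumption)
u <- runif(n)      # stands in for the generator under test

# Integer-valued outcomes y between 0 and d - 1, and their counts.
y <- floor(d * u)
counts <- tabulate(y + 1, nbins = d)

# Chi-Square test against equal probabilities 1/d.
p.chisq <- chisq.test(counts, p = rep(1 / d, d))$p.value

# Kolmogorov-Smirnov test against the uniform distribution on (0, 1).
p.ks <- ks.test(u, "punif")$p.value

c(chisq = p.chisq, ks = p.ks)
```

Very small p-values would speak against equidistribution.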

Serial test: that all integer pairs in the integer-valued outcome are equally likely.

Gap test: for $0 \le \alpha < \beta \le 1$ a gap of length $r$ is a sequence of $r + 1$ consecutive numbers in which the last one is in the interval between $\alpha$ and $\beta$, and the others are not. Count the occurrences of such gaps, and make a Chi-Square test with the probabilities of such occurrences. For instance, if $\alpha = 0$ and $\beta = 1/2$ this computes the lengths of “runs above the mean.”

Poker test: consider groups of 5 successive integers and classify them into the

7 categories: all different, one pair, two pairs, three of a kind, full house, four of a

kind, five of a kind.

Coupon collector's test: observe the length of sequences required to get a full set of integers $0, \ldots, d - 1$.

Permutation test: divide the input sequence of the continuous random variable into $t$-element groups and look at all possible relative orderings of these $t$-tuples. There are $t!$ different relative orderings, and each ordering has probability $1/t!$.

Run test: counts runs up, but do not use the Chi-Square test since subsequent runs are not independent; a long run up is likely to be followed by a short run up.






Maximum-of-t-Test: split the sample into batches of equal length and take the

maximum of each batch. Taking these maxima to the tth power should again give

an equidistributed sample.
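The reason is that the maximum of $t$ independent uniform numbers satisfies $\Pr[\max \le v] = v^t$, so its $t$th power is again uniform. A minimal sketch in R, where runif() stands in for the generator under test and the seed, batch length, and number of batches are assumptions:

```r
set.seed(1)                            # arbitrary seed for repeatability
batch.len <- 5                         # t, the length of each batch (assumption)
u <- runif(batch.len * 2000)           # stands in for the generator under test

# One batch per row; take the maximum of each batch.
batches <- matrix(u, ncol = batch.len, byrow = TRUE)
batch.max <- apply(batches, 1, max)

# The t-th powers of the maxima should again look uniform on (0, 1).
p.max <- ks.test(batch.max^batch.len, "punif")$p.value
p.max
```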

Collision tests: 20 consecutive observations are all smaller than 1/2 with probability $2^{-20}$; and every other partition defined by combinations of bigger or smaller than 1/2 has the same probability. If there are only $2^{14}$ observations, then on the average each of these partitions is populated only with probability 1/64. We count the number of “collisions,” i.e., the number of partitions which have more than 1 observation in them, and compare this with the binomial distribution (the Chi-Square test cannot be applied here).

Birthday spacings test: lagged Fibonacci generators consistently fail it.

Serial correlation test: a statistic which looks like a sample correlation coefficient and which can be easily computed with the fast Fourier transform.

Tests on subsequences: equally spaced subsequences are usually worse than the

original sequence if it is a linear congruential generator.

4.3. The Wichmann Hill generator

The Wichmann Hill generator defined in [WH82] can be implemented in almost

any high-level language. It used to be the default random number generator in R,

but version 1.0 of R has different defaults.






Since even the largest allowable integers in ordinary programming languages

are not large enough to make a good congruential random number generator, the

Wichmann Hill generator is the addition mod 1 of 3 different multiplicative congruential generators which can be computed using a high-level programming language. [Zei86] points out that due to the Chinese Remainder Theorem, see [Knu81,

p. 286], this is equivalent to one single multiplicative congruential generator with

α = 1655 54252 64690 and µ = 2781 71856 04309. Since such long integers cannot

be used in ordinary computer programs, Wichmann-Hill’s algorithm is an efficient

method to compute a congruential generator with such large numbers.



Problem 77. Here is a more detailed description of the Wichmann-Hill generator: Its seed is a 3-vector $(x_1, y_1, z_1)$ satisfying

(4.3.1)    $0 < x_1 \le 30269$

(4.3.2)    $0 < y_1 \le 30307$

(4.3.3)    $0 < z_1 \le 30323$

A call to the random generator updates the seed as follows:

(4.3.4)    $x_2 = 171 x_1 \bmod 30269$

(4.3.5)    $y_2 = 172 y_1 \bmod 30307$

(4.3.6)    $z_2 = 170 z_1 \bmod 30323$

and then it returns

(4.3.7)    $\left( \frac{x_2}{30269} + \frac{y_2}{30307} + \frac{z_2}{30323} \right) \bmod 1$

as its latest drawing from a uniform distribution. If you have R on your computer, do parts b and c, otherwise do a and b.

• a. 4 points Program the Wichmann-Hill random generator in the programming

language of your choice.

Answer. A random generator does two things:



• It takes the current seed (or generates one if there is none), computes the next seed from it, and

stores this next seed on disk as a side effect.

• Then it converts this next seed into a number between 0 and 1.

The ecmet package has two demonstration functions which perform these two tasks separately for

the Wichmann-Hill generator, without side effects. The function next.WHseed() computes the next

seed from its argument (which defaults to the seed stored in the official variable .Random.seed), and






the function WH.from.current.seed() gets a number between 0 and 1 from its argument (which

has the same default). Both functions are one-liners:

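    ## Next seed: multiply each component by its multiplier, reduce by its modulus.
    next.WHseed <- function(integer.seed = .Random.seed[-1])
      (c(171, 172, 170) * integer.seed) %% c(30269, 30307, 30323)

    ## Uniform number: sum of the three scaled components, fractional part only.
    WH.from.current.seed <- function(integer.seed = .Random.seed[-1])
      sum(integer.seed / c(30269, 30307, 30323)) %% 1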



• b. 2 points Check that the first 3 numbers returned by the Wichmann-Hill random number generator after setting the seed to (1, 10, 2000) are 0.2759128 0.8713303 0.6150737. (One digit in those 3 numbers is wrong; which is it, and what is the right digit?)

Answer. The R-code doing this is ecmet.script(wichhill):

    ## This script generates 3 consecutive seeds, with the
    ## initial seed set as (1, 10, 2000), puts them into a matrix,
    ## and then generates the random numbers from the rows of
    ## this matrix:
    my.seeds <- matrix(nrow=3, ncol=3)
    my.seeds[1,] <- next.WHseed(c(1, 10, 2000))
    my.seeds[2,] <- next.WHseed(my.seeds[1,])
    my.seeds[3,] <- next.WHseed(my.seeds[2,])
    my.unif <- c(WH.from.current.seed(my.seeds[1,]),
                 WH.from.current.seed(my.seeds[2,]),
                 WH.from.current.seed(my.seeds[3,]))



• c. 4 points Check that the Wichmann-Hill random generator built into R is

identical to the one described here.

Answer. First make sure that R will actually use the Wichmann-Hill generator (since it is not the default): RNGkind("Wichmann-Hill"). Then call runif(1). (This sets a seed if there was none, or uses the existing seed if there was one.) .Random.seed[-1] shows the present value of the random seed associated with this last call, dropping the first number, which indicates which random generator the seed is for and is not needed for our purposes. Therefore WH.from.current.seed(), which takes .Random.seed[-1] as default argument, should give the same result as the last call of the official random generator. And WH.from.current.seed(next.WHseed()) takes the current seed, computes the next seed from it, and converts this next seed into a number between 0 and 1. It does not write the updated random seed back. Therefore if we now issue the official call runif(1) again, we should get the same result.
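The steps of this answer can be collected into one short script. This is a sketch, not part of the ecmet package: the two one-liners are repeated so that the script runs on its own, the seed 123 is an arbitrary assumption, and the comparison uses all.equal() because the summation order inside R may differ in the last floating-point digit.

```r
# The two one-liners from part a, repeated so that this block is self-contained.
next.WHseed <- function(integer.seed = .Random.seed[-1])
  (c(171, 172, 170) * integer.seed) %% c(30269, 30307, 30323)
WH.from.current.seed <- function(integer.seed = .Random.seed[-1])
  sum(integer.seed / c(30269, 30307, 30323)) %% 1

old.kind <- RNGkind("Wichmann-Hill")   # switch generators, remember the old kinds
set.seed(123)                          # arbitrary seed (assumption)

# The seed stored after a draw reproduces that draw.
u1 <- runif(1)
same.as.last.draw <- isTRUE(all.equal(u1, WH.from.current.seed(.Random.seed[-1])))

# Advancing the stored seed by hand predicts the next official draw.
predicted <- WH.from.current.seed(next.WHseed(.Random.seed[-1]))
u2 <- runif(1)
same.as.next.draw <- isTRUE(all.equal(u2, predicted))

c(same.as.last.draw, same.as.next.draw)

RNGkind(old.kind[1])                   # restore the previous uniform generator
```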


