Creating an even distribution of INTEGERS in C

Question

I wrote a C function that I think selects integers from a uniform distribution over the range [rangeLow, rangeHigh], inclusive. This is not homework; I just use it on some embedded systems that I work on for fun.

In my test cases, this code creates an appropriate distribution. I am not sure if the implementation is correct. Can someone do a sanity check and let me know if I did something wrong here?

    //uniform_distribution returns an INTEGER in [rangeLow, rangeHigh], inclusive.
    int uniform_distribution(int rangeLow, int rangeHigh)
    {
        int myRand = (int)rand();
        int range = rangeHigh - rangeLow + 1; //+1 makes it [rangeLow, rangeHigh], inclusive.
        int myRand_scaled = (myRand % range) + rangeLow;
        return myRand_scaled;
    }
    //note: make sure rand() was already initialized using srand()

P.S. I searched for other questions like this. However, it was hard to pick out the small subset of questions that discuss random integers rather than random floating-point numbers.

+9

c algorithm integer statistics uniform

solvingPuzzles Jul 25 2018-12-12T00:


4 answers

Suppose rand() generates a uniformly distributed value I in the range [0..RAND_MAX], and you want to generate a uniformly distributed value O in the range [L, H].

Suppose that I is in the range [0..32767], and O is in the range [0..2].

According to your proposed method, O = I % 3. Note that in the range [0..32767] there are 10923 numbers for which I % 3 = 0, 10923 for which I % 3 = 1, but only 10922 for which I % 3 = 2. Therefore, your method will not map a value from I to O uniformly.

As another example, suppose O is in the range [0..32766].

According to your proposed method, O = I % 32767. Now you get O = 0 for both I = 0 and I = 32767. Therefore, 0 is twice as likely as any other value, and your method is again non-uniform.
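The two counting arguments above can be checked by brute force. Here is a minimal sketch that tallies every possible generator output; the helper name `count_modulo_results` is made up for illustration and is not part of the question's code.

```c
#include <assert.h>
#include <stddef.h>

/* Tally how often each result of (i % range) occurs when i runs over
   every possible generator output 0..maxValue (inclusive).
   counts must have room for `range` entries. */
void count_modulo_results(unsigned maxValue, unsigned range,
                          unsigned long *counts)
{
    assert(range >= 1 && counts != NULL);

    for (unsigned r = 0; r < range; r++) {
        counts[r] = 0;
    }
    /* Loop with an explicit break so that maxValue == UINT_MAX cannot
       make the counter wrap around. */
    for (unsigned i = 0; ; i++) {
        counts[i % range]++;
        if (i == maxValue) {
            break;
        }
    }
}
```

With maxValue = 32767 and range = 3 the tallies come out as 10923, 10923 and 10922. With range = 32767, counts[0] is 2 and every other entry is 1.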

A suggested method for generating a uniform mapping:

1. Calculate the number of bits needed to store a random value in the range [L, H]:
       unsigned int nRange = (unsigned int)H - (unsigned int)L + 1;
       unsigned int nRangeBits = (unsigned int)ceil(log((double)nRange) / log(2.));
2. Generate nRangeBits random bits. This can easily be implemented by right-shifting the result of rand().
3. Make sure that the generated number is not greater than H-L. If it is, repeat step 2.
4. Now you can map the generated number into O simply by adding L.
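The steps above could be sketched roughly as follows. This is only an illustration under two assumptions that the answer does not state: RAND_MAX has the form 2^k - 1, and H-L fits in a single rand() result. The names `bits_needed` and `uniform_by_bits` are made up here, and the bit count uses an integer loop instead of ceil() and log() to stay away from floating-point rounding.

```c
#include <assert.h>
#include <stdlib.h>

/* Number of bits needed to represent every value in [0, maxValue]. */
static unsigned bits_needed(unsigned maxValue)
{
    unsigned bits = 0;
    while (maxValue != 0) {
        bits++;
        maxValue >>= 1;
    }
    return bits;
}

/* Returns an integer in [low, high] by rejection sampling on the
   high-order bits of rand().
   ASSUMPTIONS: RAND_MAX is of the form 2^k - 1 and (high - low)
   needs no more bits than one rand() call provides. */
int uniform_by_bits(int low, int high)
{
    assert(low <= high);
    assert((((unsigned)RAND_MAX + 1u) & (unsigned)RAND_MAX) == 0u);

    unsigned span = (unsigned)high - (unsigned)low;   /* H - L */
    unsigned randBits = bits_needed((unsigned)RAND_MAX);
    unsigned rangeBits = bits_needed(span);
    assert(rangeBits <= randBits);

    unsigned candidate;
    do {
        /* Keep only the top rangeBits bits of the random value. */
        candidate = (unsigned)rand() >> (randBits - rangeBits);
    } while (candidate > span);   /* step 3: reject and retry */

    return low + (int)candidate;  /* step 4: shift into [low, high] */
}
```

Because fewer than half of the candidates can exceed H-L, the loop should need fewer than two rand() calls on average.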

+11

Lior Kogan Jul 25 '12 at 8:27


I think it is well known that rand() is not very good. It depends on how good you need the "random" data to be.

I suppose you could write a test and then calculate the chi-square value to see how good your uniform generator is:

http://en.wikipedia.org/wiki/Pearson%27s_chi-squared_test
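As a rough sketch of what such a test might compute: the function below takes a histogram of observed results and returns the Pearson chi-squared statistic against a uniform expectation. The name `chi_squared_uniform` is made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Pearson chi-squared statistic for observed bin counts, assuming
   every bin is expected to receive total / bins samples. */
double chi_squared_uniform(const unsigned long *counts, size_t bins)
{
    assert(counts != NULL && bins >= 1);

    unsigned long total = 0;
    for (size_t i = 0; i < bins; i++) {
        total += counts[i];
    }
    assert(total > 0);

    double expected = (double)total / (double)bins;
    double statistic = 0.0;
    for (size_t i = 0; i < bins; i++) {
        double diff = (double)counts[i] - expected;
        statistic += diff * diff / expected;
    }
    return statistic;
}
```

For example, the counts {12, 8, 10} give a statistic of 0.8. The result would then be compared against a chi-squared critical value with bins - 1 degrees of freedom.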

Depending on your use (do not use this for your online shooter), you might consider an LFSR:

http://en.wikipedia.org/wiki/Linear_feedback_shift_register

It may be faster if you just need some kind of pseudo-random output. Also, supposedly they can be uniform, although I have not studied the math enough to back up that claim.
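For illustration, here is a minimal sketch of one step of a 16-bit Fibonacci LFSR. The tap positions (16, 14, 13, 11) are a commonly cited maximal-length choice and the function name `lfsr16_next` is made up; neither comes from this answer.

```c
#include <assert.h>
#include <stdint.h>

/* Advance a 16-bit Fibonacci LFSR by one step.
   ASSUMPTION: taps at bits 16, 14, 13 and 11, i.e. the polynomial
   x^16 + x^14 + x^13 + x^11 + 1. The state must never be 0,
   because an all-zero register stays at 0 forever. */
uint16_t lfsr16_next(uint16_t state)
{
    assert(state != 0);

    uint16_t bit = (uint16_t)(((state >> 0) ^ (state >> 2) ^
                               (state >> 3) ^ (state >> 5)) & 1u);
    return (uint16_t)((state >> 1) | (bit << 15));
}
```

A typical use would be to seed a state variable with any nonzero value and call `state = lfsr16_next(state);` for each new number. Mapping the raw 16-bit state onto a smaller range still has the modulo-bias issue discussed in the other answers.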

+3

Josh Petitt Jul 25 '12 at 2:11


Here is a version that corrects the distribution errors (noted by Lior), makes use of the high bits returned by rand(), and uses only integer math (if that is desirable):

    int uniform_distribution(int rangeLow, int rangeHigh)
    {
        int range = rangeHigh - rangeLow + 1; //+1 makes it [rangeLow, rangeHigh], inclusive.
        int copies = RAND_MAX / range; // we can fit n-copies of [0...range-1] into RAND_MAX
        // Use rejection sampling to avoid distribution errors
        int limit = range * copies;
        int myRand = -1;
        while (myRand < 0 || myRand >= limit) {
            myRand = rand();
        }
        return myRand / copies + rangeLow; // note that this involves the high-bits
    }
    //note: make sure rand() was already initialized using srand()

This should work well provided that range is much smaller than RAND_MAX; otherwise you are back to the problem that rand() is not a good random number generator in terms of its low-order bits.

+1

Dave Jul 25 2018-12-12T00:


jxh · Accepted Answer · 2012-07-25 03:06

In some implementations, rand() does not provide good randomness in its low-order bits, so the modulus operator would not give very random results. If you find that to be the case, you could try this instead:

    int uniform_distribution(int rangeLow, int rangeHigh)
    {
        double myRand = rand()/(1.0 + RAND_MAX);
        int range = rangeHigh - rangeLow + 1;
        int myRand_scaled = (myRand * range) + rangeLow;
        return myRand_scaled;
    }

With rand(), this method still results in a bias, as Lior notes. But the technique is fine if you can find a uniform number generator to calculate myRand. One possible candidate would be drand48(). This greatly reduces the amount of bias, to something that would be very difficult to detect.
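A minimal sketch of that drand48() variant might look like the following. It assumes a POSIX system (drand48() is not part of standard C, hence the feature-test macro), and the function name `uniform_distribution_drand48` is made up for this example.

```c
#define _XOPEN_SOURCE 600   /* ask for the POSIX/XSI declaration of drand48() */
#include <assert.h>
#include <stdlib.h>

/* Returns an integer in [rangeLow, rangeHigh], inclusive, by scaling
   a double in [0.0, 1.0) obtained from drand48().
   ASSUMPTION: the generator was seeded beforehand with srand48(). */
int uniform_distribution_drand48(int rangeLow, int rangeHigh)
{
    assert(rangeLow <= rangeHigh);

    int range = rangeHigh - rangeLow + 1;
    double unit = drand48();              /* 0.0 <= unit < 1.0 */
    return rangeLow + (int)(unit * range);
}
```

A small bias remains whenever range does not evenly divide the number of distinct values drand48() can produce, but with 48 bits of state it is far smaller than with a 15-bit rand().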

However, if you need something cryptographically secure, you should use the algorithm described in Lior's answer, assuming your rand() is itself cryptographically secure (the default one probably is not, so you would need to find one). Below is a simplified implementation of what Lior described. Instead of counting bits, we assume the range fits within RAND_MAX and compute a suitable multiple of the range. In the worst case, the algorithm ends up calling the random number generator twice on average per request for a number in the range.

    int uniform_distribution_secure(int rangeLow, int rangeHigh)
    {
        int range = rangeHigh - rangeLow + 1;
        int secureMax = RAND_MAX - RAND_MAX % range;
        int x;
        do x = secure_rand(); while (x >= secureMax);
        return rangeLow + x / (secureMax / range);
    }
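To see where the "twice on average" figure comes from, here is a small sketch that computes the expected number of generator calls for the rejection loop above. The function name `expected_calls` and the `randMax` parameter (standing in for RAND_MAX) are made up for illustration.

```c
#include <assert.h>

/* Expected number of generator calls per result for the rejection
   loop above, when the generator is uniform on [0, randMax].
   Each call is accepted with probability acceptMax / (randMax + 1),
   so the expected count is the reciprocal of that probability. */
double expected_calls(int randMax, int range)
{
    assert(range >= 1 && range <= randMax);

    int acceptMax = randMax - randMax % range;  /* same as secureMax */
    return ((double)randMax + 1.0) / (double)acceptMax;
}
```

With randMax = 32767, a range of 16384 leaves acceptMax at 16384, so the expected number of calls is 2. A small range such as 3 gives 32768 / 32766, which is barely above 1.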
