Should I use "rand() % N" or "rand() / (RAND_MAX / N + 1)"?

I read the C FAQ and found a question recommending rand() / (RAND_MAX / N + 1) instead of the more popular rand() % N.

The reason given is that when N is a small number, rand() % N only uses a few of the low-order bits of rand().

I tested both approaches with N = 2 on both Windows and Linux, but did not notice any difference.

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N 2

    int main(void)
    {
        srand(0);
        printf("rand() %% N:\n");
        for (int i = 0; i < 40; ++i) {
            printf("%d ", rand() % N);
        }
        putchar('\n');

        srand(0);
        printf("rand() / (RAND_MAX / N + 1):\n");
        for (int i = 0; i < 40; ++i) {
            printf("%d ", rand() / (RAND_MAX / N + 1));
        }
        putchar('\n');
        return 0;
    }

The output is this (on my GNU/Linux machine):

    rand() % N:
    1 0 1 1 1 1 0 0 1 1 0 1 0 1 1 0 0 0 0 0 1 0 1 1 0 0 0 1 1 1 1 0 0 0 1 1 1 0 1 0
    rand() / (RAND_MAX / N + 1):
    1 0 1 1 1 0 0 1 0 1 0 1 0 1 1 1 1 1 0 1 0 0 0 1 0 0 0 0 1 0 1 1 1 0 1 1 0 1 0 1

Both alternatives look completely random to me; if anything, the second approach seems worse than rand() % N.

Should I use rand() % N or rand() / (RAND_MAX / N + 1)?

c algorithm random numbers
3 answers

If N is a power of two, the remainder technique is usually safe (RAND_MAX is usually a power of two minus one, so the whole range has a power-of-two length). In the general case, N needs to divide the range of rand() to avoid bias.

Otherwise, you will run into this problem regardless of the quality of rand(). In short: you chop the range into several "parts" of length N each, and if N does not divide the range, the last part is incomplete. The numbers that are "cut off" from that incomplete part are therefore slightly less likely, since they have one fewer part from which they can be generated.

Unfortunately, rand() / (RAND_MAX / N + 1) is also broken (in much the same way), so the real answer is: don't use either of them.

The problem described above is fundamental: it is impossible to distribute X different values evenly over Y buckets if Y does not divide X. You can fix this by rejecting some of the random samples so that Y divides the new X.
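The rejection idea can be sketched like this (rand_uniform is a hypothetical helper name, not a standard function; it assumes nothing beyond the standard rand() interface):

```c
#include <stdlib.h>

/* Sketch: return an unbiased value in [0, n) by rejection sampling.
   Samples that fall in the incomplete final "part" of rand()'s range
   are discarded and redrawn, so every residue is equally likely. */
int rand_uniform(int n)
{
    /* Largest multiple of n that fits in [0, RAND_MAX + 1). */
    unsigned limit = ((unsigned)RAND_MAX + 1) / (unsigned)n * (unsigned)n;
    unsigned r;
    do {
        r = (unsigned)rand();
    } while (r >= limit);   /* reject samples from the short part */
    return (int)(r % (unsigned)n);
}
```

The loop terminates quickly in practice: at least half of the range is always accepted, so the expected number of rand() calls is below two.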


There is another problem with rand() % n: it introduces modulo bias.

For simplicity, pretend that RAND_MAX is 7 and n is 6. You want the numbers 0, 1, 2, 3, 4, 5 to appear in the random stream with equal probability. However, 0 and 1 will each appear 1/4 of the time, while the remaining numbers appear only 1/8 of the time, since the raw values 6 and 7 have remainders 0 and 1 respectively. You should use a different method, but be careful: dividing and truncating can lead to a similar problem.

If you have arc4random(), you can use arc4random_uniform() to get an unbiased distribution without having to be careful yourself.


On avr-gcc:

I used rand() & 0xFF to get a random number from 0 to 255, and the results were not good. It turned out that relying on the low-order bits is not a very reliable method: the same values came up frequently. rand() % N can behave similarly, since it also depends mostly on the low bits.

rand() / (RAND_MAX / N + 1) worked much better for me.
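For the 0-255 case, that approach can be sketched as follows (rand_byte is a hypothetical name for illustration; it assumes RAND_MAX is at least 255, which the C standard guarantees since RAND_MAX must be at least 32767):

```c
#include <stdlib.h>

/* Sketch: pick a value in 0..255 from the high-order bits of rand(),
   instead of the low-order bits that rand() & 0xFF would use. */
int rand_byte(void)
{
    return rand() / (RAND_MAX / 256 + 1);
}
```

Note that this still has the slight bias discussed in the accepted answer whenever 256 does not divide RAND_MAX + 1; it only avoids the weak-low-bits problem of this particular generator.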

