18 trillion coins, where did I go wrong?

Why does the following C code give me different results on my desktop and server, both work with similar versions of Linux?

He finds the longest same side in a string sequence of 18 trillion coin tosses. [Cm. The science fiction novel by Ian M. Banks. Consider Phlebas.]

On the server, after 15.7 trillion coin tones (it still works), the longest same side in the string sequence is still 29. Since 2^44 = 17,592,186,044,416 , I expect the longest same sequence to be somewhere in the middle 40s and probably 44 after completing a total of 18 trillion.

On the desktop, after only 4.7 billion coins were thrown, the longest sequence was already 31, since 2^31 = 2,147,483,648 , and it sounded right.

So, why did I get a sequence of 29 on the server after 15.7 trillion coins, but a sequence of 31 after only 4.7 billion on my desktop?

Module bias was my first thought. RAND_MAX same on both the desktop and the server, 2,147,483,647 (32 bits, signed for a long time). So the rand() function will give me the number 0 <= rand() <= 2,147,483,647 . 0 is equal, and 2,147,483,647 is odd, so if I’m not very mistaken, there is no modulo offset introduced by my line of code int rand_num = (rand() % 2); .

I know that the C standard pseudo-random number generator is not considered sufficient for cryptography. Of course, this could not be a factor in the generation, however, of rather long sequences of zeros and ones. Can it?

Here's the source:

Compiled on both machines using: gcc -O3 -o 18TCT 18TrillionCoinTosses.c

 #include <stdio.h> #include <stdlib.h> #include <time.h> int main(int argc, char* argv[]) { srand(time(NULL)); int current_seq = 0; int longest_seq = 0; int prev_rand_num = -1; long long i = 0; long long total = 18000000000000; // To serve as a rudimentary progress indicator. long billion_counter = 0; long billion = 1000000000; while (i < total) { int rand_num = (rand() % 2); if (rand_num == prev_rand_num) { current_seq++; if (current_seq >= longest_seq) { longest_seq = current_seq; printf("Longest sequence so far: %d (on iteration %lli)\n", longest_seq, i); } } else current_seq = 1; if (billion_counter == billion) { billion_counter = 0; printf("Progress report, current iteration: %lli\n", i); } prev_rand_num = rand_num; i++; billion_counter++; } printf("\nTotal coins tossed: %lli\n", i); printf("Longest sequence: %d\n", longest_seq); } 
+3
c random
Mar 02 '16 at 17:49
source share
4 answers

Your random number generator probably repeats after calling 2 ^ 32 = 4294967296, so you are not actually simulating 18 trillion samples. You need a better RNG that stores more than 32 bits of internal state. On many systems, you can access the best RNG by simply calling random() instead of rand() . (My man random system says “random is the best random number generator” and “The period of this random number generator is very large, approximately 16 * ((2 ** 31) -1).” Although it is “only” 34 359 738 352, that your 18 trillion is still missing.)

In addition, as a side item, rand() % 2 is risky, although in most hydraulic fracturing these days there is no problem that will burn there (and if you had this problem, you would know that, because among other things that you will get 0 per line, no matter what).




Appendix: you can find links to some other, better random number generators on question 13.15 in the list of frequently asked questions: http://c-faq.com/lib/rand.html .

+6
Mar 02 '16 at 18:20
source share

Despite the fact that your “random” bit 0 had equal zeros and ones, the sequence of pseudo random generators rand() repeated relatively often. In my test, it repeats after 2147483648 (2 ** 31) loop iterations. Thus, it makes no sense to collect up to 18 trillion. I checked the test several times, always the same result.

 #include <stdio.h> #include <stdlib.h> #include <time.h> int main(void) { unsigned long long n = 0; int a, b, c, d; int e, f, g, h; srand((unsigned)time(NULL)); e = a = rand(); f = b = rand(); g = c = rand(); h = d = rand(); do { n++; e = f; f = g; g = h; h = rand(); } while (e != a || f != b || g != c || h != d); printf("%llu\n", n); } 
+4
Mar 02 '16 at 18:15
source share

Your code looks fine. The problem may be that you are using RNG.

I don't think rand ()% 2 is uniform. Look here: Uniformity of random numbers taken modulo N

Why not C ++ 11 random number generators? http://en.cppreference.com/w/cpp/numeric/random/uniform_int_distribution

And last but not least, can -O3 spoil something?

-O3 Optimize even more. -O3 includes all optimizations specified by -O2, and also includes -finline-functions, -funswitch-loops, -fpredictive-commoning, -fgcse-after-reload, -ftree-loop-vectorize, -ftree-loop-distribute -patrons , -fsplit-paths -ftree-slp-vectorize, -fvect-cost-model, -free-partial-pre and -fipa-cp-clone.

+2
Mar 02 '16 at 18:05
source share

As others have noted, rand not a reliable source of randomness. This is right in the man page :

 NAME rand, rand_r, srand, sranddev -- bad random number generator ... DESCRIPTION These interfaces are obsoleted by arc4random(3). 

For good luck, you'll have to go beyond the standard C libraries.

Note that if you are on a Mac, it will complain that RAND_bytes() deprecated. Don’t worry, OpenSSL is here to stay. Troubleshooting issues related to binary compatibility issues when upgrading Apple products .

+1
Mar 02 '16 at 20:08
source share



All Articles