Creating very large random numbers

How would you create a very large random number? I am thinking on the order of 2^(10^9) (one billion bits). Any programming language - I assume the solution translates to other languages.

I would like to get a uniform distribution on [1, N].

My initial thoughts:

- You could randomly generate each digit and concatenate them. Problem: even very good pseudo-random generators are likely to develop patterns over millions of digits, right?

- You could build large random numbers by raising random numbers to random exponents. Problem: you must do the math so that the resulting number is still uniformly random, and you should be able to compute it in a reasonable amount of time (say, an hour).

- If it would help, you could try to generate a possibly non-uniform distribution on a smaller range (for example, using real numbers) and convert it. Problem: that might be just as difficult.

Any ideas?

+7
5 answers

Generate log2(N) random bits to get a number M, where M may be up to twice as large as N. Repeat until M is in the range [1, N].

Now, for generating the random bits, you can use a source of true randomness, which is expensive.

Or you can use a cryptographically secure pseudorandom number generator, for example AES with a random key encrypting a counter to produce successive blocks of bits. Cryptographically secure means there can be no discernible patterns.
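
For illustration, a minimal sketch of this in Python (my addition, not the answerer's code), using the standard-library secrets module as the cryptographically secure bit source and rejection to keep the distribution uniform:

    import secrets

    def uniform_1_to_n(n: int) -> int:
        # Draw ceil(log2(n)) bits; retry if the result falls outside [1, n].
        k = n.bit_length()
        while True:
            m = secrets.randbits(k) + 1   # uniform on [1, 2^k], where 2^k is at most 2*n
            if m <= n:
                return m

Because 2^k is at most 2n, each draw is accepted with probability at least 1/2, so the loop finishes quickly even for billion-bit N.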

+4

NTL: a library for number theory

It was recommended by my coding theory and cryptography professor ... so I assume it does the math correctly, and it is quite easy to use.

RandomBnd, RandomBits, RandomLen - routines for generating pseudo random numbers

    ZZ RandomLen_ZZ(long l);
    // ZZ = pseudo-random number with precisely l bits,
    // or 0 if l <= 0.
+2

It depends on what the data is for. For most purposes, a PRNG is fast and simple. But PRNGs are not perfect. For example, I remember that Monte Carlo simulations of chaotic systems are remarkably good at revealing the underlying pattern in a PRNG.

If that is the sort of thing you are doing, however, there is a simple trick I learned in grad school for creating lots of random data. Take a large (preferably rapidly changing) file. (Some large data structures from a running kernel are good.) Compress it to increase the entropy per bit. Throw away the headers. Then, for good measure, encrypt the result. If you plan to use this for cryptographic purposes (and you did not have an ideal source of entropy to work with), then reverse it and encrypt again.

The underlying theory is simple. Information theory tells us that there is no difference between a signal with no redundancy and pure random data. So if we pick a big file (i.e. lots of signal), remove its redundancy with compression, and strip off the headers, we have a pretty good random signal. Encryption does a really good job of removing any remaining artifacts. However, encryption algorithms tend to work in blocks, so if someone could somehow guess what is at the start of the file, that data would be easier to guess. But reversing the file and encrypting again means they would need to know the whole file, and our encryption, to find any pattern in the data.

The reason for picking a rapidly changing piece of data is that if you run out of data and want to generate more, you can go back to the same source again. Even small changes, after this process, will turn into a substantially uncorrelated set of random data.
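
A rough sketch of that recipe, assuming Python with the third-party cryptography package for AES-CTR; the file path, header size, and key handling here are illustrative guesses, not part of the original answer:

    import os
    import zlib
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    def aes_ctr(data: bytes, key: bytes) -> bytes:
        # AES in CTR mode with a fresh random nonce.
        enc = Cipher(algorithms.AES(key), modes.CTR(os.urandom(16))).encryptor()
        return enc.update(data) + enc.finalize()

    def whiten_file(path: str, key: bytes) -> bytes:
        raw = open(path, "rb").read()
        packed = zlib.compress(raw, 9)[2:]   # compress to raise entropy, drop the 2-byte zlib header
        once = aes_ctr(packed, key)          # encrypt to scrub remaining structure
        return aes_ctr(once[::-1], key)      # reverse and encrypt again, as suggested above

    data = whiten_file("/var/log/syslog", os.urandom(32))  # hypothetical fast-changing file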

+2

If you have a random number generator that produces X-bit random numbers, and you concatenate the bits [X1, X2, ... Xn] to form the N-bit number you need, then as long as each X is random, I don't see why the resulting large number wouldn't be random too, for all intents and purposes. And if the standard C rand() method is not good enough, I'm sure there are plenty of other libraries (such as the ones mentioned in this thread) whose pseudo-random numbers are "more random".
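
A minimal sketch of that concatenation in Python (the 32-bit chunk size is just an assumption; random.getrandbits stands in for whatever PRNG you prefer):

    import random

    def big_random_number(total_bits: int, chunk_bits: int = 32) -> int:
        # Concatenate fixed-size random chunks into one large integer.
        n = 0
        for _ in range(total_bits // chunk_bits):
            n = (n << chunk_bits) | random.getrandbits(chunk_bits)
        leftover = total_bits % chunk_bits
        if leftover:
            n = (n << leftover) | random.getrandbits(leftover)
        return n

In practice random.getrandbits(total_bits) does the whole job in one call; the loop just makes the concatenation explicit.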

0

even very good pseudo-random generators are likely to develop patterns over millions of digits, right?

From Wikipedia on pseudorandom number generators:

A PRNG can be started from an arbitrary initial state using a seed state. It will always produce the same sequence thereafter when initialized with that state. The maximum length of the sequence before it begins to repeat is determined by the size of the state, measured in bits. However, since the length of the maximum period potentially doubles with each bit of “state” added, it is easy to build PRNGs with periods long enough for many practical applications.

You could build large random numbers by raising random numbers to random exponents

I assume you are suggesting something like filling in the parts of a number in scientific notation with random values?

For example: 1.58901231 x 10^5819203489

The problem is that your distribution would be logarithmic (or is it exponential? :) - same difference, either way it is not uniform. You would never get a value that has its millionth digit set and also a nonzero digit in the ones column, for example.
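
A quick illustration of that gap, as a hedged sketch in Python (the 9-digit mantissa and exponent range are just assumptions for the example):

    import random

    def sci_draw(max_exp: int = 30) -> int:
        # A fixed-precision mantissa times a random power of ten.
        mantissa = random.randrange(10**8, 10**9)   # 9 significant digits
        exponent = random.randrange(0, max_exp)
        return mantissa * 10**exponent

    print([sci_draw() for _ in range(3)])
    # Every draw has at most 9 significant digits followed by zeros, so almost all
    # integers in the range can never come up, and the spread is log-like, not uniform.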

you could try to generate a possibly non-uniform distribution on a smaller range (for example, using real numbers) and convert it

I'm not sure I understand that one. It sounds like the same as the exponent solution, with the same problems. If you are talking about multiplying by a constant, then you would get a lumpy distribution instead of a logarithmic (exponential?) one.

Recommended Solution

If you just need really big pseudorandom values with a good distribution, use a large-state PRNG algorithm. The period of a PRNG grows rapidly with the number of bits of state (as the quote above says, it potentially doubles with each bit added), so it does not take that much state to fill even a very large number.

From there you can use your first solution:

You can randomly generate each digit and concatenate

Although I would suggest using the full range of values returned by your PRNG (probably 2^31 or 2^32 of them) and packing a byte array with those values, splitting it up as necessary. Otherwise you throw away a lot of the bits of randomness. Also, scaling your values to a range (or using modulo) can easily mess up your distribution, which is another reason to keep as many of the bits your PRNG returns as possible. Be careful, though, to pack the byte array completely full with the returned bits, or you will again put lumps in your distribution.

However, the problem with this solution is how to fill that (larger than usual) seed state with values that are reasonably random. You might be able to use standard-size seeds (populated from the time or a GUID-style construction) and fill the state of your large PRNG with values from a smaller PRNG. This can work if it is not critical how well distributed your numbers are.
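
As a sketch of how that recommendation might look, assuming Python, where random.Random is a large-state (Mersenne Twister, roughly 19937-bit) PRNG; seeding it from os.urandom and rejecting out-of-range draws are my additions, not part of the original answer:

    import os
    import random

    def big_uniform(n: int) -> int:
        # Seed the large-state PRNG from the operating system's entropy pool.
        rng = random.Random(int.from_bytes(os.urandom(32), "big"))
        bits = n.bit_length()
        words, extra = divmod(bits, 32)
        while True:
            # Keep every bit of each full 32-bit output instead of scaling or taking a modulo.
            m = 0
            for _ in range(words):
                m = (m << 32) | rng.getrandbits(32)
            if extra:
                m = (m << extra) | rng.getrandbits(extra)
            if 1 <= m <= n:               # reject out-of-range draws to stay uniform on [1, N]
                return m

Re-seeding rng periodically from a natural source gives the hybrid approach mentioned at the end of this answer.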

If you need truly cryptographically secure random values, the only real way to get them is with a natural source of randomness, such as http://www.random.org/ . The downsides of natural randomness are availability, and the fact that many natural-randomness devices take a while to generate new entropy, so generating large amounts of data can be really slow.

You can also use a hybrid and be a bit safer - natural-random seeds only (to avoid the slowness of generation), and a PRNG for the rest. Re-seed periodically.

0
