Is it foolish to test fuzz with a cryptographically weak pseudo-random generator?

When working on a large software project, I often use fuzz testing as part of my test cases to cause errors that can only occur when the input reaches a certain size or shape. I did this most often just by using the standard random number functions that are related to the programming language I use.

Recently, I began to wonder, ignoring the advantages or disadvantages of fuzz testing in general, regardless of whether it is good to use non-cryptographically protected pseudo-random number generators when doing fuzz testing. Weak random number generators often exhibit patterns that distinguish them from true random sequences, even if these patterns are not always obvious. It seems that a fuzz test using weak PRNG can always cause certain hidden errors that appear only in certain circumstances, because pseudo-random numbers can be related to each other in such a way as to never trigger these circumstances.

Is it foolish to use weak PRNG to test fuzz? If it is theoretically unreasonable to do this, is it reasonable in practice?

+7
source share
3 answers

You mislead two different meanings of "weakness":

  • statistical weakness means that the conclusion of PRNG demonstrates statistical models, such as the presence of certain sequences that are more common than others. This can lead to ineffective fluff testing in some rare cases. Statistically strong PRNGs are productive and widely available (most notably Mersenne Twister).
  • cryptographic weakness means that the output of the RNG is somehow predictable, given knowledge other than the seed (for example, the conclusion itself). It makes no sense to require that the PRNG used to test fuzz be cryptographically strong, because β€œpatterns” demonstrated by statistically strong but cryptographically weak PRNGs are largely just a problem if you need to prevent a cryptographically perfect attacker from predicting an exit.
+6
source

I don’t think it really matters, but I can’t prove it.

Fuzz testing will try only a few inputs, in most cases a small part of the possibilities. No matter how well RNG is used, it may or may not find one of the inputs that violates your code, depending on how much of all possible inputs violates your code. If the pattern in PRNG is very simple, it seems unlikely to me that it will in any way match the pattern in the "bad" inputs you are looking for, so it will hit it no more and no less than a true random one.

In fact, if you knew how to choose an RNG to maximize the likelihood that it would find bad inputs, you could probably use this knowledge to find the error faster ...

I do not think you should use a really bad PRNG. rand , for example, is allowed to demonstrate very simple patterns, such as alternating LSBs. And if your code uses PRNG internally, you probably want to avoid using the same PRNG in the same way in the test, just to make sure that you do not accidentally check only those cases where the input matches the internal generated stream! A small risk, of course, since you hope that they will use different seeds, but still.

As a rule, in this language it is not so difficult to find cryptographic or, at least, safe hash libraries. SHA-1 is everywhere and easy to use to create a stream, or if RC4 is trivial to implement on your own. Both provide pretty good PRNGs, if not as secure as Blum Blum Shub. I would think that the main problem is speed - if, for example, Mersenne Twister can generate tests with an error 10 times faster and the code under test is fast enough, then it may have a better chance of finding bad inputs in this one regardless of the fact that, given the 624 outputs, you can infer the full state of the RNG ...

+4
source

You do not need an unpredictable source (this is the cryptographically safe generator), you only need a source with good statistical properties.

Thus, using a general-purpose generator is sufficient - it is fast and usually reproducible (which means that problems can also be reproduced).

+2
source

All Articles