Why does this simple shuffling algorithm produce biased results? What is the simple reason?

This simple shuffling algorithm seems to produce biased results:

    # suppose $arr is filled with 1 to 52
    for ($i = 0; $i < 52; $i++) {
        $j = rand(0, 51);
        # swap the items
        $tmp = $arr[$j];
        $arr[$j] = $arr[$i];
        $arr[$i] = $tmp;
    }

You can try it yourself: instead of using 52, use 3 (suppose only 3 cards are used), run it 10,000 times, and tally the results; you will see that the distribution is skewed toward certain patterns.
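For concreteness, here is a minimal C# sketch of that experiment (illustrative only, not the asker's PHP; the class name is mine): it runs the naive swap on 3 cards 10,000 times and tallies the six orders.

    using System;
    using System.Collections.Generic;

    class NaiveShuffleExperiment
    {
        static void Main()
        {
            var rng = new Random();
            var counts = new Dictionary<string, int>();

            for (int run = 0; run < 10000; run++)
            {
                int[] arr = { 1, 2, 3 };
                // the naive shuffle from the question, with n = 3
                for (int i = 0; i < arr.Length; i++)
                {
                    int j = rng.Next(arr.Length);  // rand(0, 2)
                    int tmp = arr[j];
                    arr[j] = arr[i];
                    arr[i] = tmp;
                }
                string key = string.Join("", arr);
                counts[key] = counts.TryGetValue(key, out int c) ? c + 1 : 1;
            }

            foreach (var kvp in counts)
                Console.WriteLine(kvp.Key + ": " + kvp.Value);
        }
    }

With a fair shuffle each order would appear about 1,667 times; with the naive swap, three of the six orders consistently come out ahead.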

The question is: what is the simple explanation for why this happens?

The right solution is to use something like:

    for ($i = 0; $i < 51; $i++) {  # last card need not swap
        $j = rand($i, 51);         # don't touch the cards that already "settled"
        # swap the items
        $tmp = $arr[$j];
        $arr[$j] = $arr[$i];
        $arr[$i] = $tmp;
    }

But the question is: why would the first method, which seems just as random, produce biased results?

Update 1: thanks to everyone who pointed out that it must be rand($i, 51) for a proper shuffle.

+18
math algorithm shuffle
May 13, '09 at 17:18
11 answers

Here is the complete probability tree for these swaps.

Suppose you start with the sequence 123, and then enumerate all the different results you can get at random with this code.

    123
     +- 123          - swap 1 and 1  (these are positions,
     |   +- 213      - swap 2 and 1   not numbers)
     |   |   +- 312  - swap 3 and 1
     |   |   +- 231  - swap 3 and 2
     |   |   +- 213  - swap 3 and 3
     |   +- 123      - swap 2 and 2
     |   |   +- 321  - swap 3 and 1
     |   |   +- 132  - swap 3 and 2
     |   |   +- 123  - swap 3 and 3
     |   +- 132      - swap 2 and 3
     |       +- 231  - swap 3 and 1
     |       +- 123  - swap 3 and 2
     |       +- 132  - swap 3 and 3
     +- 213          - swap 1 and 2
     |   +- 123      - swap 2 and 1
     |   |   +- 321  - swap 3 and 1
     |   |   +- 132  - swap 3 and 2
     |   |   +- 123  - swap 3 and 3
     |   +- 213      - swap 2 and 2
     |   |   +- 312  - swap 3 and 1
     |   |   +- 231  - swap 3 and 2
     |   |   +- 213  - swap 3 and 3
     |   +- 231      - swap 2 and 3
     |       +- 132  - swap 3 and 1
     |       +- 213  - swap 3 and 2
     |       +- 231  - swap 3 and 3
     +- 321          - swap 1 and 3
         +- 231      - swap 2 and 1
         |   +- 132  - swap 3 and 1
         |   +- 213  - swap 3 and 2
         |   +- 231  - swap 3 and 3
         +- 321      - swap 2 and 2
         |   +- 123  - swap 3 and 1
         |   +- 312  - swap 3 and 2
         |   +- 321  - swap 3 and 3
         +- 312      - swap 2 and 3
             +- 213  - swap 3 and 1
             +- 321  - swap 3 and 2
             +- 312  - swap 3 and 3

Now, the fourth column of numbers - the one just before the swap information - contains the final outcome, giving 27 possible outcomes.

You can count how many times each pattern occurs:

    123 - 4 times
    132 - 5 times
    213 - 5 times
    231 - 5 times
    312 - 4 times
    321 - 4 times
    =============
       27 times total

If you run the code that swaps at random an infinite number of times, the patterns 132, 213, and 231 will occur more often than the patterns 123, 312, and 321, simply because the way the code swaps makes that more likely to occur.

Now, of course, you can say that if you run the code 30 times (27 + 3), you could end up with all the patterns occurring 5 times, but when dealing with statistics you have to look at the long-term trend.

Here's C# code that explores the randomness for one of each possible pattern:

    using System;
    using System.Collections.Generic;

    class Program
    {
        static void Main(string[] args)
        {
            Dictionary<String, Int32> occurrences = new Dictionary<String, Int32>
            {
                { "123", 0 }, { "132", 0 }, { "213", 0 },
                { "231", 0 }, { "312", 0 }, { "321", 0 }
            };
            Char[] digits = new[] { '1', '2', '3' };

            // returns a copy of the input with the two positions swapped
            Func<Char[], Int32, Int32, Char[]> swap = delegate(Char[] input, Int32 pos1, Int32 pos2)
            {
                Char[] result = new Char[] { input[0], input[1], input[2] };
                Char temp = result[pos1];
                result[pos1] = result[pos2];
                result[pos2] = temp;
                return result;
            };

            // enumerate every path the three rand() calls could take
            for (Int32 index1 = 0; index1 < 3; index1++)
            {
                Char[] level1 = swap(digits, 0, index1);
                for (Int32 index2 = 0; index2 < 3; index2++)
                {
                    Char[] level2 = swap(level1, 1, index2);
                    for (Int32 index3 = 0; index3 < 3; index3++)
                    {
                        Char[] level3 = swap(level2, 2, index3);
                        String output = new String(level3);
                        occurrences[output]++;
                    }
                }
            }

            foreach (var kvp in occurrences)
            {
                Console.Out.WriteLine(kvp.Key + ": " + kvp.Value);
            }
        }
    }

Its output:

    123: 4
    132: 5
    213: 5
    231: 5
    312: 4
    321: 4

So although this answer does involve counting, it is not a purely mathematical answer: you simply have to enumerate all the possible paths the random function can take and look at the final results.

+22
May 13, '09 at 20:18

See this:
The Danger of Naïveté (Coding Horror)

As an example, consider a deck of just three cards. With a 3-card deck there are only 6 possible orders after a shuffle: 123, 132, 213, 231, 312, 321.

With your first algorithm, there are 27 possible paths (outcomes) for the code, depending on the results of the rand() function at different points. Each of these outcomes is equally likely (unbiased). Each of these outcomes maps to one result from the list of 6 possible "real" shuffle results above. We now have 27 items and 6 buckets. Since 27 is not evenly divisible by 6, some of those 6 combinations must be over-represented.

With the second algorithm, there are 6 possible outcomes that map exactly onto the 6 possible "real" shuffle results, and they should all be represented equally over time.

This is important because the buckets that are over-represented in the first algorithm are not random: the buckets that receive the bias are repeatable and predictable. So if you build an online poker game and use the first algorithm, a hacker could figure out that you used the naive shuffle, and from that work out that certain deck arrangements are much more likely than others. They could then bet accordingly. They would lose some hands, but they would win far more than they lose, and would quickly put you out of business.

+33
May 13 '09 at 17:21

From your comments on the other answers, it seems that you are looking not just for an explanation of why the distribution is not uniform (for which the divisibility answer is a simple one), but also for an "intuitive" explanation of why it is actually far from uniform.

Here is one way to look at it. Suppose you start with the initial array [1, 2, ..., n] (where n might be 3, or 52, or whatever) and apply one of the two algorithms. If all permutations are equally likely, the probability that 1 remains in the first position should be 1/n. And indeed, in the second (correct) algorithm it is 1/n, since 1 stays in place if and only if it is not swapped away the first time, i.e. iff the initial call to rand(0,n-1) returns 0.
However, in the first (wrong) algorithm, 1 remains untouched only if it is swapped neither the first time nor at any later step - that is, only if the first rand returns 0 and none of the other rands return 0, the probability of which is (1/n) * (1 - 1/n)^(n-1) ≈ 1/(ne) ≈ 0.37/n, not 1/n.

And that is the "intuitive" explanation: in your first algorithm, earlier items are much more likely to be swapped out of place than later items, so the permutations you get are skewed toward patterns in which the early items are not in their original positions.

(It is a bit subtler than that; e.g., 1 can get swapped into a later position and still end up back in place through a complicated series of swaps, but those probabilities are relatively less significant.)
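To make the 0.37/n figure concrete, here is a hedged simulation sketch (my addition, not part of the original answer; names are mine) that estimates Pr(1 stays in the first position) under both algorithms for n = 52:

    using System;

    class FirstSlotBias
    {
        const int N = 52, Trials = 500_000;
        static Random rng = new Random();

        static double Estimate(bool naive)
        {
            int stays = 0;
            for (int t = 0; t < Trials; t++)
            {
                int[] arr = new int[N];
                for (int k = 0; k < N; k++) arr[k] = k + 1;

                for (int i = 0; i < N; i++)
                {
                    // naive: j = rand(0, N-1); correct: j = rand(i, N-1)
                    int j = naive ? rng.Next(N) : rng.Next(i, N);
                    (arr[i], arr[j]) = (arr[j], arr[i]);
                }
                if (arr[0] == 1) stays++;
            }
            return (double)stays / Trials;
        }

        static void Main()
        {
            Console.WriteLine("naive:   " + Estimate(true));   // ≈ 0.37/52 ≈ 0.007 (slightly above, per the caveat)
            Console.WriteLine("correct: " + Estimate(false));  // ≈ 1/52 ≈ 0.019
        }
    }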

+17
May 13, '09 at 20:52

The best explanation I've seen for this effect is from Jeff Atwood on his CodingHorror blog (The Danger of Naïveté).

Using this code to simulate a 3-card random shuffle ...

    for (int i = 0; i < cards.Length; i++)
    {
        int n = rand.Next(cards.Length);
        Swap(ref cards[i], ref cards[n]);
    }

... you get this distribution.

[Image: distribution of the 3-card shuffle results]

The shuffle code (see above) produces 3^3 (27) possible deck combinations. But mathematics tells us that there are really only 3!, or 6, possible arrangements of a 3-card deck. So some of the combinations are over-represented.

You need to use the Fisher-Yates shuffle to shuffle a deck of cards properly (uniformly at random).
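For reference, here is a minimal C# sketch of a Fisher-Yates shuffle, written from the standard description (the generic signature and names are mine, not from this answer):

    using System;

    static class Shuffler
    {
        // Fisher-Yates: position i is only ever swapped with positions i..n-1,
        // so cards that have already "settled" are never touched again.
        public static void Shuffle<T>(T[] cards, Random rng)
        {
            for (int i = 0; i < cards.Length - 1; i++)
            {
                int j = rng.Next(i, cards.Length);  // j in [i, n-1]
                T tmp = cards[j];
                cards[j] = cards[i];
                cards[i] = tmp;
            }
        }
    }

Shuffling a 3-card array this way and tallying the results over many runs should give each of the 6 orders with equal frequency.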

+15
May 13 '09 at 17:21

See Coding Horror's The Danger of Naïveté.

In short (with 3 cards):

The naive shuffle produces 3^3 (27) possible deck combinations. That is odd, because mathematics tells us that there are really only 3!, or 6, possible arrangements of a 3-card deck. In the KFY (Knuth-Fisher-Yates) shuffle, we start with the initial order, swap the third position with any of the three cards, and then swap the second position with one of the remaining two cards.
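Here is a small enumeration sketch (my own, in the spirit of lassevk's code above, not from this answer) that walks all 3 × 2 paths of that KFY procedure and shows that each of the 6 orders appears exactly once:

    using System;
    using System.Collections.Generic;

    class KfyEnumeration
    {
        static void Main()
        {
            var counts = new Dictionary<string, int>();

            // third position swaps with any of the 3 cards...
            for (int j2 = 0; j2 < 3; j2++)
            {
                // ...then the second position swaps with one of the remaining 2
                for (int j1 = 0; j1 < 2; j1++)
                {
                    char[] cards = { '1', '2', '3' };
                    (cards[2], cards[j2]) = (cards[j2], cards[2]);
                    (cards[1], cards[j1]) = (cards[j1], cards[1]);

                    string key = new string(cards);
                    counts[key] = counts.TryGetValue(key, out int c) ? c + 1 : 1;
                }
            }

            // 6 paths, 6 orders, each exactly once
            foreach (var kvp in counts)
                Console.WriteLine(kvp.Key + ": " + kvp.Value);
        }
    }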

+2
May 13, '09 at 17:34

Here's another intuition: a single random swap cannot create symmetry in the probability of occupying a position unless at least two-way symmetry already exists. Call the three positions A, B, and C. Now let a be the probability that card 2 is in position A, b the probability that card 2 is in position B, and c the probability that it is in position C, prior to a swap. Assume that no two of these probabilities are equal: a != b, b != c, c != a. Now compute the probabilities a', b', and c' of the card being in these three positions after a swap. Say this swap consists of position C being exchanged with one of the three positions chosen at random. Then:

    a' = a*2/3 + c*1/3
    b' = b*2/3 + c*1/3
    c' = 1/3

That is, the probability that the card ends up in position A is the probability that it was already there, times the 2/3 chance that position A is not involved in the swap, plus the probability that it was in position C, times the 1/3 chance that C swaps with A, and so on. Now, subtracting the first two equations, we get:

 a' - b' = (a - b)*2/3 

which means that since we assumed a != b, then a' != b' (although the difference will approach 0 over time, given enough swaps). But since a' + b' + c' = 1, if a' != b', then neither of them can equal c', which is 1/3. So if the three probabilities start out all different before a swap, they will all be different after the swap as well. And this holds no matter which position is swapped - we just exchange the roles of the variables above.

Now, the very first swap started out by exchanging card 1 in position A with one of the others. In this case, there was two-way symmetry before the swap, because the probability of card 1 being in position B = the probability of card 1 being in position C = 0. So card 1 can indeed end up with symmetric probabilities, and it winds up in each of the three positions with equal probability. This remains true for all subsequent swaps. But card 2 winds up in the three positions after the first swap with probabilities (1/3, 2/3, 0), and likewise card 3 winds up in the three positions with probabilities (1/3, 0, 2/3). So no matter how many further swaps we do, we will never end up with card 2 or 3 having exactly the same probability of occupying all three positions.
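As a numeric illustration of this argument (my addition, under the simplifying assumption that the swapped position just cycles through the three positions), iterating the update equations above from card 2's starting distribution (1/3, 2/3, 0) shows the probabilities converging toward 1/3 while never all becoming equal:

    using System;

    class SymmetryDemo
    {
        static void Main()
        {
            // card 2's probabilities over the three positions after the first swap
            double[] p = { 1.0 / 3, 2.0 / 3, 0 };

            for (int step = 0; step < 12; step++)
            {
                int swapped = step % 3;  // which position gets swapped this round
                double moved = p[swapped];
                for (int q = 0; q < 3; q++)
                    p[q] = (q == swapped) ? 1.0 / 3 : p[q] * 2.0 / 3 + moved / 3;

                Console.WriteLine($"after swapping position {swapped}: " +
                    $"a={p[0]:F5}  b={p[1]:F5}  c={p[2]:F5}");
            }
            // the probabilities approach 1/3 each, but the two positions not
            // involved in the latest swap always remain (slightly) unequal
        }
    }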

+2
May 13 '09 at 20:48

The simple answer is that there are 52^52 possible ways for this algorithm to run, but there are only 52! possible arrangements of 52 cards. For the algorithm to be unbiased, it must produce each of these arrangements with equal probability. 52^52 is not an integer multiple of 52!. Therefore, some arrangements must be more likely than others.
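A quick sanity check of that divisibility claim (my addition, using System.Numerics): 52! contains a factor of 3, while 52^52 = 2^104 * 13^52 does not, so the remainder cannot be zero.

    using System;
    using System.Numerics;

    class DivisibilityCheck
    {
        static void Main()
        {
            BigInteger paths = BigInteger.Pow(52, 52);   // 52^52 equally likely runs
            BigInteger arrangements = 1;                 // 52! target arrangements
            for (int k = 2; k <= 52; k++) arrangements *= k;

            // nonzero remainder => some arrangements must occur more often than others
            Console.WriteLine(paths % arrangements != 0);  // True
        }
    }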

+1
May 15 '09 at 1:40

An illustrative approach might be this:

1) Consider only 3 cards.

2) For the algorithm to give uniformly distributed results, the chance of "1" ending up at arr[0] must be 1/3, the chance of "2" ending up at arr[1] must be 1/3, and so on.

3) So looking at the second algorithm:

the chance of "1" ending up at arr[0]: this happens when the first random number generated is 0, i.e. 1 case out of (0, 1, 2), so 1 of 3 = 1/3

the chance of "2" ending up at arr[1]: it must not be swapped away to arr[0] the first time, and not swapped to arr[2] the second time: 2/3 * 1/2 = 1/3

the chance of "3" ending up at arr[2]: it must not be swapped to arr[0] the first time, and not swapped to arr[1] the second time: 2/3 * 1/2 = 1/3

They are all a perfect 1/3, and we see no bias.

4) If we try to calculate the probability of "1" ending up at arr[0] in the first algorithm, the calculation is a bit long, but as the illustration in lassevk's answer shows, it is 9/27 = 1/3; however, "2" ending up at arr[1] has a chance of 8/27, and "3" ending up at arr[2] has a chance of 9/27 = 1/3.

As a result, "2" ending up at arr[1] does not have probability 1/3, so the algorithm produces a noticeably skewed result (the error is about 3.7%, not some borderline case like 3/10000000000000).

5) The proof Joel Coehoorn gave can indeed show that some cases are over-represented. I think the explanation of why it is n^n is this: at each iteration there are n possible values the random number can take, so after n iterations there are n^n possible paths = 27. This number does not divide the number of permutations (n! = 3! = 6) evenly in the case n = 3, so some results are over-represented. They are over-represented in such a way that instead of appearing 4 times, they appear 5 times; so if you shuffle the cards millions of times starting from the initial order of 1 to 52, the over-represented cases will show up 5 million times versus 4 million times, which is quite a big difference.

6) I think this shows that the over-representation occurs, but "why" does the over-representation happen?

7) The ultimate test for the correctness of the algorithm is that every number has a probability of 1/n of ending up in each slot; a sketch of such a test follows.
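Here is a sketch of that final test (my own harness, assuming the corrected rand($i, n-1) style swap): shuffle many times and check that every card lands in every slot about 1/n of the time.

    using System;

    class UniformityTest
    {
        static void Main()
        {
            const int N = 3, Runs = 600_000;
            var rng = new Random();
            var hits = new int[N, N];  // hits[card, slot]

            for (int r = 0; r < Runs; r++)
            {
                int[] arr = { 1, 2, 3 };
                for (int i = 0; i < N - 1; i++)
                {
                    int j = rng.Next(i, N);  // rand(i, N-1): the corrected swap
                    (arr[i], arr[j]) = (arr[j], arr[i]);
                }
                for (int slot = 0; slot < N; slot++)
                    hits[arr[slot] - 1, slot]++;
            }

            // every entry should be close to Runs / N (here, 200,000)
            for (int card = 0; card < N; card++)
            {
                for (int slot = 0; slot < N; slot++)
                    Console.Write($"{hits[card, slot],8}");
                Console.WriteLine();
            }
        }
    }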

+1
May 18 '09 at 9:12 a.m.

Here's an excellent Markov-chain analysis of card shuffling. Oh wait, that's all math. Sorry. :)

0
May 13, '09 at 23:50

The naive algorithm selects the values of n like this:

    n = rand(3)
    n = rand(3)
    n = rand(3)

giving 3^3 possible combinations of n:

1,1,1  1,1,2 .... 3,3,2  3,3,3 (27 combinations). lassevk's answer shows the distribution of these combinations among the resulting card orders.

The better algorithm does:

    n = rand(3)
    n = rand(2)

giving n! possible combinations of n:

1,1  1,2  2,1  2,2  3,1  3,2 (6 combinations, each of which gives a different result)

As mentioned in other answers, if you take 27 attempts to produce 6 results, you cannot achieve the 6 results with uniform distribution, since 27 is not divisible by 6. Put 27 marbles into 6 buckets and no matter what you do, some buckets will have more marbles than others; the best you can do is 4, 4, 4, 5, 5, 5 marbles for buckets 1 through 6.

The bigger problem with the naive shuffle is that it swaps too many times. To shuffle 3 cards completely, you need only 2 swaps, and the second swap need only be among the first two cards, since the 3rd card already had a 1/3 chance of being swapped. Continuing to swap cards gives further chances that a given card will be swapped again, and those chances only even out to 1/3, 1/3, 1/3 if your total number of swap combinations is divisible by 6.

0
May 18 '09 at 4:43 pm

Not that another answer is needed, but I found it worthwhile to try to pin down exactly why Fisher-Yates is uniform.

If we are talking about a deck with N items, the question is: how can we show that

 Pr(Item i ends up in slot j) = 1/N? 

Breaking it down with conditional probabilities, Pr(item i ends up at slot j) is equal to

 Pr(item i ends up at slot j | item i was not chosen in the first j-1 draws) * Pr(item i was not chosen in the first j-1 draws). 

and from there it expands recursively back to the first draw.

Now, the probability that item i was not drawn in the first draw is (N-1)/N. And the probability that it was not drawn in the second draw, given that it was not drawn in the first, is (N-2)/(N-1), and so on.

So the probability that item i was not drawn in the first j-1 draws is:

    ((N-1)/N) * ((N-2)/(N-1)) * ... * ((N-j)/(N-j+1))

and of course we know that the probability that it is drawn in round j, given that it was not drawn earlier, is just 1/(N-j).

Note that in the first term the numerators all cancel the subsequent denominators (i.e., N-1 cancels, N-2 cancels, all the way down to N-j+1, leaving just (N-j)/N).

Thus, the overall probability of item i ending up in slot j is:

    [((N-1)/N) * ((N-2)/(N-1)) * ... * ((N-j)/(N-j+1))] * (1/(N-j)) = 1/N

as expected.
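As a quick numeric sanity check of that telescoping (my addition, not part of this answer), the product can be evaluated for N = 52 and every slot j; every line should print 1/52:

    using System;

    class TelescopeCheck
    {
        static void Main()
        {
            const int N = 52;
            for (int j = 0; j < N; j++)  // 0-indexed slots, matching the (N-j)/(N-j+1) terms
            {
                double notChosenYet = 1.0;
                for (int d = 1; d <= j; d++)
                    notChosenYet *= (double)(N - d) / (N - d + 1);  // survive draw d

                double total = notChosenYet * (1.0 / (N - j));      // then chosen at this slot's draw
                Console.WriteLine($"slot {j}: {total:F10}");         // always 1/52 ≈ 0.0192307692
            }
        }
    }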

To take a broader view of the "naive shuffle": a specific property that it lacks is called exchangeability. Because of the "path dependence" in the way the shuffle is built (i.e., which of the 27 paths is followed to produce the output), you cannot treat the per-position random variables as though they could appear in any order. In fact, this is perhaps a motivating example for why exchangeability matters in random sampling.

0
Oct 31 '14 at 13:26


