How does random shuffling in quicksort help improve code efficiency?

I have been watching Robert Sedgewick's lectures on algorithms, and he explains that randomly shuffling the input ensures we don't run into the worst-case quadratic scenario in quicksort. But I can't figure out how that works.

+8
Tags: sorting, algorithm, shuffle
5 answers

This is really an acknowledgement that, although we often talk about average-case complexity, we don't in practice expect every case to turn up with the same frequency.

Sorting an already-sorted array is the worst case in quicksort, because whenever you pick a pivot you find that all the elements end up on the same side of it, so you don't split into two roughly equal halves. And in practice, this already-sorted case turns up far more often than many other cases.

Shuffling the data randomly is a quick way to ensure that all cases really do occur with equal probability, and therefore that this worst case is as rare as any other case.

It is worth noting that there are other strategies that deal well with already-sorted data, such as choosing the middle element as the pivot.
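
As a concrete illustration, here is a minimal sketch in Java of the shuffle-then-sort idea (class and method names are my own, not from Sedgewick's Quick.java): a Fisher-Yates shuffle makes every permutation of the input equally likely, after which even a plain first-element-pivot quicksort runs in expected O(n log n), including on input that arrived already sorted.

    import java.util.Random;

    public class ShuffledQuicksort {
        private static final Random RNG = new Random();

        // Fisher-Yates shuffle: every permutation of a[] is equally likely.
        static void shuffle(int[] a) {
            for (int i = a.length - 1; i > 0; i--) {
                int j = RNG.nextInt(i + 1);          // uniform index in [0, i]
                int t = a[i]; a[i] = a[j]; a[j] = t;
            }
        }

        static void sort(int[] a) {
            shuffle(a);                              // defuse adversarial orderings
            quicksort(a, 0, a.length - 1);
        }

        static void quicksort(int[] a, int lo, int hi) {
            if (lo >= hi) return;
            int p = partition(a, lo, hi);
            quicksort(a, lo, p - 1);
            quicksort(a, p + 1, hi);
        }

        // Two-way partition around a[lo]. Without the shuffle above, sorted
        // input makes this produce 1/(n-1) splits and quadratic running time.
        static int partition(int[] a, int lo, int hi) {
            int pivot = a[lo];
            int i = lo, j = hi + 1;
            while (true) {
                while (a[++i] < pivot) if (i == hi) break;
                while (pivot < a[--j]) if (j == lo) break;
                if (i >= j) break;
                int t = a[i]; a[i] = a[j]; a[j] = t;
            }
            int t = a[lo]; a[lo] = a[j]; a[j] = t;   // move pivot into place
            return j;
        }
    }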

+11

The assumption is that the worst case (everything already sorted) is frequent enough to be worth worrying about, and shuffling is a least-effort piece of magic that avoids that case without admitting that, by improving that case, you are moving the problem onto some other input that happens to get shuffled into sorted order. The hope is that this bad case is far less common, and even when it does occur, the randomness means the problem cannot easily be reproduced and blamed on this cheat.

The concept of improving the common case at the expense of a rare one is fine. Randomness as an alternative to actually thinking about which cases will be more or less common is somewhat sloppy, though.

+3

In the case of randomized quicksort, since the pivot element is selected randomly, we can expect the split of the input array to be reasonably balanced on average, as opposed to the 1 and (n-1) splits that the non-randomized version of the algorithm can keep producing. This helps prevent the worst-case behavior of quicksort, which occurs on unbalanced partitions.

Therefore, the expected running time of the randomized version of quicksort is O(n log n), not O(n^2).
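
A sketch of that variant, on the assumption that per-call random pivot selection (rather than a one-time shuffle) is what is meant; the names here are illustrative. Swapping a uniformly random element into the pivot slot before each partition step gives the same distribution of split sizes as shuffling the whole array up front.

    import java.util.Random;

    public class RandomizedQuicksort {
        private static final Random RNG = new Random();

        static void sort(int[] a) {
            sort(a, 0, a.length - 1);
        }

        static void sort(int[] a, int lo, int hi) {
            if (lo >= hi) return;
            // Choose the pivot uniformly at random from a[lo..hi] and move it
            // to the front, so no fixed input can force unbalanced splits.
            int r = lo + RNG.nextInt(hi - lo + 1);
            int t = a[lo]; a[lo] = a[r]; a[r] = t;
            int p = partition(a, lo, hi);
            sort(a, lo, p - 1);
            sort(a, p + 1, hi);
        }

        // Lomuto-style partition around a[lo]; returns the pivot's final index.
        static int partition(int[] a, int lo, int hi) {
            int pivot = a[lo], i = lo;
            for (int j = lo + 1; j <= hi; j++) {
                if (a[j] < pivot) {
                    i++;
                    int t = a[i]; a[i] = a[j]; a[j] = t;
                }
            }
            int t = a[lo]; a[lo] = a[i]; a[i] = t;   // pivot into its final slot
            return i;
        }
    }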

+3

What does random shuffling do to the distribution over the input space? To understand this, consider a probability distribution P over a set S, where P is not under our control. Create a new probability distribution P' by applying a random shuffle on top of P: in other words, every time we draw a sample from P, we map it uniformly at random to an element of S. What can we say about the resulting distribution P'?

    P'(x) = \sum_{s \in S} P(s) \cdot \frac{1}{|S|} = \frac{1}{|S|}

Thus, P' is the uniform distribution over S. Random shuffling gives us control over the input probability distribution.

How does this relate to quicksort? Well, we know the average-case complexity of quicksort. It is computed with respect to the uniform probability distribution over inputs, and that is the property we want to maintain on our input distribution, regardless of what the true distribution is. To achieve this, we randomly shuffle our input array, ensuring that the distribution is not adversarial.
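
Spelled out as a worked equation (my notation, building on the derivation above): if T(\pi) denotes quicksort's cost on a permutation \pi of n distinct keys, shuffling makes every \pi equally likely, so the expected cost is just the plain average over all n! permutations, which is exactly the classical average-case bound:

    \mathbb{E}[T] = \sum_{\pi} P'(\pi) \, T(\pi)
                  = \frac{1}{n!} \sum_{\pi} T(\pi)
                  = O(n \log n)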

0

Is this the video on coursera? Unfortunately, shuffling fails to prevent O(N^2) performance on data of the form n, n, ..., n, 1, 1, ..., 1. I checked Quick.java with an nn11.awk script that generates such data.

    $ for N in 10000 20000 30000 40000; do time ./nn11.awk $N | java Quick; done | awk 'NF>1'

    real    0m10.732s
    user    0m10.295s
    sys     0m0.948s

    real    0m48.057s
    user    0m44.968s
    sys     0m3.193s

    real    1m52.109s
    user    1m48.158s
    sys     0m3.634s

    real    3m38.336s
    user    3m31.475s
    sys     0m6.253s
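
For reference, here is a hypothetical stand-in for nn11.awk (the script itself is not shown, so this is only an assumption about what it emits: N copies of N followed by N copies of 1, one value per line):

    // Hypothetical reconstruction of nn11.awk's output: N copies of N,
    // then N copies of 1, one per line, suitable for piping into a sort.
    public class NN11 {
        public static void main(String[] args) {
            int n = Integer.parseInt(args[0]);
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < n; i++) sb.append(n).append('\n');
            for (int i = 0; i < n; i++) sb.append(1).append('\n');
            System.out.print(sb);
        }
    }

Note that each doubling of N roughly quadruples the time in the measurements above, the signature of quadratic growth. Shuffling cannot help here because the trouble is the mass of duplicate keys, not their order: every permutation of this input looks the same to the partitioning code. The usual remedy for duplicate-heavy input is three-way (fat-pivot) partitioning rather than shuffling.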
-1
