Worst case of Quicksort algorithm

I found many implementations of the quicksort algorithm, but in the end I decided to stick with this:

public static void quickSort(int[] array, int start, int end) {
    if (end <= start) {
        return;                          // zero or one element: nothing to sort
    }
    int pivot = array[start];            // first element is the pivot
    int i = start + 1;                   // boundary of the "less than pivot" region
    for (int j = start + 1; j <= end; j++) {
        if (pivot > array[j]) {          // move smaller elements into the left region
            int temp = array[j];
            array[j] = array[i];
            array[i] = temp;
            i++;
        }
    }
    array[start] = array[i - 1];         // drop the pivot between the two regions
    array[i - 1] = pivot;
    quickSort(array, start, i - 2);      // sort the elements smaller than the pivot
    quickSort(array, i, end);            // sort the elements larger than the pivot
}

There are a few things that I'm confused about.
Why do some people suggest choosing the first element as the pivot, others the middle element, and still others the last element? Wouldn't the behavior be different in each case?
Let's say I'm trying to show why quicksort has O(n^2) as its worst-case growth order when the array is already sorted.
I have the following array:
{1, 2, 3, 4, 5, 6}.
If I select the first element as my pivot, won't it just compare the pivot with all the other elements, swap it with itself, and so take only O(n)? Then it continues with these two recursive calls, which I assumed account for the O(log n) factor:

 quickSort(array, start, i-2);
 quickSort(array, i, end);

So in the end, even for an already-sorted list of integers, wouldn't it still be O(n log n)?
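To see what actually happens, I tried adding a print at the very top of quickSort (for illustration only):

 // added as the first line of quickSort, before the end <= start check
 System.out.println("quickSort range: [" + start + ".." + end + "]");

For {1, 2, 3, 4, 5, 6} with quickSort(array, 0, 5) this prints:

 [0..5], [0..-1], [1..5], [1..0], [2..5], [2..1], [3..5], [3..2], [4..5], [4..3], [5..5]

so one of the two ranges is always empty and the other shrinks by a single element each time.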

If I decided to pick the last element as my pivot instead, would it be completely different? It would swap 6 and 1, and would therefore perform completely different operations compared to when the pivot is the first element of the array.

I just don't understand why the worst case is O(n^2).

Any help would be greatly appreciated!

sorting arrays algorithm time-complexity quicksort
3 answers

The whole point of quicksort is to find a pivot that splits the array into two approximately equal parts. That is where the log(n) comes from.

Suppose there is an array of size n, and at each step you manage to split the array into two equal parts. Then we have:

 T(n) = 2 * T(n/2) + O(n)
      = 4 * T(n/4) + 2 * O(n)
      ...
      (log(n) steps)
      ...
      = 2^log(n) * T(1) + log(n) * O(n)
      = n * O(1) + O(n * log(n))
      = O(n * log(n))

Now, if each split instead divides the array into parts of size 1 and n-1, we get:

 T(n) = T(1) + T(n-1) + O(n)
      = T(n-1) + O(n)
      = T(n-2) + O(n-1) + O(n)
      = T(n-3) + O(n-2) + O(n-1) + O(n)
      ...
      ((n-1) steps)
      ...
      = T(1) + O(2) + O(3) + ... + O(n)
      = O(1 + 2 + 3 + ... + n)
      = O(n^2)
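To get a feel for the gap between the two recurrences, here is a small sketch that evaluates both numerically, taking T(1) = 1 and the O(n) term as exactly n (class and method names are mine):

 // Sketch: compare the balanced recurrence T(n) = 2*T(n/2) + n with the
 // unbalanced one T(n) = T(n-1) + n, taking T(1) = 1 in both cases.
 public class RecurrenceDemo {
     static long balanced(long n) {
         return n <= 1 ? 1 : 2 * balanced(n / 2) + n;
     }

     static long unbalanced(long n) {
         long t = 1;                      // T(1)
         for (long k = 2; k <= n; k++) {
             t += k;                      // T(k) = T(k-1) + k
         }
         return t;
     }

     public static void main(String[] args) {
         for (long n : new long[] {1_000, 10_000, 100_000}) {
             // balanced grows like n*log(n); unbalanced like n^2/2
             System.out.println("n=" + n
                     + " balanced=" + balanced(n)
                     + " unbalanced=" + unbalanced(n));
         }
     }
 }

The unbalanced recurrence is unrolled into a loop because a recursive version would need an n-deep call stack, which is exactly the stack-depth problem a degenerate quicksort has.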

In the case you mention, the two recursive calls below will not each be O(log(n)). If the array is sorted, one of them is O(1) and the other is T(n-1). That is why you get O(n^2) complexity.

 quickSort(array, start, i-2); // constant time (the range is empty)
 quickSort(array, i, end);     // T(n-1)

And as @MarkRansom mentions below, this is not specific to sorted arrays. In general, whenever you choose pivots that divide the array very unevenly, you run into this kind of worst-case complexity. For example, even if the array is not sorted, if you always happen to pick the maximum (or minimum, depending on your implementation) as the pivot, you will hit the same problem.
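You can see this directly by instrumenting the implementation from the question with a comparison counter (a sketch; the counter and the class around it are added here purely for measurement):

 // Sketch: count comparisons made by the question's quickSort on sorted
 // versus shuffled input. The static counter is for illustration only.
 import java.util.Random;

 public class ComparisonCount {
     static long comparisons = 0;

     static void quickSort(int[] array, int start, int end) {
         if (end <= start) return;
         int pivot = array[start];
         int i = start + 1;
         for (int j = start + 1; j <= end; j++) {
             comparisons++;
             if (pivot > array[j]) {
                 int temp = array[j]; array[j] = array[i]; array[i] = temp;
                 i++;
             }
         }
         array[start] = array[i - 1];
         array[i - 1] = pivot;
         quickSort(array, start, i - 2);
         quickSort(array, i, end);
     }

     static long countFor(int[] a) {
         comparisons = 0;
         quickSort(a, 0, a.length - 1);
         return comparisons;
     }

     public static void main(String[] args) {
         int n = 2_000;
         int[] sorted = new int[n];
         for (int k = 0; k < n; k++) sorted[k] = k;

         int[] shuffled = sorted.clone();
         Random rnd = new Random(42);
         for (int k = n - 1; k > 0; k--) {        // Fisher-Yates shuffle
             int r = rnd.nextInt(k + 1);
             int t = shuffled[k]; shuffled[k] = shuffled[r]; shuffled[r] = t;
         }

         // sorted input: about n*(n-1)/2 comparisons; shuffled: about n*log2(n)
         System.out.println("sorted:   " + countFor(sorted));
         System.out.println("shuffled: " + countFor(shuffled));
     }
 }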


QuickSort works by moving everything with a value higher than the pivot toward the end of the list, and everything with a lower value toward the start of the list.

If the value at your pivot position is the smallest value in the list, then every other value in the list gets moved to the "after the pivot" side. Determining where to move all of those values takes O(n) work. If you then select the second-lowest value as the next pivot, then the third-lowest, and so on, you end up doing O(n) work n times, with each pass averaging n/2 operations; O(n²/2) simplifies to O(n²).
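As a concrete sketch of one such pass (using the Lomuto scheme with a last-element pivot; one comparison per element, so O(n) per pass):

 // Sketch: one partition pass (Lomuto scheme). Everything smaller than the
 // pivot ends up to its left, everything larger to its right, in O(n).
 static int partition(int[] a, int lo, int hi) {
     int pivot = a[hi];                  // last element as the pivot
     int i = lo;                         // boundary of the "less than pivot" region
     for (int j = lo; j < hi; j++) {
         if (a[j] < pivot) {
             int t = a[i]; a[i] = a[j]; a[j] = t;
             i++;
         }
     }
     int t = a[i]; a[i] = a[hi]; a[hi] = t;  // drop the pivot between the regions
     return i;                           // final pivot position
 }

 // If the pivot happens to be the smallest value, partition returns lo and
 // the remaining part has n-1 elements; repeating that gives O(n) work n times.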

Why do some people suggest choosing the first element as the pivot, others the middle element, and still others the last element? Wouldn't the behavior be different in each case?

The idea is to try to guess (without scanning the entire list) which element is most likely to be close to the median of your data set, thereby bringing the sort as close as possible to its best-case behavior.

  • If the data is completely random, it doesn't matter much which position you choose: you're equally likely to get a good pivot, and the chance of repeatedly choosing the worst pivot is very small. Picking the first or last value is the simplest option that works.
  • If your data is pre-sorted (or nearly so), picking the middle element will reliably give you one of the best values, while picking the first or last element will consistently give you the worst possible pivot.

In real life, the likelihood of dealing with data that is mostly sorted is high enough that it is probably worth the slightly more complex code. The Wikipedia section on pivot choice is useful here.
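For instance, a middle-leaning pivot choice can be as simple as a median-of-three step (a sketch; the helper name is illustrative):

 // Sketch: pick the median of the first, middle and last elements as the
 // pivot. On sorted input this yields the true median; on random input it
 // avoids the very worst pivots.
 static int medianOfThreeIndex(int[] a, int lo, int hi) {
     int mid = lo + (hi - lo) / 2;
     // order the three sampled values, then take the middle one
     if (a[mid] < a[lo]) { int t = a[mid]; a[mid] = a[lo]; a[lo] = t; }
     if (a[hi] < a[lo])  { int t = a[hi];  a[hi] = a[lo];  a[lo] = t; }
     if (a[hi] < a[mid]) { int t = a[hi];  a[hi] = a[mid]; a[mid] = t; }
     return mid;                         // a[mid] now holds the median of the three
 }

On the sorted array {1, 2, 3, 4, 5, 6} this selects 3 instead of 1 or 6, so the split stays nearly balanced.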


The following is a working quicksort that uses median-of-3 pivot selection and a variation of the Hoare partition scheme that excludes middle elements equal to the pivot (they are already sorted into place), and that limits stack usage to O(log(n)) by recursing only into the smaller part and looping back for the larger part. The worst-case time complexity is still O(n^2), but that would require median-of-3 to repeatedly select small or large values. The best case of O(n) occurs when all values are the same (thanks to the exclusion of the middle values equal to the pivot).

The time complexity can be bounded to O(n log(n)) by using median of medians, but the overhead for that makes the average case much slower (I wonder if it would end up slower than heap sort; with median of medians it is definitely slower than merge sort, but merge sort requires a second array the same size as, or half the size of, the original array).

http://en.wikipedia.org/wiki/Median_of_medians

Introsort solves the worst-case time complexity problem by switching to heap sort once the recursion depth gets too large.

http://en.wikipedia.org/wiki/Introsort

 #include <algorithm>  // std::swap
 #include <cstddef>    // size_t
 #include <cstdint>    // uint32_t

 void QuickSort(uint32_t a[], size_t lo, size_t hi)
 {
     while (lo < hi) {
         size_t i = lo, j = (lo + hi) / 2, k = hi;
         uint32_t p;
         if (a[k] < a[i])                // median of 3
             std::swap(a[k], a[i]);
         if (a[j] < a[i])
             std::swap(a[j], a[i]);
         if (a[k] < a[j])
             std::swap(a[k], a[j]);
         p = a[j];
         i--;                            // Hoare partition
         k++;
         while (1) {
             while (a[++i] < p);
             while (a[--k] > p);
             if (i >= k)
                 break;
             std::swap(a[i], a[k]);
         }
         i = k++;
         while (i > lo && a[i] == p)     // exclude middle values == pivot
             i--;
         while (k < hi && a[k] == p)
             k++;
         // recurse on smaller part, loop on larger part
         if ((i - lo) <= (hi - k)) {
             QuickSort(a, lo, i);
             lo = k;
         } else {
             QuickSort(a, k, hi);
             hi = i;
         }
     }
 }
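For reference, a minimal Java sketch of the introsort idea mentioned above (the names and the depth limit of 2*floor(log2(n)) are illustrative, not taken from any particular library):

 // Sketch: run quicksort, but once the recursion depth exceeds ~2*log2(n),
 // fall back to heap sort so the worst case stays O(n log(n)).
 public class IntroSortSketch {
     public static void sort(int[] a) {
         if (a.length < 2) return;
         int depthLimit = 2 * (31 - Integer.numberOfLeadingZeros(a.length));
         quickSort(a, 0, a.length - 1, depthLimit);
     }

     static void quickSort(int[] a, int lo, int hi, int depthLimit) {
         if (lo >= hi) return;
         if (depthLimit == 0) {            // splits have been too unbalanced
             heapSort(a, lo, hi);
             return;
         }
         int p = partition(a, lo, hi);     // Lomuto partition, last-element pivot
         quickSort(a, lo, p - 1, depthLimit - 1);
         quickSort(a, p + 1, hi, depthLimit - 1);
     }

     static int partition(int[] a, int lo, int hi) {
         int pivot = a[hi];
         int i = lo;
         for (int j = lo; j < hi; j++) {
             if (a[j] < pivot) { int t = a[i]; a[i] = a[j]; a[j] = t; i++; }
         }
         int t = a[i]; a[i] = a[hi]; a[hi] = t;
         return i;
     }

     // In-place heap sort of a[lo..hi], used only as the fallback.
     static void heapSort(int[] a, int lo, int hi) {
         int n = hi - lo + 1;
         for (int i = n / 2 - 1; i >= 0; i--) siftDown(a, lo, i, n);
         for (int end = n - 1; end > 0; end--) {
             int t = a[lo]; a[lo] = a[lo + end]; a[lo + end] = t;
             siftDown(a, lo, 0, end);
         }
     }

     static void siftDown(int[] a, int lo, int i, int n) {
         while (true) {
             int l = 2 * i + 1, r = l + 1, largest = i;
             if (l < n && a[lo + l] > a[lo + largest]) largest = l;
             if (r < n && a[lo + r] > a[lo + largest]) largest = r;
             if (largest == i) return;
             int t = a[lo + i]; a[lo + i] = a[lo + largest]; a[lo + largest] = t;
             i = largest;
         }
     }
 }

The depth limit means the heap-sort fallback only triggers after the splits have been badly unbalanced for a while, so typical inputs still get plain quicksort behavior.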
