Quicksort vs heapsort

Both quicksort and heapsort sort in place. Which is better, and in which applications and cases is each one preferred?

+72
sorting algorithm quicksort heapsort
Mar 18 '10 at 5:45
11 answers

This article has some analysis.

Also from Wikipedia:

The most direct competitor of quicksort is heapsort. Heapsort is typically somewhat slower than quicksort, but the worst-case running time is always Θ(n log n). Quicksort is usually faster, though there remains the chance of worst-case performance, except in the introsort variant, which switches to heapsort when a bad case is detected. If it is known in advance that heapsort is going to be necessary, using it directly will be faster than waiting for introsort to switch to it.
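The introsort idea the quote describes can be sketched in a few lines. This is a hedged illustration (in Python rather than this thread's C#): quicksort with a depth budget of about 2·log2(n) — a common choice, not something stated above — that falls back to heapsort (via `heapq`) when the budget runs out. It works on out-of-place lists for brevity, unlike real in-place introsort.

```python
import heapq
import math
import random

def introsort(a, depth_limit=None):
    """Sketch of introsort: quicksort until the recursion depth budget
    is exhausted, then fall back to heapsort for the current sub-list."""
    if depth_limit is None:
        depth_limit = 2 * max(1, int(math.log2(max(len(a), 2))))
    if len(a) <= 1:
        return a
    if depth_limit == 0:
        # bad case detected: switch to heapsort (guaranteed O(n log n))
        heapq.heapify(a)
        return [heapq.heappop(a) for _ in range(len(a))]
    pivot = a[len(a) // 2]
    less = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    return (introsort(less, depth_limit - 1) + equal +
            introsort(greater, depth_limit - 1))

random.seed(0)
data = [random.randint(0, 999) for _ in range(500)]
assert introsort(data[:]) == sorted(data)
```

Real introsort works in place and only hands the offending sub-range to heapsort; the sketch switches the whole current sub-list.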

+45
Mar 18

Heapsort is guaranteed O(N log N), which is far better than the worst case of Quicksort. Heapsort also doesn't need extra memory for another array to hold the ordered data, as Mergesort does. So why do commercial applications stick with Quicksort? What does Quicksort have that outshines the other implementations?

I tested the algorithms myself and saw that Quicksort really does have something special. It runs fast, much faster than the Heap and Merge algorithms.

The secret of Quicksort is this: it almost never performs unnecessary element swaps. Swaps are time-consuming.

With Heapsort, even if all of your data is already ordered, you are going to swap 100% of the elements to order the array.

With Mergesort it's even worse. You are going to write 100% of the elements into another array and write them back into the original one, even if the data is already ordered.

With Quicksort, you do not swap what is already ordered. If your data is completely ordered, you swap almost nothing! Although there is much fuss about the worst case, a little improvement in the choice of pivot, anything other than taking the first or last element of the array, can avoid it. If you take the pivot as the intermediate value among the first, last, and middle elements, that is enough to avoid the worst case.

What's superior about Quicksort is not the worst case, but the best case! In the best case you do the same number of comparisons, okay, but you swap almost nothing. In the average case you swap some of the elements, but not all of them, as in Heapsort and Mergesort. That's what gives Quicksort the best time. Less swapping, more speed.
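The swap-counting claim is easy to check empirically. Below is a small instrumented experiment (a Python sketch, not the author's C# benchmark): a Hoare-style quicksort with a middle pivot, and a textbook in-place heapsort, each reporting how many swaps it performed on already-ordered input.

```python
def quicksort_swaps(a, lo=0, hi=None):
    """Hoare-style partition with middle-element pivot; returns swap count."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return 0
    pivot, i, j, swaps = a[(lo + hi) // 2], lo, hi, 0
    while i <= j:
        while a[i] < pivot: i += 1
        while a[j] > pivot: j -= 1
        if i <= j:
            if i != j:
                a[i], a[j] = a[j], a[i]
                swaps += 1
            i += 1
            j -= 1
    return swaps + quicksort_swaps(a, lo, j) + quicksort_swaps(a, i, hi)

def heapsort_swaps(a):
    """Textbook in-place heapsort; returns swap count."""
    n, swaps = len(a), 0
    def sift(root, end):
        nonlocal swaps
        while 2 * root + 1 < end:
            child = 2 * root + 1
            if child + 1 < end and a[child + 1] > a[child]:
                child += 1
            if a[root] >= a[child]:
                return
            a[root], a[child] = a[child], a[root]
            swaps += 1
            root = child
    for start in range(n // 2 - 1, -1, -1):
        sift(start, n)
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        swaps += 1
        sift(0, end)
    return swaps

print("quicksort swaps:", quicksort_swaps(list(range(10000))))  # 0: nothing to swap
print("heapsort swaps:", heapsort_swaps(list(range(10000))))    # many, even on ordered data
```

With the middle pivot, each partition of ordered data meets in the middle without moving anything, so the quicksort swap count is exactly zero, while heapsort still shuffles the whole array through the heap.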

The following implementation in C # on my computer running in release mode is superior to Array.Sort in 3 seconds with an average rotation and 2 seconds with an improved rotation (yes, there are costs to getting a good rotation).

```csharp
static void Main(string[] args)
{
    int[] arrToSort = new int[100000000];
    var r = new Random();
    for (int i = 0; i < arrToSort.Length; i++)
        arrToSort[i] = r.Next(1, arrToSort.Length);

    Console.WriteLine("Press q to quick sort, s to Array.Sort");
    while (true)
    {
        var k = Console.ReadKey(true);
        if (k.KeyChar == 'q')
        {
            // quick sort
            Console.WriteLine("Beg quick sort at " + DateTime.Now.ToString("HH:mm:ss.ffffff"));
            QuickSort(arrToSort, 0, arrToSort.Length - 1);
            Console.WriteLine("End quick sort at " + DateTime.Now.ToString("HH:mm:ss.ffffff"));
            for (int i = 0; i < arrToSort.Length; i++)
                arrToSort[i] = r.Next(1, arrToSort.Length);
        }
        else if (k.KeyChar == 's')
        {
            Console.WriteLine("Beg Array.Sort at " + DateTime.Now.ToString("HH:mm:ss.ffffff"));
            Array.Sort(arrToSort);
            Console.WriteLine("End Array.Sort at " + DateTime.Now.ToString("HH:mm:ss.ffffff"));
            for (int i = 0; i < arrToSort.Length; i++)
                arrToSort[i] = r.Next(1, arrToSort.Length);
        }
    }
}

static public void QuickSort(int[] arr, int left, int right)
{
    int begin = left, end = right, pivot;

    // middle element pivot:
    //pivot = arr[(left + right) / 2];

    // improved (median-of-three) pivot:
    int middle = (left + right) / 2;
    int LM = arr[left].CompareTo(arr[middle]),
        MR = arr[middle].CompareTo(arr[right]),
        LR = arr[left].CompareTo(arr[right]);
    if (-1 * LM == LR)
        pivot = arr[left];
    else if (MR == -1 * LR)
        pivot = arr[right];
    else
        pivot = arr[middle];

    do
    {
        while (arr[left] < pivot) left++;
        while (arr[right] > pivot) right--;
        if (left <= right)
        {
            int temp = arr[right];
            arr[right] = arr[left];
            arr[left] = temp;
            left++;
            right--;
        }
    } while (left <= right);

    if (left < end) QuickSort(arr, left, end);
    if (begin < right) QuickSort(arr, begin, right);
}
```
+80
Feb 15 '15 at 3:55

In most situations, having fast versus a little faster is irrelevant... you simply never want it to occasionally get way slow. Although you can tweak QuickSort to avoid the way-slow situations, you lose the elegance of the basic QuickSort. So, for most things, I actually prefer HeapSort... you can implement it in its full simple elegance and never get a slow sort.

In situations where you do need top speed in most cases, QuickSort may be preferable to HeapSort, but neither may be the right answer. For speed-critical situations, you should examine the details of the situation closely. For example, in some of my speed-critical code, it is very common for the data to be already sorted or nearly sorted (it indexes multiple related fields that frequently move up and down together, or move up and down against each other, so once you sort by one, the others come out sorted, reverse-sorted, or close... any of which can kill QuickSort). For that case, I implemented neither... instead, I implemented Dijkstra's SmoothSort... a variant of HeapSort that is O(N) when the data is already sorted or nearly sorted... it is not so elegant and not too easy to understand, but it is fast... read http://www.cs.utexas.edu/users/EWD/ewd07xx/EWD796a.PDF if you want something a bit more challenging to code.

+13
Oct 03 '11 at 5:10

In-place Quicksort-Heapsort hybrids are really interesting too, since most of them need only n log n + o(n) comparisons in the worst case (they are optimal with respect to the first term of the asymptotics, so they avoid the worst-case scenarios of Quicksort), use O(log n) extra space, and preserve at least "half" of the good behavior of Quicksort with respect to already-ordered data sets. An extremely interesting algorithm is presented by Diekert and Weiss in http://arxiv.org/pdf/1209.4214v1.pdf :

  • Select the pivot p as the median of a random sample of sqrt(n) elements (this can be done in at most 24 sqrt(n) comparisons using the algorithm of Tarjan & co., or 5 sqrt(n) comparisons using the much more intricate spider-factory algorithm of Schönhage);
  • Partition your array into two parts, as in the first step of Quicksort;
  • Heapify the smaller part, using O(log n) extra bits to encode a heap in which each left child has a value greater than its sibling;
  • Repeatedly extract the root of the heap, sift down the gap left by the root until it reaches a leaf of the heap, then fill the gap with an appropriate element taken from the other part of the array;
  • Recur over the remaining unordered part of the array (if p is chosen as the exact median, there is no recursion at all).
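The full QuickHeapsort algorithm is involved, but its first step, picking the pivot as the median of a random sample of about sqrt(n) elements, is easy to sketch. The Python below is a naive version that simply sorts the sample; the 24·sqrt(n) and 5·sqrt(n) comparison bounds quoted above come from the specialized selection algorithms, not from this sketch.

```python
import math
import random

def sample_median_pivot(a):
    """Pick a pivot as the median of a random sample of ~sqrt(n) elements.
    Naive sketch: just sorts the sample, costing O(sqrt(n) log n) comparisons."""
    k = max(1, math.isqrt(len(a)))
    sample = sorted(random.sample(a, k))
    return sample[k // 2]

random.seed(1)
a = [random.randint(0, 10**6) for _ in range(10000)]
p = sample_median_pivot(a)
# the sample median is very likely close to the true median,
# so the resulting partition is close to balanced
print(sum(x < p for x in a) / len(a))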
+5
Jan 21 '13 at 15:23

Comparing quicksort and heapsort: both sort in place, but there is a difference in the worst-case running time, which for quicksort is O(n^2) while for heapsort it is still O(n log n); in the average case, quicksort is more useful. Since randomized quicksort is a randomized algorithm, the probability of finishing quickly depends on the position of the chosen pivot element.

So (where s is the size of the input and L and G are the parts smaller and greater than the pivot):

Good call: the sizes of L and G are each less than 3s/4.

Bad call: one of L and G has size greater than 3s/4.

For small inputs we can use insertion sort, and for very large amounts of data we can use heapsort.
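That suggestion, insertion sort for small inputs and heapsort for very large ones, can be sketched as a simple hybrid. The Python below is a hedged illustration; the `small` and `huge` cutoffs are hypothetical placeholders (real libraries tune such thresholds empirically), and the middle range uses a plain out-of-place quicksort for brevity.

```python
import heapq

def insertion_sort(a):
    # O(n^2) in general, but very fast on tiny inputs
    for i in range(1, len(a)):
        key, j = a[i], i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a

def hybrid_sort(a, small=16, huge=1_000_000):
    if len(a) <= small:
        return insertion_sort(a)
    if len(a) >= huge:
        # very large input: heapsort path, guaranteed O(n log n)
        heapq.heapify(a)
        return [heapq.heappop(a) for _ in range(len(a))]
    # middle range: quicksort path
    pivot = a[len(a) // 2]
    return (hybrid_sort([x for x in a if x < pivot], small, huge)
            + [x for x in a if x == pivot]
            + hybrid_sort([x for x in a if x > pivot], small, huge))
```

Note that quicksort's own small partitions also hit the insertion-sort path through the recursion, which is exactly how production hybrids use the small-input cutoff.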

+2
Sep 26 '12 at 17:18

Well, if you go down to the architecture level... cache memory behaves like a queue data structure: whatever is available in the cache gets processed. With quicksort we have no problem splitting the array into pieces of any length... but with heapsort (using an array) it can happen that the parent element is not in the sub-array currently available in cache, and then it has to be fetched into cache memory... which is time-consuming. Quicksort wins on speed! 😀

+2
May 20 '16 at 9:03 PM

Heapsort has the advantage of a worst case of O(n log n), so in cases where quicksort is likely to perform poorly (mostly sorted data sets, generally), heapsort is much preferable.

+1
Mar 18

Heapsort builds a heap and then repeatedly extracts the maximum element. Its worst case is O(n log n).

But if you look at the worst case of quicksort, which is O(n^2), you will realize that quicksort would not be such a good choice for large data.

So sorting is an interesting thing; I believe the reason so many sorting algorithms are alive today is that each of them is "best" in its best place. For example, bubble sort can outperform quicksort if the data is already sorted. Or, if we know something about the items to be sorted, we can probably do better.
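The bubble sort remark holds for the early-exit variant, which notices one clean pass and stops after O(n) comparisons on already-sorted data, whereas any quicksort still performs on the order of n log n comparisons there. A quick Python sketch:

```python
def bubble_sort_comparisons(a):
    """Early-exit bubble sort; sorts a in place and returns the
    number of comparisons made."""
    comparisons = 0
    n = len(a)
    while True:
        swapped = False
        for i in range(n - 1):
            comparisons += 1
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
        if not swapped:       # one clean pass: the data is sorted
            return comparisons
        n -= 1                # the largest element has bubbled to the end

print(bubble_sort_comparisons(list(range(1000))))  # 999: a single O(n) pass
```

On sorted input of 1000 elements the early exit fires after the first pass, i.e. exactly n-1 = 999 comparisons.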

This may not answer your question directly, but I thought I would add my two cents.

+1
Mar 18

Heapsort is a safe bet when dealing with very large inputs. Asymptotic analysis shows that the worst-case order of growth of Heapsort is Big-O(n log n), which is better than Quicksort's Big-O(n^2) worst case. However, on most machines Heapsort is somewhat slower than a well-implemented Quicksort. Heapsort is also not a stable sorting algorithm.

The reason heapsort is slower in practice than quicksort is the better locality of reference (" https://en.wikipedia.org/wiki/Locality_of_reference ") in quicksort, where data elements sit in relatively close storage locations. Systems that exhibit strong locality of reference are great candidates for performance optimization. Heapsort, however, makes larger jumps. This makes quicksort more favorable for smaller inputs.

+1
Nov 26 '15 at 17:02

For me, there is a very fundamental difference between heapsort and quicksort: the latter uses recursion. In recursive algorithms, the stack grows with the number of recursions. This does not matter if n is small, but right now I am sorting two matrices with n = 10^9!! The program takes almost 10 GB of RAM, and any extra memory makes my computer start swapping to virtual memory. My disk is a RAM disk, but swapping to it still makes a huge difference in speed. So, in a statistics package coded in C++ that includes adjustable-dimension matrices, with size unknown in advance to the programmer, and a nonparametric statistical kind of sorting, I prefer heapsort to avoid delays when used with very large data matrices.
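For what it's worth, the stack growth of recursive quicksort can be capped at O(log n) with a standard trick not mentioned in the answer: recurse only into the smaller partition and loop on the larger one, so each recursive call at least halves the remaining range. A Python sketch under that technique:

```python
def quicksort_bounded_stack(a, lo=0, hi=None):
    """In-place quicksort that recurses only into the smaller partition
    and iterates over the larger, keeping recursion depth O(log n)."""
    if hi is None:
        hi = len(a) - 1
    while lo < hi:
        # Hoare partition with middle-element pivot
        pivot, i, j = a[(lo + hi) // 2], lo, hi
        while i <= j:
            while a[i] < pivot: i += 1
            while a[j] > pivot: j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i += 1
                j -= 1
        # recurse into the smaller half, loop over the larger one
        if j - lo < hi - i:
            quicksort_bounded_stack(a, lo, j)
            lo = i
        else:
            quicksort_bounded_stack(a, i, hi)
            hi = j
```

This only bounds the stack, not the O(n^2) worst-case time; the answer's point about heapsort's flat memory profile still stands.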

+1
Apr 16 '16 at 0:08

To answer the original question and point out some other comments here:

I just timed selection, quick, merge, and heap sorts to see how they stack up against each other. The answer is that they all have their shortcomings.

TL;DR: Quick is the best general-purpose sort (reasonably fast, stable, and mostly in-place). Personally, I prefer heap sort, though, unless I need a stable sort.

Selection - N^2 - It is really only good for fewer than about 20 elements or so; beyond that it gets outperformed. Unless your data is already sorted, or very nearly so. N^2 gets very slow very fast.

Quick, in my experience, is actually not that quick all the time. The bonuses for using quicksort as a general sort, though, are that it is reasonably fast and stable. It is also an in-place algorithm, but as it is typically implemented recursively, it will take up additional stack space. It also falls somewhere between O(n log n) and O(n^2). Timings on some sorts seem to bear this out, especially when the values fall within a tight range. It is way faster than selection sort on 10,000,000 items, but slower than merge or heap.

Merge sort is guaranteed O(n log n), since its sorting is data-independent. It just does what it does, regardless of the values you give it. It is also stable, but very large sorts can blow your stack if you are not careful in your implementation. There are some complex in-place merge sort implementations, but generally you need another array at each level to merge your values into. If those arrays live on the stack, you can run into problems.

Heap sort is at most O(n log n), but in many cases is quicker, depending on how far you have to move your values up the log-n-deep heap. The heap can easily be implemented in place in the original array, so it needs no additional memory, and it is iterative, so no worries about stack overflow while recursing. The huge downside to heap sort is that it is not a stable sort, which rules it out if you need stability.
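The instability is easy to demonstrate. The Python sketch below runs an iterative in-place heapsort over (key, tag) pairs, comparing keys only; the keys come out sorted, but equal keys need not keep their original relative order.

```python
def heapsort_pairs(a):
    """Iterative in-place max-heap sort on (key, tag) pairs, comparing keys only."""
    def sift(root, end):
        while 2 * root + 1 < end:
            child = 2 * root + 1
            if child + 1 < end and a[child + 1][0] > a[child][0]:
                child += 1
            if a[root][0] >= a[child][0]:
                return
            a[root], a[child] = a[child], a[root]
            root = child
    n = len(a)
    for start in range(n // 2 - 1, -1, -1):   # build the heap
        sift(start, n)
    for end in range(n - 1, 0, -1):           # repeatedly extract the max
        a[0], a[end] = a[end], a[0]
        sift(0, end)
    return a

# equal keys carry tags recording their original order
records = [(1, 'first'), (1, 'second'), (0, 'x'), (1, 'third')]
print(heapsort_pairs(records[:]))
# keys come out as 0, 1, 1, 1, but the tags of the equal keys are
# no longer in 'first', 'second', 'third' order -- heapsort is unstable
```

A stable sort (like merge sort, or Python's built-in `sorted`) would keep the three key-1 records in their original order.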

-1
Apr 09 '16 at 21:32


