Is Merge Sort Now Better Than QuickSort on Modern Machines?

Now that modern machines are multi-core and have SIMD support (SSE instructions on Windows and Linux, for example), should I switch to merge sort in my C/C++ code and forget QuickSort? In theory, merge sort parallelizes better and uses memory/disk more economically, and would therefore be faster than QuickSort, but I don't know. What does practical experience indicate?

I don't want to analyze and benchmark every time I sort something; I want to use one standard approach. That approach is currently QuickSort, because it is the standard library's sorting procedure. I want to know whether others have switched to MergeSort and gotten better results by making that switch.

UPDATE ------------

Graham.Reeds' answer to "How big is the performance difference between std::sort and std::stable_sort in practice?" suggests, in my opinion, that switching to MergeSort/stable_sort may be the right move.

+4
5 answers

Having received many non-answers, I spent several hours doing my own research. The conclusion is that yes, merge sort (and related variants) will be significantly faster, thanks to less intensive memory usage and better parallelization across cores. In addition, Intel ships a standard high-performance library called IPP that implements merge sort for x86 machines. By switching to this library, it looks like I can significantly improve sorting performance (and other vector-type operations) for the kind of programming I do.
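For anyone comparing, the memory-access argument is easy to see in a plain top-down merge sort: both the merge reads and the writes to the scratch buffer are sequential, which is cache- and prefetcher-friendly. A minimal portable sketch (not the IPP implementation, just an illustration):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Top-down merge sort over a[lo, hi), using one scratch buffer.
// All reads and writes are sequential, which is what makes merge
// sort friendly to caches and to streaming/disk access.
void merge_sort(std::vector<int>& a, std::vector<int>& tmp,
                std::size_t lo, std::size_t hi) {
    if (hi - lo < 2) return;
    std::size_t mid = lo + (hi - lo) / 2;
    merge_sort(a, tmp, lo, mid);   // the two halves are independent,
    merge_sort(a, tmp, mid, hi);   // so they can be sorted in parallel
    std::merge(a.begin() + lo, a.begin() + mid,
               a.begin() + mid, a.begin() + hi,
               tmp.begin() + lo);
    std::copy(tmp.begin() + lo, tmp.begin() + hi, a.begin() + lo);
}
```

Because the two recursive calls touch disjoint ranges, this is also the natural place to fork threads when parallelizing.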

+2

I do not think there is a definitive answer. In some cases even brute-force parallel sorts can be faster; it is always important to analyze your specific case. Consider bitonic sort, for example, if you have multiple cores and SIMD.
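To make the bitonic-sort suggestion concrete, here is a minimal sequential sketch of the network for power-of-two input sizes. Every compare-exchange at a given level is independent, which is exactly why the algorithm maps well onto SIMD lanes and GPU threads:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Merge a bitonic sequence a[lo, lo+n) into ascending (asc=true)
// or descending order. n must be a power of two.
void bitonic_merge(std::vector<int>& a, std::size_t lo, std::size_t n, bool asc) {
    if (n < 2) return;
    std::size_t m = n / 2;
    for (std::size_t i = lo; i < lo + m; ++i)        // all m compare-exchanges
        if ((a[i] > a[i + m]) == asc)                // are data-independent
            std::swap(a[i], a[i + m]);
    bitonic_merge(a, lo, m, asc);
    bitonic_merge(a, lo + m, m, asc);
}

// Sort a[lo, lo+n): build an ascending/descending (bitonic) pair,
// then merge it. n must be a power of two.
void bitonic_sort(std::vector<int>& a, std::size_t lo, std::size_t n, bool asc) {
    if (n < 2) return;
    std::size_t m = n / 2;
    bitonic_sort(a, lo, m, true);
    bitonic_sort(a, lo + m, m, false);
    bitonic_merge(a, lo, n, asc);
}
```

A real SIMD or multi-core version would replace the inner loop with vector compare/min/max instructions or distribute it across threads; the structure stays the same.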

+1

Should I switch to merge sort in my C/C++ code and forget QuickSort?

Sorry to say, but the question sounds like an attempt at premature optimization.

In theory, merge sort parallelizes better and uses memory/disk more economically, and would therefore be faster than QuickSort, but I don't know. What does practical experience indicate?

In practice, you should always profile first, and then identify areas to optimize based on the results.

You probably won't even have to change the sorting algorithm you use, unless you are sorting a large enough data set (or sorting in a critical enough part of your processing flow) for it to matter.

Usually I use std::sort, and if that is not enough (which has not happened with std::sort yet), I optimize the application's flow and algorithms instead.
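Note that switching between the two standard options is a one-line change, so it is cheap to test both under a profiler. std::stable_sort is typically implemented as a merge sort and, unlike std::sort, guarantees that equal keys keep their relative order. A small illustration (the Item type is just a made-up example):

```cpp
#include <algorithm>
#include <vector>

struct Item { int key; int tag; };  // hypothetical record type for illustration

// Sort by key. std::stable_sort (typically a merge sort) preserves the
// relative order of items with equal keys; std::sort does not promise that.
std::vector<Item> sort_items(std::vector<Item> v) {
    std::stable_sort(v.begin(), v.end(),
                     [](const Item& a, const Item& b) { return a.key < b.key; });
    return v;
}
```

Replacing std::stable_sort with std::sort in this function compiles unchanged; only the stability guarantee (and the performance profile) differs.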

+1

The truth is that you must profile it yourself and see how it works for your application, data, environment, etc. This is essentially the answer to something like 99% of all profiling / performance / optimization questions on SO.

0

There are parallel sorting packages that scale with the number of processor cores and are designed to use each core efficiently. I know that TBB (Threading Building Blocks) has a parallel_sort function, a sort with average time complexity O(n log n).

You can also add threading to QuickSort yourself. Recursive functions convert to parallelism easily, for example with parallel_invoke in TBB, or you can look at Cilk Plus; there are many threaded examples on the net.
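As a portable illustration of threading QuickSort (using only the standard library rather than TBB or Cilk Plus), one partition can be handed to std::async while the current thread recurses on the other. The cutoff and depth values below are arbitrary placeholders; a real implementation would tune them:

```cpp
#include <algorithm>
#include <future>
#include <vector>

// Parallel quicksort sketch: partition, then sort one side on another
// thread via std::async while this thread sorts the other side.
// Small ranges (or exhausted depth) fall back to std::sort; the depth
// limit also guarantees termination on all-equal inputs.
void pqsort(std::vector<int>& v, std::size_t lo, std::size_t hi, int depth) {
    if (hi - lo < 10000 || depth <= 0) {          // cutoff values are illustrative
        std::sort(v.begin() + lo, v.begin() + hi);
        return;
    }
    int pivot = v[lo + (hi - lo) / 2];
    auto mid = std::partition(v.begin() + lo, v.begin() + hi,
                              [pivot](int x) { return x < pivot; });
    std::size_t m = static_cast<std::size_t>(mid - v.begin());
    auto left = std::async(std::launch::async,
                           [&v, lo, m, depth] { pqsort(v, lo, m, depth - 1); });
    pqsort(v, m, hi, depth - 1);                  // sort right half on this thread
    left.get();                                   // join the helper task
}
```

The depth parameter bounds how many threads get spawned (roughly 2^depth leaf tasks), which is the same role TBB's task scheduler plays automatically.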

0
