What sorting method to use: quicksort, sorting buckets, radix, ... for tiny data pairs? (C ++)

Question

What sorting method to use: quicksort, sorting buckets, radix, ... for tiny data pairs? (C ++)

I need to optimize some code that sorts a vector<pair<int, float >> a, where pairs need to be sorted by float value. The vector will have a length from 0 to 5. I searched for search queries and looked through C ++ sorting methods, but could not find any tests to sort tiny data sets. It is important for the system as quickly as possible, since it is used for a real-time block tracking system.

Regards, Pollux

+4

c ++ performance sorting benchmarking

pollux Jun 04 '10 at 7:10

source share

8 answers

Sort insert and bubble sort are great for tiny data pairs.

Another option is to hardcode the comparison logic using if pair expressions.
Check What is the fastest way to sort an array of 7 integers? for some ideas.

+6

Nick dandoulakis Jun 04 '10 at 7:28

source share

It makes no sense to read to read about the tests. You can read and compare the complexity algorithm (Big-O), since it depends only on the algorithms themselves, but benchmarking is something that depends on too many factors.

You need to benchmark yourself using the tools you use in an environment that is important to users of your application.

+5

sbi Jun 04 '10 at 7:28

source share

For the data you mentioned (0-5), use the STL sort methods. (historically based on qsort)

stable_sort - if you want to keep the order of duplicates. (historically merge sort)

+3

aJ. Jun 04 '10 at 7:14

source share

Is any specific reason you need this code optimized? For n == 5, in any case, everything will be done. Even Bogosort should have a reasonable lead time, since there are only 5! == 120 possible permutations in your data. Have you profiled your algorithm to find out if this is a bottleneck?

+3

Chinmay kanchi Jun 04 '10 at 7:23

source share

Use a somewhat annoying if set for a fast 5 element type, or if you want the sort to be a little slower, but much less of a headache, you can use network sort . Use this site with five entries to get an optimized sorting network. It even shows how you can do some parts in parallel, although this seems a bit excessive for only 5 inputs. This will require 9 comparisons, which is not bad. You also implement a sorting network with an if set, but the difference between the somewhat unpleasant ifs set, which I mentioned at the beginning, which is truly an optimal and optimal sorting network, is the number of swaps: the sorting network does not minimize the number of swaps.

+2

Justin peel Jun 04 '10 at 17:09

source share

If you are really sure (that is, you have measured) that this is a bottleneck and you need to optimize it, but simply that any sorting algorithm from STL will not be fast enough (and you measured it too), then maybe you know something else what can you use? Standard sorting algorithms are common and will work (quite well) for any data set: different numbers of elements, different ranges of values, etc. If you really need performance, the trick often uses additional information to create a more specialized algorithm.

Here you said one thing: there are 0-5 elements for sorting, as Nick D said in his answer, perhaps this will speed up the use of hard-coded if statements instead of typical loops or recursion of general sorting algorithms.

But maybe there is more? For example, are there any restrictions on float values that may occur?

+1

Thomas Padron-McCarthy Jun 04 '10 at 7:34

source share

Here is a routine for sorting 4 items in the optimal number of comparisons. I would post 5 elements, but stackoverflow restricts messages to 30,000 characters.

Whether this is actually the fastest depends on how much your processor cache memory might be needed, so the benchmark in a real program!

 /// Generated sort function for 4 arguments. template <class T> void DirectSort(T& a0, T& a1, T& a2, T& a3) { if (a0 < a1) { if (a0 < a2) { if (a0 < a3) { if (a1 < a2) { if (a1 < a3) { if (a3 < a2) { { T tmp(a2); a2 = a3; a3 = tmp; } } } else { // a1 >= a3 { T tmp(a1); a1 = a3; a3 = a2; a2 = tmp; } // a2 >= a3 } } else { // a1 >= a2 if (a1 < a3) { { T tmp(a1); a1 = a2; a2 = tmp; } // a2 < a3 } else { // a1 >= a3 if (a2 < a3) { { T tmp(a1); a1 = a2; a2 = a3; a3 = tmp; } } else { // a2 >= a3 { T tmp(a1); a1 = a3; a3 = tmp; } } } } } else { // a0 >= a3 if (a1 < a2) { { T tmp(a0); a0 = a3; a3 = a2; a2 = a1; a1 = tmp; } // a1 >= a3 // a2 >= a3 } else { // a1 >= a2 { T tmp(a0); a0 = a3; a3 = a1; a1 = tmp; } // a1 >= a3 // a2 >= a3 } } } else { // a0 >= a2 if (a0 < a3) { // a1 >= a2 if (a1 < a3) { { T tmp(a0); a0 = a2; a2 = a1; a1 = tmp; } // a2 < a3 } else { // a1 >= a3 { T tmp(a0); a0 = a2; a2 = a3; a3 = a1; a1 = tmp; } // a2 < a3 } } else { // a0 >= a3 // a1 >= a2 // a1 >= a3 if (a2 < a3) { { T tmp(a0); a0 = a2; a2 = tmp; } { T tmp(a1); a1 = a3; a3 = tmp; } } else { // a2 >= a3 { T tmp(a0); a0 = a3; a3 = a1; a1 = a2; a2 = tmp; } } } } } else { // a0 >= a1 if (a0 < a2) { if (a0 < a3) { { T tmp(a0); a0 = a1; a1 = tmp; } // a1 < a2 // a1 < a3 if (a3 < a2) { { T tmp(a2); a2 = a3; a3 = tmp; } } } else { // a0 >= a3 // a1 < a2 if (a1 < a3) { { T tmp(a0); a0 = a1; a1 = a3; a3 = a2; a2 = tmp; } // a2 >= a3 } else { // a1 >= a3 { T tmp(a0); a0 = a3; a3 = a2; a2 = tmp; } // a2 >= a3 } } } else { // a0 >= a2 if (a0 < a3) { if (a1 < a2) { { T tmp(a0); a0 = a1; a1 = a2; a2 = tmp; } // a1 < a3 // a2 < a3 } else { // a1 >= a2 { T tmp(a0); a0 = a2; a2 = tmp; } // a1 < a3 // a2 < a3 } } else { // a0 >= a3 if (a1 < a2) { if (a1 < a3) { if (a2 < a3) { { T tmp(a0); a0 = a1; a1 = a2; a2 = a3; a3 = tmp; } } else { // a2 >= a3 { T tmp(a0); a0 = a1; a1 = a3; a3 = tmp; } } } else { // a1 >= a3 { T tmp(a0); a0 = a3; a3 = tmp; } // a2 >= a3 } } else { // a1 >= a2 if (a1 < a3) { { T tmp(a0); a0 = a2; a2 = a3; a3 = tmp; } // a2 < a3 } else { // a1 >= a3 if (a2 < a3) { { T tmp(a0); a0 = a2; a2 = a1; a1 = a3; a3 = tmp; } } else { // a2 >= a3 { T tmp(a0); a0 = a3; a3 = tmp; } { T tmp(a1); a1 = a2; a2 = tmp; } } } } } } } } // DirectSort - 4 arguments.

0

Branko dimitrijevic Dec 11 '10 at 21:34

source share

utnapistim · Accepted Answer · 2010-06-04T08:02:31+0000

Firstly, premature optimization is the root of all evil . That is, first check your code and make sure that sorting is the one that actually takes the most time. If another part of your critical code takes 80% of the execution time, you will first get dramatic performance improvements by optimizing this.

Given you have 5 elements, the point is pretty much controversial. Any algorithm you use (except bogosort: D) should have a fairly constant runtime, unless you run the algorithm several hundred times per second or more.

Secondly, if you still want to optimize the search, if you know for sure that you will always have 5 elements, the best method is to hard code the comparisons (mathematically it can be proved that this method is optimal, ensuring the minimum number of operations performed in the worst case - we had it as a laboratory application at the university). The same thing happens if you have less than five elements, but the algorithm becomes prohibitive for writing and executing, if you have more than seven elements - the logic is confused for writing and subsequent work, so your code will be difficult to maintain.

Let me know if you need help writing the hardcoded algorithm itself (although I think you should find an implementation and theory online).

If you do not always have five numbers (or for some reason want to avoid hard comparisons), use the sorting provided by the library. This page concludes that stl sorting is optimal in terms of time (not just the number of operations performed). I do not know which implementation was used for the search.

What sorting method to use: quicksort, sorting buckets, radix, ... for tiny data pairs? (C ++)

More articles: