Is there any good radixsort implementation for float in C #

I have a data structure with a field of type float. The set of these structures must be sorted by float value. To do this, there is a numbering sort implementation.

If this does not happen, there is a quick way to access the exponent, sign and mantissa. Because if you sort the floats first on the mantissa, exponent and exponent for the last time. You sort the float in O (n).

+9
sorting floating-point c # algorithm radix-sort
source share
5 answers

Update:

I was very interested in this topic, so I sat down and implemented it (using this very fast and conservative implementation of memory ). I also read this one (thanks celion ) and found out that you don’t even need to break the floats into the mantissa and exponent to sort them. You just need to take the bits one to one and do an int-sort. You just need to take care of the negative values ​​that should be put back before the positive values ​​at the end of the algorithm (I did this in one step with the last iteration of the algorithm to save processor time).

So here is my float radixsort:

public static float[] RadixSort(this float[] array) { // temporary array and the array of converted floats to ints int[] t = new int[array.Length]; int[] a = new int[array.Length]; for (int i = 0; i < array.Length; i++) a[i] = BitConverter.ToInt32(BitConverter.GetBytes(array[i]), 0); // set the group length to 1, 2, 4, 8 or 16 // and see which one is quicker int groupLength = 4; int bitLength = 32; // counting and prefix arrays // (dimension is 2^r, the number of possible values of a r-bit number) int[] count = new int[1 << groupLength]; int[] pref = new int[1 << groupLength]; int groups = bitLength / groupLength; int mask = (1 << groupLength) - 1; int negatives = 0, positives = 0; for (int c = 0, shift = 0; c < groups; c++, shift += groupLength) { // reset count array for (int j = 0; j < count.Length; j++) count[j] = 0; // counting elements of the c-th group for (int i = 0; i < a.Length; i++) { count[(a[i] >> shift) & mask]++; // additionally count all negative // values in first round if (c == 0 && a[i] < 0) negatives++; } if (c == 0) positives = a.Length - negatives; // calculating prefixes pref[0] = 0; for (int i = 1; i < count.Length; i++) pref[i] = pref[i - 1] + count[i - 1]; // from a[] to t[] elements ordered by c-th group for (int i = 0; i < a.Length; i++){ // Get the right index to sort the number in int index = pref[(a[i] >> shift) & mask]++; if (c == groups - 1) { // We're in the last (most significant) group, if the // number is negative, order them inversely in front // of the array, pushing positive ones back. if (a[i] < 0) index = positives - (index - negatives) - 1; else index += negatives; } t[index] = a[i]; } // a[]=t[] and start again until the last group t.CopyTo(a, 0); } // Convert back the ints to the float array float[] ret = new float[a.Length]; for (int i = 0; i < a.Length; i++) ret[i] = BitConverter.ToSingle(BitConverter.GetBytes(a[i]), 0); return ret; } 

This is a bit slower than sorting int radix, due to copying the array at the beginning and end of the function, where floating-point values ​​are bitwise copied to int and vice versa. The whole function, however, is again O (n). In any case, much faster than sorting 3 times in a row, as you suggested. I no longer see a place for optimizations, but if someone does, feel free to let me know.

To sort in descending order, change this line at the very end:

 ret[i] = BitConverter.ToSingle(BitConverter.GetBytes(a[i]), 0); 

to that:

 ret[a.Length - i - 1] = BitConverter.ToSingle(BitConverter.GetBytes(a[i]), 0); 

Measurement:

I set up a small test containing all special cases with floating point (NaN, + / -Inf, min / max value, 0) and random numbers. It sorts exactly the same order as Linq or Array.Sort sorts floating point values:

 NaN -> -Inf -> Min -> Negative Nums -> 0 -> Positive Nums -> Max -> +Inf 

So, I conducted a test with a huge array of 10 million numbers:

 float[] test = new float[10000000]; Random rnd = new Random(); for (int i = 0; i < test.Length; i++) { byte[] buffer = new byte[4]; rnd.NextBytes(buffer); float rndfloat = BitConverter.ToSingle(buffer, 0); switch(i){ case 0: { test[i] = float.MaxValue; break; } case 1: { test[i] = float.MinValue; break; } case 2: { test[i] = float.NaN; break; } case 3: { test[i] = float.NegativeInfinity; break; } case 4: { test[i] = float.PositiveInfinity; break; } case 5: { test[i] = 0f; break; } default: { test[i] = test[i] = rndfloat; break; } } } 

And the time for various sorting algorithms stopped:

 Stopwatch sw = new Stopwatch(); sw.Start(); float[] sorted1 = test.RadixSort(); sw.Stop(); Console.WriteLine(string.Format("RadixSort: {0}", sw.Elapsed)); sw.Reset(); sw.Start(); float[] sorted2 = test.OrderBy(x => x).ToArray(); sw.Stop(); Console.WriteLine(string.Format("Linq OrderBy: {0}", sw.Elapsed)); sw.Reset(); sw.Start(); Array.Sort(test); float[] sorted3 = test; sw.Stop(); Console.WriteLine(string.Format("Array.Sort: {0}", sw.Elapsed)); 

And the conclusion was (update: now it starts with the release of the assembly, and not debugging):

 RadixSort: 00:00:03.9902332 Linq OrderBy: 00:00:17.4983272 Array.Sort: 00:00:03.1536785 

about four times faster than Link. It's not bad. But still not as fast as Array.Sort , but also not much worse. But I was really surprised by this: I expected it to be a bit slower than Linq on very small arrays. But then I ran a test with 20 elements:

 RadixSort: 00:00:00.0012944 Linq OrderBy: 00:00:00.0072271 Array.Sort: 00:00:00.0002979 

and even this time my Radixsort is faster than Linq, but much slower than sorting arrays. :)

Update 2:

I took a few more measurements and found some interesting things: longer group length constants mean less iterations and more memory usage. If you use a group of 16 bits in length (only 2 iterations), you will have huge overhead when sorting small arrays, but you can split Array.Sort if we are talking about arrays larger than 100 thousand elements, even if not very many. The axis of the diagrams is logarithm:

comparison chart
(source: daubmeier.de )

+18
source share

There is a good explanation of how to do radix sorting on floats here: http://www.codercorner.com/RadixSortRevisited.htm

If all of your values ​​are positive, you can avoid using a binary representation; the link explains how to handle negative values.

+1
source share

You can use the unsafe block for memcpy or the alias float * for uint * to extract the bits.

+1
source share

Having performed some bizarre casts and replacements of arrays instead of copying this version, it is 2 times faster for 10M numbers than the original of Philippe Dobmeier with a group length of 8. This is 3 times faster than for an Array.Sort array.

  static public void RadixSortFloat(this float[] array, int arrayLen = -1) { // Some use cases have an array that is longer as the filled part which we want to sort if (arrayLen < 0) arrayLen = array.Length; // Cast our original array as long Span<float> asFloat = array; Span<int> a = MemoryMarshal.Cast<float, int>(asFloat); // Create a temp array Span<int> t = new Span<int>(new int[arrayLen]); // set the group length to 1, 2, 4, 8 or 16 and see which one is quicker int groupLength = 8; int bitLength = 32; // counting and prefix arrays // (dimension is 2^r, the number of possible values of a r-bit number) var dim = 1 << groupLength; int groups = bitLength / groupLength; if (groups % 2 != 0) throw new Exception("groups must be even so data is in original array at end"); var count = new int[dim]; var pref = new int[dim]; int mask = (dim) - 1; int negatives = 0, positives = 0; // counting elements of the 1st group incuding negative/positive for (int i = 0; i < arrayLen; i++) { if (a[i] < 0) negatives++; count[(a[i] >> 0) & mask]++; } positives = arrayLen - negatives; int c; int shift; for (c = 0, shift = 0; c < groups - 1; c++, shift += groupLength) { CalcPrefixes(); var nextShift = shift + groupLength; // for (var i = 0; i < arrayLen; i++) { var ai = a[i]; // Get the right index to sort the number in int index = pref[( ai >> shift) & mask]++; count[( ai>> nextShift) & mask]++; t[index] = ai; } // swap the arrays and start again until the last group var temp = a; a = t; t = temp; } // Last round CalcPrefixes(); for (var i = 0; i < arrayLen; i++) { var ai = a[i]; // Get the right index to sort the number in int index = pref[( ai >> shift) & mask]++; // We're in the last (most significant) group, if the // number is negative, order them inversely in front // of the array, pushing positive ones back. if ( ai < 0) index = positives - (index - negatives) - 1; else index += negatives; // t[index] = ai; } void CalcPrefixes() { pref[0] = 0; for (int i = 1; i < dim; i++) { pref[i] = pref[i - 1] + count[i - 1]; count[i - 1] = 0; } } } 
+1
source share

I think your best bet is if the values ​​are not too close and there is a reasonable requirement for accuracy, you can simply use the actual floating point numbers before and after the decimal point to sort.

For example, you can simply use the first 4 decimal places (whether they are 0 or not) to sort.

0
source share

All Articles