Improve performance when comparing two integer arrays of different lengths

I have a bottleneck (or at least an area I think I can improve) in this comparer. It is essentially an ordinal string comparison, except that it works on arrays of integers rather than strings.

The arrays can be of different lengths, but length only matters when the elements at [0..n) are all equal, where n is the length of the shorter array. In that case the longer array is considered greater.

1 2 3 < 1 2 4

1 2 3 5 < 1 2 4

1 2 5 3 > 1 2 4

1 2 3 4 > 1 2 3

public int Compare(ushort[] x, ushort[] y)
{
    int pos = 0;
    int len = Math.Min(x.Length, y.Length);
    while (pos < len && x[pos] == y[pos])
        pos++;
    return pos < len
        ? x[pos].CompareTo(y[pos])
        : x.Length.CompareTo(y.Length);
}
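For reference, a quick sanity check (my own sketch, not from the original post) showing the comparer reproduce the four orderings above; the wrapper class name `UShortArrayComparer` is hypothetical, since the post only shows the method:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper; the question only shows the Compare method.
class UShortArrayComparer : IComparer<ushort[]>
{
    public int Compare(ushort[] x, ushort[] y)
    {
        int pos = 0;
        int len = Math.Min(x.Length, y.Length);
        while (pos < len && x[pos] == y[pos])
            pos++;
        return pos < len
            ? x[pos].CompareTo(y[pos])
            : x.Length.CompareTo(y.Length);
    }
}

class Demo
{
    static void Main()
    {
        var c = new UShortArrayComparer();
        // The four examples from the question:
        Console.WriteLine(c.Compare(new ushort[] { 1, 2, 3 },    new ushort[] { 1, 2, 4 }) < 0); // True
        Console.WriteLine(c.Compare(new ushort[] { 1, 2, 3, 5 }, new ushort[] { 1, 2, 4 }) < 0); // True
        Console.WriteLine(c.Compare(new ushort[] { 1, 2, 5, 3 }, new ushort[] { 1, 2, 4 }) > 0); // True
        Console.WriteLine(c.Compare(new ushort[] { 1, 2, 3, 4 }, new ushort[] { 1, 2, 3 }) > 0); // True
    }
}
```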

Any ideas on how this can be optimized?

Update

In response to a commenter asking what I'm actually doing here: a while ago I asked a related question that shows what I'm doing in context. The only significant difference is that I now use a ushort array instead of a string for the keys, since it is much more compact.

Using the entire path as a key lets me use partial keys to retrieve ranges from a sorted set, which gives high performance for subset queries. I'm trying to improve performance when building the index. See: Data structure for indexed subset searches

By the way, I am extremely impressed with the answers here so far. I have asked a lot of questions on SO over the years, but this is by far the most thoughtful and interesting collection of answers I have ever seen. I'm still not sure what the correct answer is for my specific problem (that is, millions of comparisons of short arrays), but each answer taught me something I did not know.

+4
4 answers

Here is what I came up with. I tested arrays of up to about 16 million elements (2^24) using a combination of your code and some parallel code.

public int CompareParallel(ushort[] x, ushort[] y, int len, int segLen)
{
    int compareArrLen = (len / segLen) + 1;
    int[] compareArr = new int[compareArrLen];
    Parallel.For(0, compareArrLen, (i, state) =>
    {
        if (state.LowestBreakIteration.HasValue
            && state.LowestBreakIteration.Value < i)
            return;
        int segEnd = (i + 1) * segLen;
        int k = len < segEnd ? len : segEnd;
        for (int j = i * segLen; j < k; j++)
            if (x[j] != y[j])
            {
                compareArr[i] = x[j].CompareTo(y[j]);
                state.Break();
                return;
            }
    });
    // The first (lowest-index) difference decides the result; segments
    // above the breaking one may also have recorded a difference.
    for (int r = 0; r < compareArrLen; r++)
        if (compareArr[r] != 0)
            return compareArr[r];
    return x.Length.CompareTo(y.Length);
}

public int CompareSequential(ushort[] x, ushort[] y, int len)
{
    int pos = 0;
    while (pos < len && x[pos] == y[pos])
        pos++;
    return pos < len
        ? x[pos].CompareTo(y[pos])
        : x.Length.CompareTo(y.Length);
}

public int Compare(ushort[] x, ushort[] y)
{
    // Determined through testing to be the best cutoff on my machine.
    const int cutOff = 4096;
    int len = x.Length < y.Length ? x.Length : y.Length;
    // Check whether len is above the threshold and whether the last
    // element and one in the middle are equal. If those checks pass,
    // there is a chance that more than 50% of the arrays are equal,
    // which makes the parallel overhead worth the effort.
    if (len > cutOff && x[len - 1] == y[len - 1] && x[len / 2] == y[len / 2])
    {
        // A segment length of around 8% of the array size tested best
        // on my machine.
        return CompareParallel(x, y, len, (len / 100) * 8);
    }
    return CompareSequential(x, y, len);
}

Here is the test I wrote:

using System;
using System.Diagnostics;
using System.IO;
using System.Text;
using System.Threading.Tasks;

class Program
{
    [Flags]
    private enum InfoLevel : byte
    {
        Detail = 0x01,
        Summary = 0x02
    }

    private static InfoLevel logLevel = InfoLevel.Summary;

    private static void LogDetail(string content) { LogInfo(InfoLevel.Detail, content); }
    private static void LogSummary(string content) { LogInfo(InfoLevel.Summary, content); }
    private static void LogDetail(string format, params object[] arg) { LogInfo(InfoLevel.Detail, format, arg); }
    private static void LogSummary(string format, params object[] arg) { LogInfo(InfoLevel.Summary, format, arg); }

    private static void LogInfo(InfoLevel level, string content)
    {
        if ((level & logLevel) == level)
            Console.WriteLine(content);
    }

    private static void LogInfo(InfoLevel level, string format, params object[] arg)
    {
        if ((level & logLevel) == level)
            Console.WriteLine(format: format, arg: arg);
    }

    const string _randTestResultHeader = "\r\nRandom Array Content\r\n";
    const string _equalArrayResultHeader = "Only Length Different\r\n\r\n";
    const string _summaryTestResultsHeader = "Size\t\tOrig Elps\tPara Elps\tComp Elps\r\n";
    const string _summaryBodyContent = "{0}\t\t{1:0.0000}\t\t{2:0.0000}\t\t{3:0.00000}\r\n";

    static void Main(string[] args)
    {
        Console.SetOut(new StreamWriter(File.Create("out.txt")));
        Console.WriteLine("Algorithm Test, Time results in milliseconds");
        for (int segPercent = 7; segPercent < 13; segPercent++)
        {
            Console.WriteLine("Test Run with parallel Dynamic segment size at {0}%"
                + " of Array Size (Comp always at 8%)\r\n", segPercent);
            var _aggrRandResults = new StringBuilder(_randTestResultHeader);
            var _aggrEqualResults = new StringBuilder(_equalArrayResultHeader);
            _aggrRandResults.Append(_summaryTestResultsHeader);
            _aggrEqualResults.Append(_summaryTestResultsHeader);
            for (int i = 10; i < 25; i++)
            {
                int baseLen = (int)Math.Pow(2, i);
                int segLen = (baseLen / 100) * segPercent;

                var equalTestAverage = RandomRunTest("Equal Length", baseLen, baseLen, segLen);
                var lslargerTestAverage = RandomRunTest("Left Side Larger", baseLen + 10, baseLen, segLen);
                var rslargerTestAverage = RandomRunTest("Right Side Larger", baseLen, baseLen + 10, segLen);
                var completelyRandomTestAvg = new double[3];
                for (int l = 0; l < completelyRandomTestAvg.Length; l++)
                    completelyRandomTestAvg[l] = (equalTestAverage[l]
                        + lslargerTestAverage[l] + rslargerTestAverage[l]) / 3;
                LogDetail("\r\nRandom Test Results:");
                LogDetail("Original Composite Test Average: {0}", completelyRandomTestAvg[0]);
                LogDetail("Parallel Composite Test Average: {0}", completelyRandomTestAvg[1]);
                _aggrRandResults.AppendFormat(_summaryBodyContent, baseLen,
                    completelyRandomTestAvg[0], completelyRandomTestAvg[1],
                    completelyRandomTestAvg[2]);

                var equalEqualTest = EqualTill("Equal Len And Values", baseLen, baseLen, segLen);
                var equalLHSLargerTest = EqualTill("LHS Larger", baseLen + 10, baseLen, segLen);
                var equalRHSLargerTest = EqualTill("RHS Larger", baseLen, baseLen + 10, segLen);
                var mostlyEqualTestAvg = new double[3];
                for (int l = 0; l < mostlyEqualTestAvg.Length; l++)
                    mostlyEqualTestAvg[l] = (equalEqualTest[l]
                        + equalLHSLargerTest[l] + equalRHSLargerTest[l]) / 3;
                LogDetail("\r\nLength Different Test Results");
                LogDetail("Original Composite Test Average: {0}", mostlyEqualTestAvg[0]);
                LogDetail("Parallel Composite Test Average: {0}", mostlyEqualTestAvg[1]);
                _aggrEqualResults.AppendFormat(_summaryBodyContent, baseLen,
                    mostlyEqualTestAvg[0], mostlyEqualTestAvg[1], mostlyEqualTestAvg[2]);
            }
            LogSummary(_aggrRandResults.ToString() + "\r\n");
            LogSummary(_aggrEqualResults.ToString() + "\r\n");
        }
        Console.Out.Flush();
    }

    private const string _testBody =
        "\r\n\tOriginal:: Result:{0}, Elapsed:{1}"
        + "\r\n\tParallel:: Result:{2}, Elapsed:{3}"
        + "\r\n\tComposite:: Result:{4}, Elapsed:{5}";
    private const string _testHeader = "\r\nTesting {0}, Array Lengths: {1}, {2}";

    public static double[] RandomRunTest(string testName, int shortArr1Len,
        int shortArr2Len, int parallelSegLen)
    {
        var shortArr1 = new ushort[shortArr1Len];
        var shortArr2 = new ushort[shortArr2Len];
        var avgTimes = new double[3];
        LogDetail(_testHeader, testName, shortArr1Len, shortArr2Len);
        for (int i = 0; i < 10; i++)
        {
            var currResults = new double[3];
            FillCompareArray(shortArr1, shortArr1.Length);
            FillCompareArray(shortArr2, shortArr2.Length);
            var sw = new Stopwatch();

            // Force garbage collection before each timed run so a
            // collection triggered by an earlier test doesn't skew the
            // results of the test that runs after it.
            GC.Collect();
            sw.Start();
            int origResult = Compare(shortArr1, shortArr2);
            sw.Stop();
            currResults[0] = sw.Elapsed.TotalMilliseconds;
            sw.Reset();

            GC.Collect();
            sw.Start();
            int parallelResult = CompareParallelOnly(shortArr1, shortArr2, parallelSegLen);
            sw.Stop();
            currResults[1] = sw.Elapsed.TotalMilliseconds;
            sw.Reset();

            GC.Collect();
            sw.Start();
            int compositeResult = CompareComposite(shortArr1, shortArr2);
            sw.Stop();
            currResults[2] = sw.Elapsed.TotalMilliseconds;

            LogDetail(_testBody, origResult, currResults[0],
                parallelResult, currResults[1], compositeResult, currResults[2]);
            for (int l = 0; l < currResults.Length; l++)
                avgTimes[l] = ((avgTimes[l] * i) + currResults[l]) / (i + 1);
        }
        LogDetail("\r\nAverage Run Time Original: {0}", avgTimes[0]);
        LogDetail("Average Run Time Parallel: {0}", avgTimes[1]);
        LogDetail("Average Run Time Composite: {0}", avgTimes[2]);
        return avgTimes;
    }

    public static double[] EqualTill(string testName, int shortArr1Len,
        int shortArr2Len, int parallelSegLen)
    {
        const string _testHeader = "\r\nTesting When Array Difference is "
            + "Only Length({0}), Array Lengths: {1}, {2}";
        int baseLen = shortArr1Len > shortArr2Len ? shortArr2Len : shortArr1Len;
        var shortArr1 = new ushort[shortArr1Len];
        var shortArr2 = new ushort[shortArr2Len];
        var avgTimes = new double[3];
        LogDetail(_testHeader, testName, shortArr1Len, shortArr2Len);
        for (int i = 0; i < 10; i++)
        {
            FillCompareArray(shortArr1, shortArr1Len);
            Array.Copy(shortArr1, shortArr2, baseLen);
            var currElapsedTime = new double[3];
            var sw = new Stopwatch();

            // See the explanation in RandomRunTest.
            GC.Collect();
            sw.Start();
            int origResult = Compare(shortArr1, shortArr2);
            sw.Stop();
            currElapsedTime[0] = sw.Elapsed.TotalMilliseconds;
            sw.Reset();

            GC.Collect();
            sw.Start();
            int parallelResult = CompareParallelOnly(shortArr1, shortArr2, parallelSegLen);
            sw.Stop();
            currElapsedTime[1] = sw.Elapsed.TotalMilliseconds;
            sw.Reset();

            GC.Collect();
            sw.Start();
            var compositeResult = CompareComposite(shortArr1, shortArr2);
            sw.Stop();
            currElapsedTime[2] = sw.Elapsed.TotalMilliseconds;

            LogDetail(_testBody, origResult, currElapsedTime[0],
                parallelResult, currElapsedTime[1], compositeResult, currElapsedTime[2]);
            for (int l = 0; l < currElapsedTime.Length; l++)
                avgTimes[l] = ((avgTimes[l] * i) + currElapsedTime[l]) / (i + 1);
        }
        LogDetail("\r\nAverage Run Time Original: {0}", avgTimes[0]);
        LogDetail("Average Run Time Parallel: {0}", avgTimes[1]);
        LogDetail("Average Run Time Composite: {0}", avgTimes[2]);
        return avgTimes;
    }

    static Random rand = new Random();

    public static void FillCompareArray(ushort[] compareArray, int length)
    {
        var retVals = new byte[length];
        rand.NextBytes(retVals);
        Array.Copy(retVals, compareArray, length);
    }

    public static int CompareParallelOnly(ushort[] x, ushort[] y, int segLen)
    {
        int len = x.Length < y.Length ? x.Length : y.Length;
        return CompareParallel(x, y, len, segLen);
    }

    public static int Compare(ushort[] x, ushort[] y)
    {
        int pos = 0;
        int len = Math.Min(x.Length, y.Length);
        while (pos < len && x[pos] == y[pos])
            pos++;
        return pos < len
            ? x[pos].CompareTo(y[pos])
            : x.Length.CompareTo(y.Length);
    }

    public static int CompareParallel(ushort[] x, ushort[] y, int len, int segLen)
    {
        int compareArrLen = (len / segLen) + 1;
        int[] compareArr = new int[compareArrLen];
        Parallel.For(0, compareArrLen, (i, state) =>
        {
            if (state.LowestBreakIteration.HasValue
                && state.LowestBreakIteration.Value < i)
                return;
            int segEnd = (i + 1) * segLen;
            int k = len < segEnd ? len : segEnd;
            for (int j = i * segLen; j < k; j++)
                if (x[j] != y[j])
                {
                    compareArr[i] = x[j].CompareTo(y[j]);
                    state.Break();
                    return;
                }
        });
        // The first (lowest-index) difference decides the result.
        for (int r = 0; r < compareArrLen; r++)
            if (compareArr[r] != 0)
                return compareArr[r];
        return x.Length.CompareTo(y.Length);
    }

    public static int CompareSequential(ushort[] x, ushort[] y, int len)
    {
        int pos = 0;
        while (pos < len && x[pos] == y[pos])
            pos++;
        return pos < len
            ? x[pos].CompareTo(y[pos])
            : x.Length.CompareTo(y.Length);
    }

    public static int CompareComposite(ushort[] x, ushort[] y)
    {
        const int cutOff = 4096;
        int len = x.Length < y.Length ? x.Length : y.Length;
        if (len > cutOff && x[len - 1] == y[len - 1] && x[len / 2] == y[len / 2])
            return CompareParallel(x, y, len, (len / 100) * 8);
        return CompareSequential(x, y, len);
    }
}

Note:

Make sure you build with optimizations enabled. The results were very different when I did not, and it made the parallel code look like a much bigger improvement than it actually was.

The results I got were about 33% faster than the original runtime for very long arrays of mostly equal numbers. It still grows linearly with input size, just more slowly. It does start out slower for small datasets (below 4096 elements on my machine), but the overhead there was quite small (around 0.001 ms on my machine), so it's worth using if you really do get large, nearly equal arrays.

+3

It probably won't make much difference, but you can set the last element so that it's guaranteed to differ, which gets rid of the pos < len check in the while loop. And, trivially, pos++ can become ++pos.

public int Compare(ushort[] x, ushort[] y)
{
    int pos = 0;
    int len = Math.Min(x.Length, y.Length);

    // The below is probably not worth it for fewer than 5 (or so)
    // elements, so just do it the old way.
    if (len < 5)
    {
        while (pos < len && x[pos] == y[pos])
            ++pos;
        return pos < len
            ? x[pos].CompareTo(y[pos])
            : x.Length.CompareTo(y.Length);
    }

    ushort lastX = x[len - 1];
    bool lastSame = true;
    if (x[len - 1] == y[len - 1])
        --x[len - 1]; // sentinel; can be anything else
    else
        lastSame = false;

    while (x[pos] == y[pos])
        ++pos;

    x[len - 1] = lastX; // undo the sentinel so the caller's array is unchanged

    return pos < len - 1
        ? x[pos].CompareTo(y[pos])
        : lastSame
            ? x.Length.CompareTo(y.Length)
            : lastX.CompareTo(y[len - 1]);
}

EDIT: You will only get a performance gain when many elements at the very beginning are the same (and it will be worse if there is an early difference, as pkuderov mentioned).

+1

Sorry for the long answer, but the question interested me so much that I spent a couple of hours researching it, and I want to share the results. I wrote several test case generators and a performance tester.

What it does:

  • Generates completely random arrays
  • Measures the runtime of 3 comparison methods
  • Generates arrays with a high likelihood of similarity
  • Measures the runtime again

I used 3 methods:

  • The OP's
  • Mine (idea: replace the two indexing operations with pointer increments)
  • Dukeling's (idea: remove the unnecessary comparison)

I started with short arrays (lengths 5 to 15).

Method 1 was the fastest on both test variations (as pkuderov predicted).

If we increase the size of the arrays, the situation changes.

This is what I got with array lengths between 500 and 1500:

Generating test cases ... Done. (5258 milliseconds)
Compare1 took 18 milliseconds
Compare2 took 18 milliseconds
Compare3 took 33 milliseconds
Generating 'similar' test cases ... Done. (5081 milliseconds)
Compare1 took 359 milliseconds
Compare2 took 313 milliseconds
Compare3 took 295 milliseconds

So method 2 gives a slight gain over method 1, and method 3 an even smaller gain over method 2.

Conclusion:

1. If your arrays are short enough, and/or there is a high probability that the first difference appears at a small index, there is not much to be done (with the proposed methods).
2. Otherwise, you can try some combination of methods 2 and 3.
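For what it's worth, here is a hypothetical sketch of such a combination (my own, not benchmarked above; it requires compiling with /unsafe, and, unlike the sentinel version above, it restores the modified element before returning):

```csharp
using System;

static class CombinedCompare
{
    // Hypothetical mash-up of methods 2 and 3: walk raw pointers (no
    // bounds checks) and plant a sentinel so the inner loop needs only
    // one test per element.
    public static unsafe int Compare(ushort[] x, ushort[] y)
    {
        int len = Math.Min(x.Length, y.Length);
        if (len < 5) // sentinel overhead isn't worth it for tiny arrays
        {
            int p = 0;
            while (p < len && x[p] == y[p]) p++;
            return p < len ? x[p].CompareTo(y[p]) : x.Length.CompareTo(y.Length);
        }

        ushort lastX = x[len - 1];
        bool lastSame = x[len - 1] == y[len - 1];
        if (lastSame)
            x[len - 1]--; // guarantees the scan stops by index len - 1

        int pos;
        fixed (ushort* fpx = x, fpy = y)
        {
            ushort* px = fpx, py = fpy;
            while (*px == *py) { px++; py++; }
            pos = (int)(px - fpx);
        }

        x[len - 1] = lastX; // restore the caller's array

        return pos < len - 1
            ? x[pos].CompareTo(y[pos])
            : lastSame
                ? x.Length.CompareTo(y.Length)
                : lastX.CompareTo(y[len - 1]);
    }
}
```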

Code:

using System;
using System.Diagnostics;

namespace ConsoleExamples
{
    class ArrayComparePerformance
    {
        static readonly int testArraysNum = 100000;
        static readonly int maxArrayLen = 1500;
        static readonly int minArrayLen = 500;
        static readonly int maxValue = 10;

        public static void RunTest()
        {
            // Generate completely random arrays.
            ushort[][] a = new ushort[testArraysNum][];
            ushort[][] b = new ushort[testArraysNum][];
            Random rand = new Random();
            Console.WriteLine("Generating test cases ... ");
            Stopwatch sw = new Stopwatch();
            sw.Start();
            for (int i = 0; i < testArraysNum; i++)
            {
                int len = rand.Next(maxArrayLen) + 1;
                a[i] = new ushort[len];
                for (int j = 0; j < len; j++)
                    a[i][j] = (ushort)rand.Next(maxValue);

                len = rand.Next(maxArrayLen) + 1;
                b[i] = new ushort[len];
                for (int j = 0; j < len; j++)
                    b[i][j] = (ushort)rand.Next(maxValue);
            }
            sw.Stop();
            Console.WriteLine("Done. ({0} milliseconds)", sw.ElapsedMilliseconds);

            // compare1
            sw.Restart();
            for (int i = 0; i < testArraysNum; i++)
            {
                int result = Compare1(a[i], b[i]);
            }
            sw.Stop();
            Console.WriteLine("Compare1 took " + sw.ElapsedMilliseconds + " milliseconds");

            // compare2
            sw.Restart();
            for (int i = 0; i < testArraysNum; i++)
            {
                int result = Compare2(a[i], b[i]);
            }
            sw.Stop();
            Console.WriteLine("Compare2 took " + sw.ElapsedMilliseconds + " milliseconds");

            // compare3
            sw.Restart();
            for (int i = 0; i < testArraysNum; i++)
            {
                int result = Compare3(a[i], b[i]);
            }
            sw.Stop();
            Console.WriteLine("Compare3 took " + sw.ElapsedMilliseconds + " milliseconds");

            // Generate "similar" arrays: an equal prefix, then a nearly
            // equal tail.
            Console.WriteLine("Generating 'similar' test cases ... ");
            sw.Restart();
            for (int i = 0; i < testArraysNum; i++)
            {
                int len = rand.Next(maxArrayLen - minArrayLen) + minArrayLen - 1;
                a[i] = new ushort[len];
                for (int j = 0; j < len; j++)
                {
                    if (j < len / 2)
                        a[i][j] = (ushort)j;
                    else
                        a[i][j] = (ushort)(rand.Next(2) + j);
                }

                len = rand.Next(maxArrayLen - minArrayLen) + minArrayLen - 1;
                b[i] = new ushort[len];
                for (int j = 0; j < len; j++)
                {
                    if (j < len / 2)
                        b[i][j] = (ushort)j;
                    else
                        b[i][j] = (ushort)(rand.Next(2) + j);
                }
            }
            sw.Stop();
            Console.WriteLine("Done. ({0} milliseconds)", sw.ElapsedMilliseconds);

            // compare1
            sw.Restart();
            for (int i = 0; i < testArraysNum; i++)
            {
                int result = Compare1(a[i], b[i]);
            }
            sw.Stop();
            Console.WriteLine("Compare1 took " + sw.ElapsedMilliseconds + " milliseconds");

            // compare2
            sw.Restart();
            for (int i = 0; i < testArraysNum; i++)
            {
                int result = Compare2(a[i], b[i]);
            }
            sw.Stop();
            Console.WriteLine("Compare2 took " + sw.ElapsedMilliseconds + " milliseconds");

            // compare3
            sw.Restart();
            for (int i = 0; i < testArraysNum; i++)
            {
                int result = Compare3(a[i], b[i]);
            }
            sw.Stop();
            Console.WriteLine("Compare3 took " + sw.ElapsedMilliseconds + " milliseconds");

            Console.ReadKey();
        }

        public static int Compare1(ushort[] x, ushort[] y)
        {
            int pos = 0;
            int len = Math.Min(x.Length, y.Length);
            while (pos < len && x[pos] == y[pos])
                pos++;
            return pos < len
                ? x[pos].CompareTo(y[pos])
                : x.Length.CompareTo(y.Length);
        }

        public unsafe static int Compare2(ushort[] x, ushort[] y)
        {
            int pos = 0;
            int len = Math.Min(x.Length, y.Length);
            fixed (ushort* fpx = &x[0], fpy = &y[0])
            {
                ushort* px = fpx;
                ushort* py = fpy;
                while (pos < len && *px == *py)
                {
                    px++;
                    py++;
                    pos++;
                }
            }
            return pos < len
                ? x[pos].CompareTo(y[pos])
                : x.Length.CompareTo(y.Length);
        }

        public static int Compare3(ushort[] x, ushort[] y)
        {
            int pos = 0;
            int len = Math.Min(x.Length, y.Length);
            // The below is probably not worth it for fewer than 5 (or so)
            // elements, so just do it the old way.
            if (len < 5)
            {
                while (pos < len && x[pos] == y[pos])
                    ++pos;
                return pos < len
                    ? x[pos].CompareTo(y[pos])
                    : x.Length.CompareTo(y.Length);
            }
            ushort lastX = x[len - 1];
            bool lastSame = true;
            if (x[len - 1] == y[len - 1])
                --x[len - 1]; // can be anything else
            else
                lastSame = false;
            while (x[pos] == y[pos])
                ++pos;
            return pos < len - 1
                ? x[pos].CompareTo(y[pos])
                : lastSame
                    ? x.Length.CompareTo(y.Length)
                    : lastX.CompareTo(y[len - 1]);
        }
    }
}
+1

Just some ideas (maybe wrong, needs testing):

First: a wider element type (e.g. int on x32 or long on x64; call this type TLong) can give better performance.
If you pack several ushort elements into a TLong (in big-endian order!), you can compare several elements at once.
But you will need to take care of the last TLong element of the new array if it is not full. There may be some tricky cases; I don't see any yet, but I'm not sure.
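To make the first idea concrete, here is a minimal sketch (my own, untested against the OP's workload) that packs four ushort values big-endian into a ulong, so one 64-bit comparison covers four elements, with the leftover tail compared element by element:

```csharp
using System;

static class PackedCompare
{
    // Sketch: comparing big-endian-packed ulongs matches the
    // lexicographic comparison of the four ushorts inside them.
    public static int Compare(ushort[] x, ushort[] y)
    {
        int len = Math.Min(x.Length, y.Length);
        int i = 0;
        for (; i + 4 <= len; i += 4)
        {
            ulong px = Pack(x, i), py = Pack(y, i);
            if (px != py) return px.CompareTo(py);
        }
        for (; i < len; i++)            // tail of 0..3 elements
            if (x[i] != y[i]) return x[i].CompareTo(y[i]);
        return x.Length.CompareTo(y.Length);
    }

    static ulong Pack(ushort[] a, int i) =>
        ((ulong)a[i] << 48) | ((ulong)a[i + 1] << 32)
        | ((ulong)a[i + 2] << 16) | a[i + 3];
}
```

Whether this beats the simple loop depends on how the JIT handles the packing; in modern runtimes the same effect is available via `MemoryExtensions.SequenceCompareTo`, which is vectorized internally.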

Second: in some cases we can pack even more of the original elements into a single TLong.
Let's return to the original ushort arrays. Suppose K is the largest number that occurs across all your ushort arrays (i.e. across all the paths you want to sort), so for every value t stored in each ushort we have t <= K. Then each t is simply a "digit" in a base-K number system.
This means each path in your graph (i.e. each ushort array) defines a number in that system, so instead of manipulating ushort arrays you can do something like the following:

  • Determine the greatest power of K that fits into the TLong type (call it p):

     // largest p such that K^p still fits into TLong
     int p = 0;
     while (Math.Pow(K, p + 1) <= TLong.MaxValue)
         p++;
  • Take the elements of the ushort array p at a time, compute the corresponding number in the base-K system, and store it as the i-th element of an array of type TLong:

     List<TLong> num = new List<TLong>();
     int i = 0;
     while (p * i < a.Length)
     {
         TLong y = 0;
         // transform a chunk of p elements into one base-K number
         for (int j = p * i; j < Math.Min(p * (i + 1), a.Length); j++)
             y = y * K + a[j];
         num.Add(y);
         i++;
     }
     TLong[] result = num.ToArray();
  • This is a dynamic pre-conversion, so K may differ between data sets, and when K is much less than 255 it will be faster than the first idea. In addition, the pre-conversion has linear complexity, so it will not greatly affect your overall performance.

  • In essence, you convert your arrays into big numbers in a base-K numeral system, stored in an array. That's it!
  • And it can still be combined with the improvements from the other answers (check this anyway; I could be wrong!)

Hope this helps you.

+1
