How does Timsort execute data in descending order?

From:

http://svn.python.org/projects/python/trunk/Objects/listsort.txt

and

http://en.wikipedia.org/wiki/Timsort

I see that Timsort has some optimizations for a0 > a1 > a2 > ... , but what about the following array:

10000,10000,9999,9999,9998,9998,....,9,9,8,8,7,7,6,6,5,5,4,4,3,3,2,2,1,1,0,0

What is the time efficiency for such an array?

(integers were used to simplify the example, stable sorting is required) I made some measurements and, it seems, such arrays are not “good” cases for Timsort.

in fact, TimSort in the JDK http://cr.openjdk.java.net/~martin/webrevs/openjdk7/timsort/raw_files/new/src/share/classes/java/util/TimSort.java has a method of "countRunAndMakeAscending"

 @SuppressWarnings("unchecked") private static int countRunAndMakeAscending(Object[] a, int lo, int hi) { assert lo < hi; int runHi = lo + 1; if (runHi == hi) return 1; // Find end of run, and reverse range if descending if (((Comparable) a[runHi++]).compareTo(a[lo]) < 0) { // Descending while(runHi < hi && ((Comparable) a[runHi]).compareTo(a[runHi - 1]) < 0) runHi++; reverseRange(a, lo, runHi); } else { // Ascending while (runHi < hi && ((Comparable) a[runHi]).compareTo(a[runHi - 1]) >= 0) runHi++; } return runHi - lo; } 

why not implement it differently:

 private static int countRunAndMakeAscending(Object[] a, int lo, int hi) { int runHi = lo; int lastEqual = lo; int ascending = 0; while (++runHi < hi) { int c = ((Comparable) a[runHi+1]).compareTo(a[runHi]); if (ascending == 0) { if (c != 0) { if (c > 0) { ascending = 1; } else { ascending = -1; reverseRange(a, lastEqual, runHi); lastEqual = runHi; } } } else if (ascending == 1) { if (c < 0) { return runHi - lo; } } else { if (c > 0) { reverseRange(a, lastEqual, runHi); reverseRange(a, lo, runHi); return runHi - lo; } else if (c < 0) { reverseRange(a, lastEqual, runHi); lastEqual = runHi; } } } if (ascending == -1) { reverseRange(a, lastEqual, runHi); reverseRange(a, lo, runHi); } return runHi - lo; } 

so that it will work fine with ascending order?

+7
sorting mergesort timsort
source share
1 answer

Yes.

Basically, he decided that “upward” really means “not to descend” without losing generality - if you have, for example, [5,5,4 3], he will simply break it at [5,5] (ascending) , and then [4.3] (descending) on ​​the next call.

As for why, I assume this is for simplicity: just try counting the number of reverseRange() calls in your code and in the original, and you will get this idea (I got it by noticing how long it took me to understand one version compared to the other :)

edit: WRONG WRONG! As Oscar Smith noted, the reason is to make timsort a robust sorting algorithm. If someone knows how to convey an undeserved reward ...

+2
source share

All Articles