Profile Sort Algorithms for Partially Sorted Data

We know that some sorting algorithms, such as insertion sort, perform very well on arrays that are "mostly sorted" but poorly on large, random inputs.

Suppose we would like to profile how such an algorithm's performance improves or degrades as a function of how "sorted" the input data is. What would be a good way to create a "more ordered" or "more random" array of elements? And how can we measure the "sortedness" of the input?

3 answers

The number of inversions is a common measure of how sorted an array is.

A pair of elements (p_i, p_j) in a permutation p is called an inversion if i < j and p_i > p_j. For example, the permutation (3, 1, 2, 5, 4) has 3 inversions: (3, 1), (3, 2), and (5, 4).

A sorted array has 0 inversions, and a reverse-sorted array has n * (n - 1) / 2 of them, the maximum possible.
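As a sketch of how you might compute this measure, inversions can be counted in O(n log n) with a merge-sort pass rather than checking all O(n²) pairs (the function name and return convention here are illustrative, not from the answer):

```python
def count_inversions(a):
    """Count pairs (i, j) with i < j and a[i] > a[j], using a
    merge-sort pass. Returns (inversion_count, sorted_copy)."""
    if len(a) <= 1:
        return 0, list(a)
    mid = len(a) // 2
    left_inv, left = count_inversions(a[:mid])
    right_inv, right = count_inversions(a[mid:])
    inv = left_inv + right_inv
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            # left[i:] are all greater than right[j], so each of those
            # elements forms one inversion with right[j].
            inv += len(left) - i
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return inv, merged

print(count_inversions([3, 1, 2, 5, 4])[0])          # 3, matching the example above
print(count_inversions([5, 4, 3, 2, 1])[0])          # 10 = 5 * 4 / 2 for reversed input
```

Normalizing the count by n * (n - 1) / 2 gives a sortedness score between 0 (sorted) and 1 (reversed) that is comparable across array sizes.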


You can create a "partially sorted" dataset by stopping a modern Fisher-Yates shuffle partway through, run on an already sorted dataset.
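A minimal sketch of this idea: run the standard backwards Fisher-Yates loop but perform only a fixed number of swap steps, so the step count becomes a "disorder knob" (the function and parameter names here are illustrative):

```python
import random

def partially_shuffled(n, steps, seed=None):
    """Run only `steps` iterations of a modern (backwards) Fisher-Yates
    shuffle on the sorted array [0, 1, ..., n-1].

    steps=0 leaves the array fully sorted; steps=n-1 performs a
    complete, uniform shuffle; values in between give partial disorder.
    """
    rng = random.Random(seed)
    a = list(range(n))
    # Standard Fisher-Yates walks i from n-1 down to 1; we stop early
    # after `steps` swaps, leaving the prefix untouched.
    for i in range(n - 1, max(n - 1 - steps, 0), -1):
        j = rng.randint(0, i)   # pick a partner from the unshuffled prefix
        a[i], a[j] = a[j], a[i]
    return a

print(partially_shuffled(10, 0))   # [0, 1, ..., 9]: no steps, still sorted
print(partially_shuffled(10, 3, seed=42))   # only the tail has been disturbed
```

For profiling, you can sweep `steps` from 0 to n - 1 and measure the sort's running time (or its inversion count, per the previous answer) at each setting.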

In addition, if you only need a few essentially fixed sets of partially sorted data, you can generate a plot of position vs. value for each set and just eyeball it. This lets you quickly see the overall randomness of each set, as well as things like pockets of localized order.


You could also build a binary heap and use its array representation as a starting point. A binary heap stored in an array is not sorted, but it is ordered; I think that would qualify as "partially sorted."
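A quick sketch of this approach using Python's standard library `heapq` (a min-heap over a plain list): after `heapify`, every parent is no greater than its children, so the array is constrained but generally not sorted.

```python
import heapq

# In the array representation of a min-heap, a[i] <= a[2*i + 1] and
# a[i] <= a[2*i + 2] for every valid index: "ordered" but not sorted.
a = [9, 4, 7, 1, 8, 3, 6, 2, 5, 0]
heapq.heapify(a)   # in-place, O(n)
print(a)           # heap order; typically not fully sorted

# Verify the heap property for every parent/child pair.
ok = all(a[i] <= a[c]
         for i in range(len(a))
         for c in (2 * i + 1, 2 * i + 2)
         if c < len(a))
print(ok)          # True
```

This gives you a deterministic, reproducible "partially sorted" input, in contrast to the randomized partial-shuffle approach above.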

