Why is this merge version faster

Based on this answer, here are two versions of the merge function used for mergesort. Could you help me understand why the second is much faster. I tested it for a list of 50,000, and the second one is 8 times faster ( Gist ).

def merge1(left, right): i = j = inv = 0 merged = [] while i < len(left) and j < len(right): if left[i] <= right[j]: merged.append(left[i]) i += 1 else: merged.append(right[j]) j += 1 inv += len(left[i:]) merged += left[i:] merged += right[j:] return merged, inv 

.

 def merge2(array1, array2): inv = 0 merged_array = [] while array1 or array2: if not array1: merged_array.append(array2.pop()) elif (not array2) or array1[-1] > array2[-1]: merged_array.append(array1.pop()) inv += len(array2) else: merged_array.append(array2.pop()) merged_array.reverse() return merged_array, inv 

Here is the sort function:

 def _merge_sort(list, merge): len_list = len(list) if len_list < 2: return list, 0 middle = len_list / 2 left, left_inv = _merge_sort(list[:middle], merge) right, right_inv = _merge_sort(list[middle:], merge) l, merge_inv = merge(left, right) inv = left_inv + right_inv + merge_inv return l, inv 

.

 import numpy.random as nprnd test_list = nprnd.randint(1000, size=50000).tolist() test_list_tmp = list(test_list) merge_sort(test_list_tmp, merge1) test_list_tmp = list(test_list) merge_sort(test_list_tmp, merge2) 
+7
source share
3 answers

A similar answer as kreativitea above, but with additional information (I think!)

Profiling actual merge functions to merge two 50K arrays,

merge 1

  311748 function calls in 15.363 seconds Ordered by: standard name ncalls tottime percall cumtime percall filename:lineno(function) 1 0.001 0.001 15.363 15.363 <string>:1(<module>) 1 15.322 15.322 15.362 15.362 merge.py:3(merge1) 221309 0.030 0.000 0.030 0.000 {len} 90436 0.010 0.000 0.010 0.000 {method 'append' of 'list' objects} 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects} 

merge2

  250004 function calls in 0.104 seconds Ordered by: standard name ncalls tottime percall cumtime percall filename:lineno(function) 1 0.001 0.001 0.104 0.104 <string>:1(<module>) 1 0.074 0.074 0.103 0.103 merge.py:20(merge2) 50000 0.005 0.000 0.005 0.000 {len} 100000 0.010 0.000 0.010 0.000 {method 'append' of 'list' objects} 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects} 100000 0.014 0.000 0.014 0.000 {method 'pop' of 'list' objects} 1 0.000 0.000 0.000 0.000 {method 'reverse' of 'list' objects} 

So, for merge1 it is 221309 len , 90436 append and takes 15.363 seconds. Therefore, for merge2 it is 50,000 len , 100,000 append and 100000 pop and takes 0.104 seconds.

len and append pop are all O (1) (more info here ), so these profiles don’t show what actually takes time, since this is what happens, it should be faster, but only ~ 20%.

Well, the reason is actually pretty obvious if you just read the code:

The first method has a line like this:

 inv += len(left[i:]) 

therefore, every time it is called, it must rebuild the array. If you comment out this line (or just replace it with inv += 1 or something else), it will become faster than another method. This is the only line responsible for increasing the time.

Having noticed that this is the reason, the problem can be fixed by improving the code; change it to this for speed. After that it will be faster than merge2

 inv += len(left) - i 

Update it to this:

 def merge3(left, right): i = j = inv = 0 merged = [] while i < len(left) and j < len(right): if left[i] <= right[j]: merged.append(left[i]) i += 1 else: merged.append(right[j]) j += 1 inv += len(left) - i merged += left[i:] merged += right[j:] return merged, inv 
+10
source

You can use the excellent cProfile module to help you solve such things.

 >>> import cProfile >>> a = range(1,20000,2) >>> b = range(0,20000,2) >>> cProfile.run('merge1(a, b)') 70002 function calls in 0.195 seconds Ordered by: standard name ncalls tottime percall cumtime percall filename:lineno(function) 1 0.184 0.184 0.195 0.195 <pyshell#7>:1(merge1) 1 0.000 0.000 0.195 0.195 <string>:1(<module>) 50000 0.008 0.000 0.008 0.000 {len} 19999 0.003 0.000 0.003 0.000 {method 'append' of 'list' objects} 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects} >>> cProfile.run('merge2(a, b)') 50004 function calls in 0.026 seconds Ordered by: standard name ncalls tottime percall cumtime percall filename:lineno(function) 1 0.016 0.016 0.026 0.026 <pyshell#12>:1(merge2) 1 0.000 0.000 0.026 0.026 <string>:1(<module>) 10000 0.002 0.000 0.002 0.000 {len} 20000 0.003 0.000 0.003 0.000 {method 'append' of 'list' objects} 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects} 20000 0.005 0.000 0.005 0.000 {method 'pop' of 'list' objects} 1 0.000 0.000 0.000 0.000 {method 'reverse' of 'list' objects} 

With a little glance at the information, it seems that the commenters are correct - this is not a len function - this is a string module. The string module is called when comparing the length of things as follows:

 >>> cProfile.run('0 < len(c)') 3 function calls in 0.000 seconds Ordered by: standard name ncalls tottime percall cumtime percall filename:lineno(function) 1 0.000 0.000 0.000 0.000 <string>:1(<module>) 1 0.000 0.000 0.000 0.000 {len} 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects} 

It is also called when slicing a list, but it is a very fast operation.

 >>> len(c) 20000000 >>> cProfile.run('c[3:2000000]') 2 function calls in 0.011 seconds Ordered by: standard name ncalls tottime percall cumtime percall filename:lineno(function) 1 0.011 0.011 0.011 0.011 <string>:1(<module>) 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects} 

TL; DR: something in the line module takes 0.195s in your first function and 0.026s in your second function. : apparently rebuilding the array in inv += len(left[i:]) this line.

+3
source

If I were to guess, I would say that this is probably due to the cost of removing items from the list, removing from the end (pop) is faster than deleting from the very beginning. The second method removes items from the end of the list.

See performance notes: http://effbot.org/zone/python-list.htm

"The time it takes to delete an item is about the same as the time it takes to insert an item at the same place, removing items at the end is fast, removing items at the beginning is slow."

0
source

All Articles