Comparison of the list of lists and explicit loops (3 array generators are faster than 1 for a loop)

Question

Comparison of the list of lists and explicit loops (3 array generators are faster than 1 for a loop)

I did my homework, and I accidentally discovered a strange inconsistency in the speed of the algorithm. Here are 2 versions of the code for the same bur function with 1 difference: in the first version, I use a 3x array generator to filter some array, and in the second version I use 1 for a loop with 3 if statements to do the same filter job.

So here is the version 1 code:

def kth_order_statistic(array, k): pivot = (array[0] + array[len(array) - 1]) // 2 l = [x for x in array if x < pivot] m = [x for x in array if x == pivot] r = [x for x in array if x > pivot] if k <= len(l): return kth_order_statistic(l, k) elif k > len(l) + len(m): return kth_order_statistic(r, k - len(l) - len(m)) else: return m[0]

And here is the code for the second version:

 def kth_order_statistic2(array, k): pivot = (array[0] + array[len(array) - 1]) // 2 l = [] m = [] r = [] for x in array: if x < pivot: l.append(x) elif x > pivot: r.append(x) else: m.append(x) if k <= len(l): return kth_order_statistic2(l, k) elif k > len(l) + len(m): return kth_order_statistic2(r, k - len(l) - len(m)) else: return m[0]

IPython for version 1:

 In [4]: %%timeit ...: A = range(100000) ...: shuffle(A) ...: k = randint(1, len(A)-1) ...: order_statisctic(A, k) ...: 10 loops, best of 3: 120 ms per loop

And for the second version:

 In [5]: %%timeit ...: A = range(100000) ...: shuffle(A) ...: k = randint(1, len(A)-1) ...: kth_order_statistic2(A, k) ...: 10 loops, best of 3: 169 ms per loop

So why is the first version faster than the second? I also made a third version using the filter () function instead of an array generator, and it was slower than the second version (it got 218 ms per cycle)

+6

python arrays python-2.7 time list-comprehension

KgOfHedgehogs 15 sept. '16 at 19:28

source share

4 answers

Using a simple for is faster than list comprehesion . It is almost 2 times faster. Check the results below:

Using list comprehension : 58 usec

 moin@moin-pc :~$ python -m timeit "[i for i in range(1000)]" 10000 loops, best of 3: 58 usec per loop

Using a for loop: 37.1 usec

 moin@moin-pc :~$ python -m timeit "for i in range(1000): i" 10000 loops, best of 3: 37.1 usec per loop

But in your case, for takes longer than understanding the list, not because YOUR for loop is slower. But because of .append() you are using code.

From append() to for loop`: 114 usec

 moin@moin-pc :~$ python -m timeit "my_list = []" "for i in range(1000): my_list.append(i)" 10000 loops, best of 3: 114 usec per loop

Which clearly shows that it is .append() , which takes twice as much time spent on the for loop .

However, on storing the "list.append" in different variable : 69.3 usec

 moin@moin-pc :~$ python -m timeit "my_list = []; append = my_list.append" "for i in range(1000): append(i)" 10000 loops, best of 3: 69.3 usec per loop

There is a significant performance improvement over the last case in the comparison above, and the result is quite comparable to the result of list comprehension . This means that instead of calling my_list.append() performance can be improved each time by storing a function reference in another variable ie append_func = my_list.append and making a call using this variable append_func(i) .

Which also proves that it is faster to call a class function stored in a variable compared to directly calling a function using a class object .

Thank you Stefan for recording the latest case.

+8

Moinuddin quadri 15 sept. '16 at 19:32

source share

Let this doubt dissipate: the second version is a little faster: list recognition is faster , but two loop cycles and as many conventions are discarded in one iteration.

 def kth_order_statistic1(array,k): pivot = (array[0] + array[len(array) - 1]) // 2 l = [x for x in array if x < pivot] m = [x for x in array if x == pivot] r = [x for x in array if x > pivot] if k <= len(l): return kth_order_statistic1(l, k) elif k > len(l) + len(m): return kth_order_statistic1(r, k - len(l) - len(m)) else: return m[0] def kth_order_statistic2(array,k): pivot = (array[0] + array[len(array) - 1]) // 2 l = [] m = [] r = [] for x in array: if x < pivot: l.append(x) elif x > pivot: r.append(x) else: m.append(x) if k <= len(l): return kth_order_statistic2(l, k) elif k > len(l) + len(m): return kth_order_statistic2(r, k - len(l) - len(m)) else: return m[0] def kth_order_statistic3(array,k): pivot = (array[0] + array[len(array) - 1]) // 2 l = [] m = [] r = [] for x in array: if x < pivot: l.append(x) for x in array: if x== pivot: m.append(x) for x in array: if x > pivot: r.append(x) if k <= len(l): return kth_order_statistic3(l, k) elif k > len(l) + len(m): return kth_order_statistic3(r, k - len(l) - len(m)) else: return m[0] import time import random if __name__ == '__main__': A = range(100000) random.shuffle(A) k = random.randint(1, len(A)-1) start_time = time.time() for x in range(1000) : kth_order_statistic1(A,k) print("--- %s seconds ---" % (time.time() - start_time)) start_time = time.time() for x in range(1000) : kth_order_statistic2(A,k) print("--- %s seconds ---" % (time.time() - start_time)) start_time = time.time() for x in range(1000) : kth_order_statistic3(A,k) print("--- %s seconds ---" % (time.time() - start_time))

 python : --- 25.8894710541 seconds --- --- 24.073086977 seconds --- --- 32.9823839664 seconds --- ipython --- 25.7450709343 seconds --- --- 22.7140650749 seconds --- --- 35.2958850861 seconds ---

The time may vary depending on a random draw, but the differences between the three are almost the same.

+3

Flint 15 sept. '16 at 20:59

source share

The algorithm structure is different and the conditional structure must be charged. the test added to r and m may be dropped by the previous test. A stricter comparison in the for loop with append , and list comprehension will be against the non-optimal

 for x in array: if x < pivot: l.append(x) for x in array: if x== pivot: m.append(x) for x in array: if x > pivot: r.append(x)

+2

Flint 15 sept. '16 at 19:46

source share

Riccardo petraglia · Accepted Answer · 2016-09-15T20:17:32+0000

We define the functions that we need to answer the question and time:

 In [18]: def iter(): l = [x for x in range(100) if x > 10] ....: In [19]: %timeit iter() 100000 loops, best of 3: 7.92 µs per loop In [20]: def loop(): l = [] for x in range(100): if x > 10: l.append(x) ....: In [21]: %timeit loop() 10000 loops, best of 3: 20 µs per loop In [22]: def loop_fast(): l = [] for x in range(100): if x > 10: pass ....: In [23]: %timeit loop_fast() 100000 loops, best of 3: 4.69 µs per loop

we see that for loops without the append command are as fast as list comprehension. In fact, if we look at the bytecode, we will see that if we understand the list, python can use the built-in bytecode command LIST_APPEND instead:

Download list: 40 LOAD_FAST
Download attribute: 43 LOAD_ATTRIBUTE
Call the loaded function: 49 CALL_FUNCTION
Upload list (?): 52 POP_TOP

As you can see from the output below, the previous bytecode is missing with the list and the loop_fast function. Comparing the time of the three functions it is clear that they are responsible for the different times of the three methods.

 In [27]: dis.dis(iter) 2 0 BUILD_LIST 0 3 LOAD_GLOBAL 0 (range) 6 LOAD_CONST 1 (1) 9 LOAD_CONST 2 (100) 12 CALL_FUNCTION 2 15 GET_ITER >> 16 FOR_ITER 24 (to 43) 19 STORE_FAST 0 (x) 22 LOAD_FAST 0 (x) 25 LOAD_CONST 2 (100) 28 COMPARE_OP 4 (>) 31 POP_JUMP_IF_FALSE 16 34 LOAD_FAST 0 (x) 37 LIST_APPEND 2 40 JUMP_ABSOLUTE 16 >> 43 STORE_FAST 1 (l) 46 LOAD_CONST 0 (None) 49 RETURN_VALUE In [28]: dis.dis(loop) 2 0 BUILD_LIST 0 3 STORE_FAST 0 (1) 3 6 SETUP_LOOP 51 (to 60) 9 LOAD_GLOBAL 0 (range) 12 LOAD_CONST 1 (1) 15 LOAD_CONST 2 (100) 18 CALL_FUNCTION 2 21 GET_ITER >> 22 FOR_ITER 34 (to 59) 25 STORE_FAST 1 (x) 4 28 LOAD_FAST 1 (x) 31 LOAD_CONST 3 (10) 34 COMPARE_OP 4 (>) 37 POP_JUMP_IF_FALSE 22 5 40 LOAD_FAST 0 (l) 43 LOAD_ATTR 1 (append) 46 LOAD_FAST 1 (x) 49 CALL_FUNCTION 1 52 POP_TOP 53 JUMP_ABSOLUTE 22 56 JUMP_ABSOLUTE 22 >> 59 POP_BLOCK >> 60 LOAD_CONST 0 (None) 63 RETURN_VALUE In [29]: dis.dis(loop_fast) 2 0 BUILD_LIST 0 3 STORE_FAST 0 (1) 3 6 SETUP_LOOP 38 (to 47) 9 LOAD_GLOBAL 0 (range) 12 LOAD_CONST 1 (1) 15 LOAD_CONST 2 (100) 18 CALL_FUNCTION 2 21 GET_ITER >> 22 FOR_ITER 21 (to 46) 25 STORE_FAST 1 (x) 4 28 LOAD_FAST 1 (x) 31 LOAD_CONST 3 (10) 34 COMPARE_OP 4 (>) 37 POP_JUMP_IF_FALSE 22 5 40 JUMP_ABSOLUTE 22 43 JUMP_ABSOLUTE 22 >> 46 POP_BLOCK >> 47 LOAD_CONST 0 (None) 50 RETURN_VALUE

Comparison of the list of lists and explicit loops (3 array generators are faster than 1 for a loop)

More articles: