Why is numpy slower than pure Python? How can I make this code more efficient?

I am rewriting my neural network from pure python to numpy, but now it works even slower. So I tried these two functions:

    def d():
        a = [1,2,3,4,5]
        b = [10,20,30,40,50]
        c = [i*j for i,j in zip(a,b)]
        return c

    def e():
        a = np.array([1,2,3,4,5])
        b = np.array([10,20,30,40,50])
        c = a*b
        return c

timeit d = 1.77135205057

timeit e = 17.2464673758

The numpy version is 10 times slower. Why is this, and how do I use numpy correctly?

4 answers

I would suggest the discrepancy is because you build both lists and arrays in e , whereas you only build lists in d . Consider:

    import numpy as np

    def d():
        a = [1,2,3,4,5]
        b = [10,20,30,40,50]
        c = [i*j for i,j in zip(a,b)]
        return c

    def e():
        a = np.array([1,2,3,4,5])
        b = np.array([10,20,30,40,50])
        c = a*b
        return c

    # Warning: functions with mutable default arguments below.
    # This is only for benchmarking and would be bad practice in production!
    def f(a=[1,2,3,4,5], b=[10,20,30,40,50]):
        c = [i*j for i,j in zip(a,b)]
        return c

    def g(a=np.array([1,2,3,4,5]), b=np.array([10,20,30,40,50])):
        c = a*b
        return c

    import timeit
    print(timeit.timeit('d()', 'from __main__ import d'))
    print(timeit.timeit('e()', 'from __main__ import e'))
    print(timeit.timeit('f()', 'from __main__ import f'))
    print(timeit.timeit('g()', 'from __main__ import g'))

Here the functions f and g avoid re-creating lists / arrays every time, and we get very similar performance:

    1.53083586693
    15.8963699341
    1.33564996719
    1.69556999207

Note that the list comprehension + zip still wins. However, if we make the arrays large enough, numpy wins hands down:

    t1 = [1,2,3,4,5] * 100
    t2 = [10,20,30,40,50] * 100
    t3 = np.array(t1)
    t4 = np.array(t2)
    print(timeit.timeit('f(t1,t2)', 'from __main__ import f,t1,t2', number=10000))
    print(timeit.timeit('g(t3,t4)', 'from __main__ import g,t3,t4', number=10000))

My results:

    0.602419137955
    0.0263929367065
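To double-check that array construction dominates for tiny inputs, you can time the np.array calls and the multiplication separately. A rough sketch (the absolute numbers will vary by machine; this is illustrative, not from the original answer):

```python
import timeit

# Time only the construction of the two small arrays.
build = timeit.timeit(
    "np.array([1, 2, 3, 4, 5]); np.array([10, 20, 30, 40, 50])",
    setup="import numpy as np",
    number=100000,
)

# Time only the element-wise multiplication, with the arrays prebuilt.
mult = timeit.timeit(
    "a * b",
    setup="import numpy as np; "
          "a = np.array([1, 2, 3, 4, 5]); b = np.array([10, 20, 30, 40, 50])",
    number=100000,
)

print(build, mult)
```

On a typical machine the construction time dwarfs the multiplication time, which is exactly why moving the np.array calls out of the benchmarked function changes the ranking.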
    import time, numpy

    def d():
        a = range(100000)
        b = range(0, 1000000, 10)
        c = [i*j for i,j in zip(a,b)]
        return c

    def e():
        a = numpy.array(range(100000))
        b = numpy.array(range(0, 1000000, 10))
        c = a*b
        return c

    # python ['0.04s', '0.04s', '0.04s']
    # numpy  ['0.02s', '0.02s', '0.02s']

Try it with large arrays ... even with the overhead of creating the arrays, numpy is much faster.


Numpy data structures are slower when appending / building element by element.

Here are some tests:

    from timeit import Timer

    setup1 = 'import numpy as np; a = np.array([])'
    stmnt1 = 'np.append(a, 1)'
    t1 = Timer(stmnt1, setup1)

    setup2 = 'l = list()'
    stmnt2 = 'l.append(1)'
    t2 = Timer(stmnt2, setup2)

    print('appending to empty list:')
    print(t1.repeat(number=1000))
    print(t2.repeat(number=1000))

    setup1 = 'import numpy as np; a = np.array(range(999999))'
    stmnt1 = 'np.append(a, 1)'
    t1 = Timer(stmnt1, setup1)

    setup2 = 'l = list(range(999999))'
    stmnt2 = 'l.append(1)'
    t2 = Timer(stmnt2, setup2)

    print('appending to large list:')
    print(t1.repeat(number=1000))
    print(t2.repeat(number=1000))

Results:

    appending to empty list:
    [0.008171333983972538, 0.0076482562944814175, 0.007862921943675175]
    [0.00015624398517267296, 0.0001191077336243837, 0.000118654852507942]
    appending to large list:
    [2.8521017080411304, 2.8518707386717446, 2.8022625940577477]
    [0.0001643958452675065, 0.00017888804099541744, 0.00016711313196715594]
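The practical takeaway (an illustrative sketch, not from the original answer): np.append copies the whole array on every call, so don't grow a numpy array in a loop. Accumulate in a Python list and convert once, or preallocate when the final size is known:

```python
import numpy as np

n = 100_000

# Pattern 1: accumulate in a Python list (O(1) amortized append),
# then convert to an array once at the end.
values = []
for i in range(n):
    values.append(i * 2)
arr_from_list = np.array(values)

# Pattern 2: preallocate when the final size is known up front.
arr_prealloc = np.empty(n, dtype=np.int64)
for i in range(n):
    arr_prealloc[i] = i * 2

assert np.array_equal(arr_from_list, arr_prealloc)
```

Either pattern avoids the quadratic cost that repeated np.append incurs in the timings above.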

I don't think numpy is slow, because you also have to take into account the time it takes to write and debug code. The longer the program, the harder it is to find problems or add new features (programmer time). So using a higher-level language allows you, with the same intelligence and skill, to build complex software, and is potentially more efficient overall.

In any case, some interesting optimization tools:

- Psyco is a JIT ("just in time") compiler that optimizes code while it runs.

- Numexpr evaluates array expressions in parallel, which is a good way to speed up execution provided the work is sufficiently separable.

- weave is a module inside SciPy for interfacing Python with C. One of its functions is blitz, which takes a line of Python, transparently translates it to C, and calls the optimized version on every subsequent call. The first conversion takes about a second, but later calls are usually much faster: unlike Numexpr or Psyco bytecode, or a C interface like NumPy's, the result is your own function, written directly in C, fully compiled and optimized.
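Of the tools above, Numexpr is the one most likely to be useful today. A minimal sketch of how it is called, falling back to plain numpy if the numexpr package is not installed:

```python
import numpy as np

a = np.arange(1_000_000, dtype=np.float64)
b = np.arange(1_000_000, dtype=np.float64)

try:
    import numexpr as ne
    # numexpr compiles the expression string and evaluates it with
    # multiple threads, without allocating a temporary array for
    # every sub-expression the way plain numpy does.
    result = ne.evaluate("2*a + 3*b")
except ImportError:
    # Fallback: the equivalent plain-numpy expression.
    result = 2*a + 3*b

assert np.allclose(result, 2*a + 3*b)
```

The gain over plain numpy grows with array size and expression complexity, since numpy evaluates `2*a + 3*b` in several passes with intermediate arrays.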

