Pure Python faster than NumPy? Can I make this NumPy code faster?

I need to calculate the min, max, and mean of a specific list of faces/vertices. I tried to optimize these calculations using NumPy, but without success.

Here is my test case:

    #!/usr/bin/python
    # -*- coding: iso-8859-15 -*-
    '''
    Module Started 22 Feb. 2013
    @note: test case comparing numpy vs python
    @author: Python4D/damien
    '''
    import numpy as np
    import time

    def Fnumpy(vertices):
        np_vertices = np.array(vertices)
        _x = np_vertices[:, :, 0]
        _y = np_vertices[:, :, 1]
        _z = np_vertices[:, :, 2]
        _min = [np.min(_x), np.min(_y), np.min(_z)]
        _max = [np.max(_x), np.max(_y), np.max(_z)]
        _mean = [np.mean(_x), np.mean(_y), np.mean(_z)]
        return _mean, _max, _min

    def Fpython(vertices):
        list_x = [item[0] for sublist in vertices for item in sublist]
        list_y = [item[1] for sublist in vertices for item in sublist]
        list_z = [item[2] for sublist in vertices for item in sublist]
        taille = len(list_x)
        _mean = [sum(list_x)/taille, sum(list_y)/taille, sum(list_z)/taille]
        _max = [max(list_x), max(list_y), max(list_z)]
        _min = [min(list_x), min(list_y), min(list_z)]
        return _mean, _max, _min

    if __name__ == "__main__":
        vertices = [[[1.1, 2.2, 3.3, 4.4]]*4]*1000000
        _t = time.clock()
        print ">>NUMPY >>{} for {}s.".format(Fnumpy(vertices), time.clock()-_t)
        _t = time.clock()
        print ">>PYTHON>>{} for {}s.".format(Fpython(vertices), time.clock()-_t)

Results:

Numpy:

    ([1.1000000000452519, 2.2000000000905038, 3.3000000001880174],
     [1.1000000000000001, 2.2000000000000002, 3.2999999999999998],
     [1.1000000000000001, 2.2000000000000002, 3.2999999999999998]) for 27.327068618s.

Python:

    ([1.100000000045252, 2.200000000090504, 3.3000000001880174],
     [1.1, 2.2, 3.3],
     [1.1, 2.2, 3.3]) for 1.81366938593s.

Pure Python is 15 times faster than NumPy!

2 answers

The reason your Fnumpy is slower is that it contains an extra step not performed by Fpython: creating a NumPy array in memory. If you move the line np_vertices=np.array(vertices) outside of Fnumpy and outside the timed section, your results will be different:

    >>NUMPY >>([1.1000000000452519, 2.2000000000905038, 3.3000000001880174], [1.1000000000000001, 2.2000000000000002, 3.2999999999999998], [1.1000000000000001, 2.2000000000000002, 3.2999999999999998]) for 0.500802s.
    >>PYTHON>>([1.100000000045252, 2.200000000090504, 3.3000000001880174], [1.1, 2.2, 3.3], [1.1, 2.2, 3.3]) for 2.182239s.
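That restructuring can be sketched as follows. This is a Python 3 sketch, not the original Python 2 script: time.perf_counter() stands in for the long-removed time.clock(), and the array is much smaller than the original million-face case so it runs quickly.

```python
import time
import numpy as np

def Fnumpy(np_vertices):
    # Operates on an existing array; no list-to-array conversion inside.
    _x = np_vertices[:, :, 0]
    _y = np_vertices[:, :, 1]
    _z = np_vertices[:, :, 2]
    _min = [np.min(_x), np.min(_y), np.min(_z)]
    _max = [np.max(_x), np.max(_y), np.max(_z)]
    _mean = [np.mean(_x), np.mean(_y), np.mean(_z)]
    return _mean, _max, _min

# Smaller than the original 1,000,000 faces, purely so the sketch runs fast.
vertices = [[[1.1, 2.2, 3.3, 4.4]] * 4] * 10000

np_vertices = np.array(vertices)   # conversion done once, outside the timed section
_t = time.perf_counter()
result = Fnumpy(np_vertices)
print(">>NUMPY >>{} for {}s.".format(result, time.perf_counter() - _t))
```

Only the reductions are timed here; the conversion cost is paid once up front, which is exactly the difference the measurements above show.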

You can also significantly speed up the array-creation step by giving NumPy a data-type hint when creating the array. If you tell NumPy you have an array of floats, then even if you leave the np.array() call inside the timed section, it will beat the pure Python version.

If I change np_vertices=np.array(vertices) to np_vertices=np.array(vertices, dtype=np.float_) and keep it inside Fnumpy, the Fnumpy version beats Fpython, even though it is doing much more work:

    >>NUMPY >>([1.1000000000452519, 2.2000000000905038, 3.3000000001880174], [1.1000000000000001, 2.2000000000000002, 3.2999999999999998], [1.1000000000000001, 2.2000000000000002, 3.2999999999999998]) for 1.586066s.
    >>PYTHON>>([1.100000000045252, 2.200000000090504, 3.3000000001880174], [1.1, 2.2, 3.3], [1.1, 2.2, 3.3]) for 2.196787s.
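The effect of the dtype hint can be measured in isolation. A minimal Python 3 sketch (np.float64 is used instead of the deprecated np.float_ alias; the list is smaller than the original, and how much the hint helps will vary by NumPy version and machine, so no particular ratio is claimed):

```python
import time
import numpy as np

# Smaller than the question's 1,000,000 faces so the sketch runs quickly.
vertices = [[[1.1, 2.2, 3.3, 4.4]] * 4] * 10000

_t = time.perf_counter()
a = np.array(vertices)                     # NumPy must inspect elements to infer a dtype
t_inferred = time.perf_counter() - _t

_t = time.perf_counter()
b = np.array(vertices, dtype=np.float64)   # the hint lets NumPy skip dtype inference
t_hinted = time.perf_counter() - _t

print("inferred: {:.4f}s  hinted: {:.4f}s".format(t_inferred, t_hinted))
```

Both calls produce the same float64 array; only the conversion path differs.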

As others have already pointed out, your problem is the conversion from a list to an array. If you build the array with the appropriate NumPy functions instead, you will beat pure Python. I changed the main part of your program:

    if __name__ == "__main__":
        _t = time.clock()
        vertices_np = np.resize(np.array([1.1, 2.2, 3.3, 4.4], dtype=np.float64),
                                (1000000, 4, 4))
        print "Creating numpy vertices: {}".format(time.clock() - _t)
        _t = time.clock()
        vertices = [[[1.1, 2.2, 3.3, 4.4]]*4]*1000000
        print "Creating python vertices: {}".format(time.clock() - _t)
        _t = time.clock()
        print ">>NUMPY >>{} for {}s.".format(Fnumpy(vertices_np), time.clock()-_t)
        _t = time.clock()
        print ">>PYTHON>>{} for {}s.".format(Fpython(vertices), time.clock()-_t)

Running the code with the modified main part on my computer results in:

    Creating numpy vertices: 0.6
    Creating python vertices: 0.01
    >>NUMPY >>([1.1000000000452519, 2.2000000000905038, 3.3000000001880174], [1.1000000000000001, 2.2000000000000002, 3.2999999999999998], [1.1000000000000001, 2.2000000000000002, 3.2999999999999998]) for 0.5s.
    >>PYTHON>>([1.100000000045252, 2.200000000090504, 3.3000000001880174], [1.1, 2.2, 3.3], [1.1, 2.2, 3.3]) for 1.91s.

Although creating the array still takes somewhat longer with NumPy tools than building nested lists with Python's list multiplication operator (0.6 s versus 0.01 s), you gain a factor of roughly 4 in the computation itself. If I replace the line:

 np_vertices=np.array(vertices) 

with

 np_vertices = np.asarray(vertices) 

to avoid copying the large array (np.asarray returns its input unchanged when it is already an ndarray of the right type), the running time of the numpy function even drops to 0.37 s on my machine, more than 5 times faster than the pure Python version.
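The np.asarray behaviour is easy to check directly: for a list input a copy is unavoidable either way, but for an existing ndarray np.asarray hands back the very same object, while np.array copies by default. A minimal sketch:

```python
import numpy as np

lst = [[1.0, 2.0], [3.0, 4.0]]

arr = np.asarray(lst)      # list input: a conversion (and copy) is unavoidable
same = np.asarray(arr)     # ndarray input: returned as-is, no copy made
copied = np.array(arr)     # np.array copies by default even for ndarray input

assert same is arr         # literally the same object
assert copied is not arr   # a fresh array with the same contents
```

This is why moving to np.asarray only pays off when the input is already an array; it changes nothing for list inputs.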

In your real code, if you know the number of vertices in advance, you can pre-allocate the corresponding array using np.empty(), fill it with the appropriate data, and pass it to the NumPy version of your function.
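A minimal sketch of that pre-allocation pattern. The n_faces count, the shape (4 vertices per face, 3 coordinates each), and the dummy fill value are all hypothetical stand-ins for whatever the real code knows in advance:

```python
import numpy as np

n_faces = 1000  # hypothetical: the count known in advance in the real code

# Allocate once; no intermediate Python lists are ever built.
np_vertices = np.empty((n_faces, 4, 3), dtype=np.float64)

# Fill with data as it becomes available (here: one dummy vertex,
# broadcast over the first two axes).
np_vertices[:] = [1.1, 2.2, 3.3]

# The pre-filled array can go straight into the array-only reductions.
mins = np_vertices.min(axis=(0, 1))
maxs = np_vertices.max(axis=(0, 1))
means = np_vertices.mean(axis=(0, 1))
print(mins, maxs, means)
```

With the axis=(0, 1) tuple the three reductions each collapse the face and vertex axes at once, leaving one value per coordinate, so the separate _x/_y/_z slicing is not even needed.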
