As others have already pointed out, your problem is the conversion from a list to an array. If you build the array directly with the appropriate NumPy functions, NumPy beats pure Python. I changed the main part of your program:
if __name__ == "__main__":
    _t = time.clock()
    vertices_np = np.resize(np.array([1.1, 2.2, 3.3, 4.4], dtype=np.float64), (1000000, 4, 4))
    print "Creating numpy vertices: {}".format(time.clock() - _t)
    _t = time.clock()
    vertices = [[[1.1, 2.2, 3.3, 4.4]] * 4] * 1000000
    print "Creating python vertices: {}".format(time.clock() - _t)
    _t = time.clock()
    print ">>NUMPY >>{} for {}s.".format(Fnumpy(vertices_np), time.clock() - _t)
    _t = time.clock()
    print ">>PYTHON>>{} for {}s.".format(Fpython(vertices), time.clock() - _t)
Running the code with the modified main part on my computer results in:
Creating numpy vertices: 0.6
Creating python vertices: 0.01
>>NUMPY >>([1.1000000000452519, 2.2000000000905038, 3.3000000001880174], [1.1000000000000001, 2.2000000000000002, 3.2999999999999998], [1.1000000000000001, 2.2000000000000002, 3.2999999999999998]) for 0.5s.
>>PYTHON>>([1.100000000045252, 2.200000000090504, 3.3000000001880174], [1.1, 2.2, 3.3], [1.1, 2.2, 3.3]) for 1.91s.
Although creating the array with NumPy tools still takes noticeably longer than creating the nested lists with Python's list multiplication operator (0.6 s versus 0.01 s), the part of the code that does the actual computation runs roughly 4 times faster (0.5 s versus 1.91 s). If I replace the line:
np_vertices=np.array(vertices)
with
np_vertices = np.asarray(vertices)
to avoid copying a large array, the running time of the NumPy function drops even further, to 0.37 s on my machine, which is more than 5 times faster than the pure Python version.
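To illustrate why the swap helps, here is a minimal sketch of the difference between the two calls when the input is already an ndarray. It uses a much smaller array than the one in the timings, and the variable names other than vertices_np are only for illustration:

```python
import numpy as np

# Small stand-in for the large vertex array used in the timings above.
vertices_np = np.resize(
    np.array([1.1, 2.2, 3.3, 4.4], dtype=np.float64), (1000, 4, 4)
)

# np.array copies its input by default, so this allocates a second array.
copied = np.array(vertices_np)

# np.asarray hands back the input itself when it is already an ndarray
# of a compatible dtype, so no data is copied.
aliased = np.asarray(vertices_np)

is_same_object = aliased is vertices_np
copy_shares_memory = np.shares_memory(copied, vertices_np)
```

Note that the saving only applies when an array is passed in. If the argument is a nested Python list, np.asarray still has to convert it element by element, just like np.array.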
In your real code, if you know the number of vertices in advance, you can preallocate the corresponding array with np.empty(), fill it with the appropriate data, and pass it to the NumPy version of your function.
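A rough sketch of that preallocation pattern might look like the following. The vertex count and the make_vertex helper are hypothetical placeholders for wherever your real data comes from:

```python
import numpy as np

n_vertices = 1000  # hypothetical: the count must be known up front


def make_vertex(i):
    # Hypothetical data source; here every vertex is the same 4x4 block
    # that the benchmark above uses.
    return [[1.1, 2.2, 3.3, 4.4]] * 4


# np.empty allocates the memory without initializing it, so every
# element has to be written before the array is read.
vertices_np = np.empty((n_vertices, 4, 4), dtype=np.float64)

for i in range(n_vertices):
    # Each assignment writes one 4x4 block straight into the array,
    # so no large intermediate nested list is ever built.
    vertices_np[i] = make_vertex(i)
```

The filled array can then be handed to the NumPy version of the function directly, without any list-to-array conversion.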