Random number generation with NumPy is slower after vectorization

I noticed that trying to speed up NumPy code that generates a large number of random numbers by vectorizing Python for loops can have the opposite effect and slow it down. The code below prints "took time 0.588" for the loop version and "took time 0.789" for the vectorized version. This runs counter to my intuition about how best to write NumPy code. Why would this be so?

import time
import numpy as np

N = 50000
M = 1000
repeats = 10

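# Loop version: M separate calls, each returning N random integers.
start = time.time()
for i in range(repeats):
    for j in range(M):
        r = np.random.randint(0, N, size=N)
print('took time', (time.time() - start) / repeats)

# Vectorized version: one call returning an (N, M) array.
start = time.time()
for i in range(repeats):
    r = np.random.randint(0, N, size=(N, M))
print('took time', (time.time() - start) / repeats)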
1 answer

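IMO, your comparison is not entirely fair: the loop version throws each 1D array away, while the vectorized version has to produce a complete 2D array. What about also measuring the time needed to build a 2D array from the list of 1D arrays?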

In [127]: %timeit np.random.randint(0,N,size=(N,M))
1.32 s ± 24.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [128]: %timeit np.column_stack([np.random.randint(0,N,size=N) for _ in range(M)])
2.73 s ± 135 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
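
For readers without IPython, the same comparison can be sketched as a plain script. This is only an illustration under stated assumptions: the function names (loop_discard, loop_stacked, vectorized, time_per_call) are hypothetical, the sizes are reduced from the question's N = 50000 and M = 1000 so that it runs quickly, and the absolute timings will differ by machine and NumPy version.

```python
import timeit
import numpy as np

# Assumption: sizes reduced from the question's N=50000, M=1000 for a quick run.
N = 5000
M = 100

def loop_discard():
    # The question's loop: each 1D array is generated and then overwritten.
    for _ in range(M):
        r = np.random.randint(0, N, size=N)
    return r

def loop_stacked():
    # Same loop, but the 1D arrays are kept and assembled into an (N, M) array.
    return np.column_stack([np.random.randint(0, N, size=N) for _ in range(M)])

def vectorized():
    # One call producing the full (N, M) array.
    return np.random.randint(0, N, size=(N, M))

def time_per_call(func, repeats=5):
    # Best-of-`repeats` wall time for a single call, in seconds.
    return min(timeit.repeat(func, number=1, repeat=repeats))

timings = {f.__name__: time_per_call(f)
           for f in (loop_discard, loop_stacked, vectorized)}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
```

Comparing loop_stacked with vectorized is the like-for-like measurement, since both end with an (N, M) array in memory; loop_discard reproduces what the question actually timed.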