This is not a trivial problem, because the data is not aligned: the runs of zeros have different lengths. Performance depends on how long the sequences are. Take the example of a square problem, with many long regular and zero sequences (n_iter == n_reg == lag_mean):
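
    import numpy as np

    n_iter = 1000
    n_reg = 1000
    # np.int was removed from recent NumPy releases; the builtin int is equivalent
    regular_sequence = np.arange(n_reg, dtype=int)
    lag_mean = n_reg
    # lag_seq is used below but not defined in this excerpt.
    # Assumption: random integer lags with mean lag_mean.
    lag_seq = np.random.poisson(lag_mean, n_iter)
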
First your solution:

    def seq_hybrid():
        # One small array per lag: a run of zeros followed by the regular block
        seqs = [np.concatenate((np.zeros(x, dtype=int), regular_sequence))
                for x in lag_seq]
        seq = np.concatenate(seqs)
        return seq

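Then a pure NumPy one:

    def seq_numpy():
        # Allocate the whole output as zeros, then overwrite the regular blocks
        seq = np.zeros(lag_seq.sum() + n_iter * n_reg, dtype=int)
        # Start offset of each regular block in the output
        cs = np.cumsum(lag_seq + n_reg) - n_reg
        # 2-D index array: one row of n_reg positions per block
        indexes = np.add.outer(cs, np.arange(n_reg))
        seq[indexes] = regular_sequence
        return seq
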
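To see what the index computation in the NumPy version does, here is a small worked sketch. The toy sizes and values are chosen only for illustration (they are not the benchmark sizes), and the regular block is 1..4 rather than 0..3 so that it can be told apart from the zero padding.

```python
import numpy as np

# Toy values, for illustration only
lag_seq = np.array([2, 0, 3])               # zeros to put before each regular block
n_reg = 4
regular_sequence = np.arange(1, n_reg + 1)  # [1 2 3 4]

# Each block occupies lag + n_reg slots; the cumulative sum gives the end
# of each block, and subtracting n_reg gives where its regular part starts.
cs = np.cumsum(lag_seq + n_reg) - n_reg
print(cs)       # [ 2  6 13]

# Outer sum: row i holds the n_reg output positions of block i.
indexes = np.add.outer(cs, np.arange(n_reg))
print(indexes)  # [[ 2  3  4  5] [ 6  7  8  9] [13 14 15 16]]

# Fancy indexing assignment; regular_sequence is broadcast over the rows.
seq = np.zeros(lag_seq.sum() + len(lag_seq) * n_reg, dtype=int)
seq[indexes] = regular_sequence
print(seq)      # [0 0 1 2 3 4 1 2 3 4 0 0 0 1 2 3 4]
```

Note that `indexes` holds one integer per element of the regular blocks, so it is a temporary array of n_iter * n_reg entries.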
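A plain for-loop solution:

    def seq_python():
        seq = np.empty(lag_seq.sum() + n_iter * n_reg, dtype=int)
        i = 0  # write position in the output
        for lag in lag_seq:
            # Write the run of zeros
            for k in range(lag):
                seq[i] = 0
                i += 1
            # Copy the regular block element by element
            for k in range(n_reg):
                seq[i] = regular_sequence[k]
                i += 1
        return seq
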
And a just-in-time compilation of it with numba:

    from numba import jit

    # Compile the plain loop version just in time
    seq_numba = jit(seq_python)

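Before timing, it may be worth checking that the variants produce the same array. The following is a minimal sketch under stated assumptions: the functions are parameterized copies of the ones above (the names `hybrid`, `fancy` and `loop` are made up here), the sizes are small, and the lags are drawn from a Poisson distribution purely as an example. The numba version is left out so that the check runs with NumPy alone.

```python
import numpy as np

def hybrid(lag_seq, regular_sequence):
    # List of (zeros + regular block) pieces, concatenated at the end
    parts = [np.concatenate((np.zeros(x, dtype=int), regular_sequence))
             for x in lag_seq]
    return np.concatenate(parts)

def fancy(lag_seq, regular_sequence):
    # Preallocated zeros, regular blocks written through fancy indexing
    n_reg = len(regular_sequence)
    seq = np.zeros(lag_seq.sum() + len(lag_seq) * n_reg, dtype=int)
    cs = np.cumsum(lag_seq + n_reg) - n_reg
    seq[np.add.outer(cs, np.arange(n_reg))] = regular_sequence
    return seq

def loop(lag_seq, regular_sequence):
    # Element-by-element fill, as in the plain Python version
    n_reg = len(regular_sequence)
    seq = np.empty(lag_seq.sum() + len(lag_seq) * n_reg, dtype=int)
    i = 0
    for lag in lag_seq:
        for _ in range(lag):
            seq[i] = 0
            i += 1
        for k in range(n_reg):
            seq[i] = regular_sequence[k]
            i += 1
    return seq

rng = np.random.default_rng(0)    # fixed seed, illustrative only
lags = rng.poisson(5, size=20)    # assumed lag distribution
reg = np.arange(7)

a, b, c = hybrid(lags, reg), fancy(lags, reg), loop(lags, reg)
all_equal = np.array_equal(a, b) and np.array_equal(b, c)
print(all_equal)
```

If the three results differ, the timings that follow would be comparing different computations.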
Tests now:

    In [96]: %timeit seq_hybrid()
    10 loops, best of 3: 38.5 ms per loop

    In [97]: %timeit seq_numpy()
    10 loops, best of 3: 34.4 ms per loop

    In [98]: %timeit seq_python()
    1 loops, best of 3: 1.56 s per loop

    In [99]: %timeit seq_numba()
    100 loops, best of 3: 12.9 ms per loop

In this case your hybrid solution is as fast as the pure NumPy one, because performance depends mainly on the inner loop, and yours (zeros and concatenate) runs at the NumPy level. As expected, the Python solution is slower, by the traditional ratio of about 40x. But NumPy is not optimal here, because it uses fancy indexing, which is needed because the data is not aligned. In this case numba can help you: only the minimal operations are performed, at the C level, for a gain of about 120x this time compared to the Python solution.
For other values of n_iter and n_reg, the gain factors compared to the Python solution are:

    n_iter=   1000, n_reg=   1000 : seq_numba 124, seq_hybrid  49, seq_numpy 44
    n_iter=     10, n_reg= 100000 : seq_numba 123, seq_hybrid 104, seq_numpy 49
    n_iter= 100000, n_reg=     10 : seq_numba 127, seq_hybrid   1, seq_numpy 42