First demo with a very small matrix
In [523]: idx=np.arange(0,8,2) In [526]: D=np.arange(24).reshape(8,3) In [527]: Dll=sparse.lil_matrix(D) In [528]: D[idx,:].sum(axis=0) Out[528]: array([36, 40, 44]) In [529]: Dll[idx,:].sum(axis=0) Out[529]: matrix([[36, 40, 44]], dtype=int32) In [530]: timeit D[idx,:].sum(axis=0) 100000 loops, best of 3: 17.3 µs per loop In [531]: timeit Dll[idx,:].sum(axis=0) 1000 loops, best of 3: 1.16 ms per loop In [532]: score=np.zeros(3)
For some operations, the csr format is faster:
In [537]: timeit Dll.tocsr()[idx,:].sum(axis=0) 1000 loops, best of 3: 955 µs per loop
or if I pre-convert to csr:
In [538]: Dcsr=Dll.tocsr() In [539]: timeit Dcsr[idx,:].sum(axis=0) 1000 loops, best of 3: 724 µs per loop
Still slow relatively dense.
I was going to talk about working with sparse matrix data attributes as a way to quickly select rows. But if the only purpose for selecting these lines is to summarize their values, we do not need to do this.
The sum of sparse matrices in rows or columns, performing a matrix product with a column or row matrix of them. And I answered another question with the same answer.
fooobar.com/questions/1006020 / ... Efficiently compute columnwise sum of sparse array where every non-zero element is 1
For instance:
In [588]: I=np.asmatrix(np.zeros((1,Dll.shape[0]))) In [589]: I[:,idx]=1 In [590]: I Out[590]: matrix([[ 1., 0., 1., 0., 1., 0., 1., 0.]]) In [591]: I*Dll Out[591]: matrix([[ 36., 40., 44.]]) In [592]: %%timeit I=np.asmatrix(np.zeros((1,Dll.shape[0]))) I[:,idx]=1 I*Dll .....: 1000 loops, best of 3: 919 µs per loop
For this small matrix, this did not help the speed, but with Dcsr time drops to 215 µs (this is much better for mathematics). With large matrices, this version of the product will improve.
==================
I just found out in another question that the row selection A_csr[[1,1,0,3],:] actually done using the matrix product. It creates a csr extractor matrix that looks like
matrix([[0, 1, 0, 0], [0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]])
fooobar.com/questions/1006022 / ...