I would like to multiply two large sparse matrices. The first is 150,000 × 300,000, and the second is 300,000 × 300,000. The first matrix contains about 1,000,000 non-zero elements, and the second matrix contains about 20,000,000 non-zero elements. Is there an easy way to get the product of these matrices?
I am currently storing the matrices in CSR or CSC format and trying matrix_a * matrix_b. This fails with ValueError: array is too big.
I suppose I could store the individual matrices on disk using PyTables, split them into smaller blocks, and construct the final matrix product from the products of many blocks. But I am hoping for something relatively simple to implement.
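To make the block idea concrete, here is a minimal in-memory sketch, assuming SciPy's scipy.sparse. The function name blockwise_product and the block_rows parameter are made up for illustration, and the small random matrices only stand in for the real ones. It splits the first matrix into row blocks, multiplies each block by the second matrix, and stacks the partial products. It does not touch the disk; in an out-of-core version each partial product would be written out (for example with PyTables) instead of being kept in a list.

```python
import numpy as np
import scipy.sparse as sp

def blockwise_product(a, b, block_rows=1000):
    """Compute a @ b one row block of `a` at a time (hypothetical helper)."""
    a = a.tocsr()  # CSR makes row slicing cheap
    b = b.tocsr()
    blocks = []
    for start in range(0, a.shape[0], block_rows):
        stop = min(start + block_rows, a.shape[0])
        # Partial product for rows start..stop-1 of the result.
        # An out-of-core variant would store this block on disk here.
        blocks.append(a[start:stop] @ b)
    # Stack the row blocks back into a single sparse matrix.
    return sp.vstack(blocks, format="csr")

# Small random matrices standing in for the real 150,000 x 300,000 and
# 300,000 x 300,000 inputs (sizes and densities here are assumptions).
matrix_a = sp.random(150, 300, density=0.01, format="csr", random_state=0)
matrix_b = sp.random(300, 300, density=0.01, format="csr", random_state=1)
product = blockwise_product(matrix_a, matrix_b, block_rows=40)
```

Splitting by rows keeps the bookkeeping to a single loop, because every row block of the result depends on only one row block of the first matrix. Whether this avoids the error above depends on how large each partial product turns out to be.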
EDIT: I am hoping for a solution that works for arbitrarily large sparse matrices while hiding (or avoiding) the bookkeeping involved in moving individual blocks back and forth between memory and disk.
Danb