Creating a huge numpy array using pytables

How can I create a huge numpy array using pytables? I tried the following, but it gave me "ValueError: array is too big":

    import numpy as np
    import tables as tb

    ndim = 60000
    h5file = tb.openFile('test.h5', mode='w', title="Test Array")
    root = h5file.root
    h5file.createArray(root, "test", np.zeros((ndim,ndim), dtype=float))
    h5file.close()
2 answers

You can try using the tables.CArray class, as it supports compression, but ...

I think the question is more about numpy than pytables, because you create the array with numpy before storing it with pytables.

So you need a lot of RAM to execute np.zeros((ndim, ndim)), and this is probably where the exception "ValueError: array is too big" is raised.
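To see how much memory that call would need, here is a quick back-of-the-envelope calculation. It assumes the default float dtype is float64 (8 bytes per element), which is what dtype=float means in numpy; the variable names are only illustrative.

```python
# Memory needed for a dense (ndim x ndim) float64 array.
ndim = 60000
bytes_per_float64 = 8  # dtype=float maps to float64 in numpy

required_bytes = ndim * ndim * bytes_per_float64
required_gib = required_bytes / 2**30

print(f"{required_bytes:,} bytes = {required_gib:.1f} GiB")
# prints: 28,800,000,000 bytes = 26.8 GiB
```

That is far more than most machines have available, so the array has to live on disk or be stored in a more compact form.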

If the matrix / array is not dense, you can use the sparse matrix representation available in scipy: http://docs.scipy.org/doc/scipy/reference/sparse.html
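A minimal sketch of the sparse approach, assuming scipy is installed and that only a small fraction of the entries are non-zero. The choice of lil_matrix for filling and CSR for later arithmetic is one common pattern, not the only option.

```python
import numpy as np
from scipy import sparse

ndim = 60000

# LIL format is convenient for setting individual entries;
# only the non-zero values are actually stored.
m = sparse.lil_matrix((ndim, ndim), dtype=np.float64)
m[0, 0] = 1.0
m[ndim - 1, 42] = 2.5

# Convert to CSR for efficient arithmetic and row slicing.
csr = m.tocsr()
stored = csr.nnz  # number of explicitly stored entries
print(csr.shape, stored)
```

The logical shape is still 60000 x 60000, but memory use scales with the number of stored entries rather than with ndim squared.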

Another solution is to access your array in chunks, if you do not need the whole array at once. See this thread: Very large matrices using Python and NumPy
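As one illustration of chunked access with plain numpy, the sketch below uses np.memmap, which maps a file on disk and only loads the slices you touch. This is an assumption about what "access through chunks" could look like, not necessarily what the linked thread recommends. The file name, the chunk size, and the deliberately small ndim are placeholders for the demo.

```python
import os
import tempfile
import numpy as np

ndim = 2000   # kept small for the demo; the idea is the same for larger sizes
chunk = 500   # number of rows processed at a time (illustrative value)
path = os.path.join(tempfile.mkdtemp(), "big.dat")

# Create a disk-backed array and fill it one block of rows at a time.
big = np.memmap(path, dtype=np.float64, mode="w+", shape=(ndim, ndim))
for start in range(0, ndim, chunk):
    stop = min(start + chunk, ndim)
    big[start:stop, :] = float(start)
big.flush()
del big

# Reopen read-only and reduce over the data block by block.
view = np.memmap(path, dtype=np.float64, mode="r", shape=(ndim, ndim))
total = 0.0
for start in range(0, ndim, chunk):
    total += view[start:start + chunk, :].sum()
print(total)
```

Only one block of rows needs to be resident in memory at any time, at the cost of doing the bookkeeping for the blocks yourself.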


Building on @b1r3k's answer: to create an array that you won't access all at once (i.e. you won't load the whole thing into memory), you want to use a CArray (Chunked Array). The idea is that you then fill it and access it incrementally:

    import numpy as np
    import tables as tb

    ndim = 60000
    h5file = tb.openFile('test.h5', mode='w', title="Test Array")
    root = h5file.root
    x = h5file.createCArray(root,'x',tb.Float64Atom(),shape=(ndim,ndim))
    x[:100,:100] = np.random.random(size=(100,100)) # Now put in some data
    h5file.close()
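The camelCase names used above (openFile, createCArray) come from older PyTables releases. Here is a sketch of the same idea with the PEP 8 style names, assuming PyTables 3.x or later, which also reads a block back. The temporary file path and the zlib compression settings are illustrative choices, not part of the original answer.

```python
import os
import tempfile
import numpy as np
import tables as tb

ndim = 60000
path = os.path.join(tempfile.mkdtemp(), "test.h5")

# Create the chunked array on disk; nothing of size ndim*ndim is built in RAM.
with tb.open_file(path, mode="w", title="Test Array") as h5file:
    x = h5file.create_carray(
        h5file.root, "x", tb.Float64Atom(), shape=(ndim, ndim),
        filters=tb.Filters(complevel=5, complib="zlib"),  # optional compression
    )
    block = np.random.random(size=(100, 100))
    x[:100, :100] = block  # write one small block

# Reopen and read only the slices that are needed.
with tb.open_file(path, mode="r") as h5file:
    corner = h5file.root.x[:100, :100]
    untouched = h5file.root.x[100:105, 100:105]  # never written, reads as zeros

print(corner.shape, untouched.sum())
```

Slicing a CArray returns an ordinary numpy array holding just that region, so memory use is bounded by the size of the slice rather than by the full array.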