I have a very large .mat file (~1.3 GB) that I am trying to load in my Python code (an IPython notebook). I tried:
    import scipy.io as sio
    very_large = sio.loadmat('very_large.mat')
My laptop, which has 8 GB of RAM, hangs. I left the system monitor open and watched the memory consumption climb steadily to 7 GB, at which point the system froze.
What am I doing wrong? Any suggestions or workarounds?
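One workaround I am considering is loading the variables one at a time with loadmat's variable_names argument (assuming my scipy version supports it; the names 'X' and 'y' come from the dataset description below), so the whole file is never unpacked in one go:

    import scipy.io as sio

    # Load only the label vector first; this should need far less memory
    # than unpacking every variable in the file at once.
    labels = sio.loadmat('very_large.mat', variable_names=['y'])['y']

    # Then load the big image array by itself in a separate call.
    images = sio.loadmat('very_large.mat', variable_names=['X'])['X']

I am not sure this helps, though, if the X array on its own is already too large to hold in RAM.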
EDIT:
Data Details: Here is the data link: http://ufldl.stanford.edu/housenumbers/
The specific file of interest to me is extra_32x32.mat. From the description: loading the .mat files creates two variables: X, a four-dimensional matrix containing the images, and y, a vector of class labels. To access the images, X(:,:,:,i) gives the i-th 32-by-32 RGB image, with class label y(i).
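In Python, that MATLAB indexing should translate to slicing the last axis of X (with the usual shift to 0-based indices), e.g.:

    import scipy.io as sio

    data = sio.loadmat('test_32x32.mat')
    X, y = data['X'], data['y']

    i = 0                  # first image, i.e. MATLAB's X(:,:,:,1)
    image = X[:, :, :, i]  # one (32, 32, 3) RGB image
    label = y[i, 0]        # its class label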
So, for example, a smaller .mat file from the same page (test_32x32.mat), when loaded as follows:
    from __future__ import print_function  # the output below is from Python 2 with print as a function
    import scipy.io as sio

    SVHN_full_test_data = sio.loadmat('test_32x32.mat')

    print("\nData set = SVHN_full_test_data")
    for key, value in SVHN_full_test_data.iteritems():
        print("Type of", key, ":", type(SVHN_full_test_data[key]))
        if str(type(SVHN_full_test_data[key])) == "<type 'numpy.ndarray'>":
            print("Shape of", key, ":", SVHN_full_test_data[key].shape)
        else:
            print("Content:", SVHN_full_test_data[key])
gives:
    Data set = SVHN_full_test_data
    Type of y : <type 'numpy.ndarray'>
    Shape of y : (26032, 1)
    Type of X : <type 'numpy.ndarray'>
    Shape of X : (32, 32, 3, 26032)
    Type of __version__ : <type 'str'>
    Content: 1.0
    Type of __header__ : <type 'str'>
    Content: MATLAB 5.0 MAT-file, Platform: GLNXA64, Created on: Mon Dec 5 21:18:15 2011
    Type of __globals__ : <type 'list'>
    Content: []
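From these shapes I tried to estimate what the extra set should need once loaded, since that might explain the memory blow-up (this assumes the pixels are uint8 and that extra_32x32.mat holds 531,131 images, the count given on the download page):

    # test_32x32.mat: X is (32, 32, 3, 26032)
    test_bytes = 32 * 32 * 3 * 26032    # ~76 MiB of raw pixel data

    # extra_32x32.mat: assumed (32, 32, 3, 531131)
    extra_bytes = 32 * 32 * 3 * 531131  # ~1.5 GiB of raw pixel data

    print(test_bytes / 2.0 ** 20, "MiB")   # ~76.3
    print(extra_bytes / 2.0 ** 30, "GiB")  # ~1.52

So even the raw array should fit comfortably in 8 GB; could loadmat be making several temporary copies while parsing, and if so, is there a way to avoid that?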