How to load an array of row cells in Matlab mat files into a Python list or tuple using Scipy.io.loadmat

I am a Matlab user new to Python. I would like to write a row cell array of strings in Matlab to a MAT-file and load that file in Python (possibly with scipy.io.loadmat) into a similar type, such as a list or tuple of strings. But loadmat reads everything into arrays, and I'm not sure how to convert the result to a list. I tried the list() function, but it does not work as I expected (I have a poor understanding of Python arrays and numpy arrays). For example:

Matlab Code:

 cell_of_strings = {'thank', 'you', 'very', 'much'};
 save('my.mat', 'cell_of_strings');

Python Code:

 from scipy.io import loadmat

 matdata = loadmat('my.mat', chars_as_strings=1, matlab_compatible=1)
 array_of_strings = matdata['cell_of_strings']

Then the variable array_of_strings:

 array([[[[u't' u'h' u'a' u'n' u'k']], [[u'y' u'o' u'u']], [[u'v' u'e' u'r' u'y']], [[u'm' u'u' u'c' u'h']]]], dtype=object) 

I'm not sure how to convert this array_of_strings array to a Python list or tuple so that it looks like

 list_of_strings = ['thank', 'you', 'very', 'much']; 

I am not familiar with an array object in Python or numpy. Your help would be greatly appreciated.

2 answers

Try this:

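 import scipy.io as si

 # squeeze_me=True drops the singleton dimensions, so each cell loads as a
 # plain string; without it, tolist() returns nested lists of arrays
 a = si.loadmat('my.mat', squeeze_me=True)
 b = a['cell_of_strings']       # type(b) <type 'numpy.ndarray'>
 list_of_strings = b.tolist()   # type(list_of_strings) <type 'list'>
 print(list_of_strings)         # output: [u'thank', u'you', u'very', u'much']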

This looks like a job for a list comprehension. Repeating your example, I did this in MATLAB:

 cell_of_strings = {'thank', 'you', 'very', 'much'};
 save('my.mat', 'cell_of_strings', '-v7');

I am using a newer version of MATLAB which, on my setup, saves .mat files in the HDF5-based v7.3 format by default. loadmat cannot read HDF5 files, so the '-v7' flag forces MATLAB to save in the older .mat format, which loadmat can read.
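If you are not sure which format an existing file uses, the descriptive text at the start of the file gives a hint. The following is only a minimal sketch: the helper name mat_file_format is made up for illustration, and it assumes the usual header text ("MATLAB 5.0 MAT-file" for the v5 family, which includes -v6 and -v7, and "MATLAB 7.3 MAT-file" for the HDF5-based format). The demo runs on stand-in files that contain only that header text, not on real MATLAB output.

```python
import os
import tempfile


def mat_file_format(path):
    """Guess the MAT-file format from the text at the start of the file.

    Hypothetical helper for illustration; it only inspects the header.
    """
    with open(path, "rb") as f:
        header = f.read(128)
    if header.startswith(b"MATLAB 7.3 MAT-file"):
        return "v7.3 (HDF5-based)"
    if header.startswith(b"MATLAB 5.0 MAT-file"):
        return "v5 family (-v6 / -v7)"
    return "unknown"


# Demo on two stand-in files that hold only a padded 128-byte header.
results = {}
with tempfile.TemporaryDirectory() as tmp:
    samples = [("old.mat", b"MATLAB 5.0 MAT-file"),
               ("new.mat", b"MATLAB 7.3 MAT-file")]
    for name, text in samples:
        file_path = os.path.join(tmp, name)
        with open(file_path, "wb") as f:
            f.write(text.ljust(128))  # pad with spaces to the header size
        results[name] = mat_file_format(file_path)

print(results)
```

A file reported as v7.3 would need to be saved again with '-v7' (or read with an HDF5 library) before loadmat can open it.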

In Python, I loaded the cell array just as you did:

 import scipy.io as sio

 # path is the directory that contains my.mat
 matdata = sio.loadmat('%s/my.mat' % path, chars_as_strings=1, matlab_compatible=1)
 array_of_strings = matdata['cell_of_strings']

Printing array_of_strings gives:

 [[array([[u't', u'h', u'a', u'n', u'k']], dtype='<U1') array([[u'y', u'o', u'u']], dtype='<U1') array([[u'v', u'e', u'r', u'y']], dtype='<U1') array([[u'm', u'u', u'c', u'h']], dtype='<U1')]] 

The variable array_of_strings is a (1,4) numpy object array, and each of its elements is itself an array. For example, the first element of array_of_strings is a (1,5) array containing the letters of "thank", i.e.

 array_of_strings[0,0]
 array([[u't', u'h', u'a', u'n', u'k']], dtype='<U1')

To get to the first letter "t", you need to do something like:

 array_of_strings[0,0][0,0]
 u't'

Since we are dealing with nested arrays, we need to iterate at more than one level to extract the data, i.e. with nested for loops. But first, I'll show you how to extract the first word:

 first_word = [str(''.join(letter)) for letter in array_of_strings[0][0]]
 first_word
 ['thank']

Here I use a list comprehension. Basically, I iterate over each row of letters in array_of_strings[0][0] (there is only one row) and concatenate its letters using ''.join. The str() call converts the unicode string to a regular string.

Now, to get the desired list of strings, we just need to loop over each array of letters:

 words = [str(''.join(letter)) for letter_array in array_of_strings[0] for letter in letter_array]
 words
 ['thank', 'you', 'very', 'much']
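If the nesting depth is not known in advance, the same idea can be written as a small recursive helper. This is only a sketch under stated assumptions: the names iter_chars and cells_to_strings are made up for illustration, the code targets Python 3 (it uses yield from and checks for str), and nested lists stand in for array_of_strings so that it runs without a .mat file. Iterating over a numpy object array walks the same levels, so the helper should apply to the loaded array as well.

```python
def iter_chars(node):
    """Yield every string found in an arbitrarily nested sequence, in order."""
    if isinstance(node, str):
        yield node
    else:
        for child in node:
            yield from iter_chars(child)


def cells_to_strings(cell_row):
    """Join the characters inside each cell of one row of cells."""
    return ["".join(iter_chars(cell)) for cell in cell_row]


# Stand-in for array_of_strings: nested lists with the same (1, 4) layout,
# where every cell holds a (1, n) block of single characters.
array_like = [[[["t", "h", "a", "n", "k"]],
               [["y", "o", "u"]],
               [["v", "e", "r", "y"]],
               [["m", "u", "c", "h"]]]]

words = cells_to_strings(array_like[0])
print(words)  # ['thank', 'you', 'very', 'much']
```

Because the helper joins whatever strings it finds inside a cell, it also copes with cells that already hold whole words rather than single characters.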

List comprehensions take some getting used to, but they are extremely helpful. Hope this helps.
