Combining multiple HDF5 files into one PyTables file

I have several HDF5 files, each with the same structure. I would like to merge them into a single PyTables file.

That is, if the array in file1 has size x and the array in file2 has size y, the resulting array should have size x + y, containing first all the entries from file1 and then all the entries from file2.


How you do this depends a little on the type of data you have. Arrays and CArrays have a fixed size, so you need to pre-allocate the full data space. That looks something like the following:

    import tables as tb

    file1 = tb.open_file('/path/to/file1', 'r')
    file2 = tb.open_file('/path/to/file2', 'r')
    file3 = tb.open_file('/path/to/file3', 'a')  # the target file must be writable

    x = file1.root.x
    y = file2.root.y

    z = file3.create_array('/', 'z', atom=x.atom, shape=(x.nrows + y.nrows,))
    z[:x.nrows] = x[:]
    z[x.nrows:] = y[:]
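If there are more than two files, the same pattern extends naturally to a loop. Here is a minimal sketch, assuming every input file holds a one-dimensional array at /x with the same atom (the paths and the node name are placeholders):

    import tables as tb

    # Hypothetical input paths; each file is assumed to hold a 1-D array at /x.
    paths = ['/path/to/file1', '/path/to/file2', '/path/to/file3']

    sources = [tb.open_file(p, 'r') for p in paths]
    nodes = [f.root.x for f in sources]
    total = sum(n.nrows for n in nodes)

    out = tb.open_file('/path/to/merged', 'w')
    merged = out.create_array('/', 'x', atom=nodes[0].atom, shape=(total,))

    # Copy each source into its slice of the pre-allocated array.
    offset = 0
    for n in nodes:
        merged[offset:offset + n.nrows] = n[:]
        offset += n.nrows

    for f in sources:
        f.close()
    out.close()

Reading each source with [:] pulls it fully into memory; for very large arrays you could copy slice by slice instead.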

EArrays and Tables, however, are extensible, so you do not need to pre-allocate the size; you can use copy_node() and append() instead:

    import tables as tb

    file1 = tb.open_file('/path/to/file1', 'r')
    file2 = tb.open_file('/path/to/file2', 'r')
    file3 = tb.open_file('/path/to/file3', 'a')  # the target file must be writable

    x = file1.root.x
    y = file2.root.y

    z = file1.copy_node('/', name='x', newparent=file3.root, newname='z')
    z.append(y[:])  # read y into memory and append it to the copied node
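For Tables the same idea works row by row: copy the first table into the target file, then append the rows of the remaining ones. A sketch under the assumption that each file holds a table at /t with an identical description (again, paths and node names are placeholders):

    import tables as tb

    # Hypothetical paths; each file is assumed to hold a Table at /t
    # with the same description.
    paths = ['/path/to/file1', '/path/to/file2', '/path/to/file3']

    out = tb.open_file('/path/to/merged', 'w')

    # Seed the output with a copy of the first table.
    first = tb.open_file(paths[0], 'r')
    merged = first.copy_node('/', name='t', newparent=out.root, newname='t')
    first.close()

    # Append the rows of every other table.
    for p in paths[1:]:
        src = tb.open_file(p, 'r')
        merged.append(src.root.t[:])  # [:] reads the rows as a structured array
        src.close()

    merged.flush()
    out.close()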