Setting items one at a time in pandas is an expensive operation, because each assignment is aligned against the index. Instead, read everything into arrays, create a DataFrame of the values, and then set the hierarchical index directly. This is usually much faster if you can avoid element-wise insertion or lookup.
Here is an example of the result, assuming you have a 2-D array dataset with everything concatenated into it:
In [106]: dataset
Out[106]:
array([[1, 1, 0, 1],
       [1, 1, 1, 2],
       [1, 2, 1, 3],
       [1, 2, 2, 4],
       [2, 1, 0, 5],
       [2, 1, 2, 6]])

In [107]: pd.DataFrame(dataset, columns=['A', 'B', 'C', 'data']).set_index(['A', 'B', 'C'])
Out[107]:
       data
A B C
1 1 0     1
    1     2
  2 1     3
    2     4
2 1 0     5
    2     6

In [108]: data_values = dataset[:, 3]
     ...: data_index = pd.MultiIndex.from_arrays(dataset[:, :3].T, names=list('ABC'))
     ...: pd.DataFrame(data_values, columns=['data'], index=data_index)
     ...:
Out[108]:
       data
A B C
1 1 0     1
    1     2
  2 1     3
    2     4
2 1 0     5
    2     6

In [109]: %timeit pd.DataFrame(dataset, columns=['A', 'B', 'C', 'data']).set_index(['A', 'B', 'C'])
1000 loops, best of 3: 1.75 ms per loop

In [110]: %%timeit
     ...: data_values = dataset[:, 3]
     ...: data_index = pd.MultiIndex.from_arrays(dataset[:, :3].T, names=list('ABC'))
     ...: pd.DataFrame(data_values, columns=['data'], index=data_index)
     ...:
1000 loops, best of 3: 642 µs per loop
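For contrast, below is a minimal sketch (not part of the timings above; the names slow_build and fast_build are just illustrative) of the element-wise pattern this answer recommends avoiding: creating the MultiIndex up front and then filling in each value through .loc, so that every single assignment goes through index alignment.

import numpy as np
import pandas as pd

dataset = np.array([[1, 1, 0, 1],
                    [1, 1, 1, 2],
                    [1, 2, 1, 3],
                    [1, 2, 2, 4],
                    [2, 1, 0, 5],
                    [2, 1, 2, 6]])

def slow_build(dataset):
    # Element-wise: each .loc assignment is looked up / aligned against
    # the MultiIndex, which is what makes this pattern slow at scale.
    index = pd.MultiIndex.from_arrays(dataset[:, :3].T, names=list('ABC'))
    df = pd.DataFrame(np.nan, columns=['data'], index=index)
    for a, b, c, value in dataset:
        df.loc[(a, b, c), 'data'] = value
    return df

def fast_build(dataset):
    # Bulk: build the values and the hierarchical index in one shot,
    # as in In [108] above.
    index = pd.MultiIndex.from_arrays(dataset[:, :3].T, names=list('ABC'))
    return pd.DataFrame(dataset[:, 3], columns=['data'], index=index)

# Both produce the same frame (dtypes aside, since slow_build starts from NaN).
pd.testing.assert_frame_equal(slow_build(dataset), fast_build(dataset),
                              check_dtype=False)

On anything much larger than this toy array, the gap between the two approaches grows quickly, which is the point of the timings above.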