Reverting from a MultiIndex to a single-index DataFrame in pandas

                        NI
 YEAR MONTH datetime
 2000 1     2000-01-01 NaN
            2000-01-02 NaN
            2000-01-03 NaN
            2000-01-04 NaN
            2000-01-05 NaN

The data frame above has a MultiIndex with the following levels:

 names=[u'YEAR', u'MONTH', u'datetime'] 

How do I get back to a data frame with "datetime" as the only index and "YEAR" and "MONTH" as regular columns?

python pandas
2 answers

Pass level=[0,1] to reset only those levels:

 dist_df = dist_df.reset_index(level=[0,1])

 In [28]: df.reset_index(level=[0,1])
 Out[28]:
             YEAR  MONTH  NI
 datetime
 2000-01-01  2000      1 NaN
 2000-01-02  2000      1 NaN
 2000-01-03  2000      1 NaN
 2000-01-04  2000      1 NaN
 2000-01-05  2000      1 NaN

Alternatively, you can pass the level names:

 df.reset_index(level=['YEAR','MONTH']) 
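
To try this end to end, here is a minimal sketch that rebuilds a frame shaped like the one in the question and resets the two outer levels by name. The data is made up for illustration (five daily rows of NaN); only the level names and the "NI" column come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data mirroring the frame in the question.
dates = pd.date_range("2000-01-01", periods=5, freq="D")
index = pd.MultiIndex.from_arrays(
    [dates.year, dates.month, dates],
    names=["YEAR", "MONTH", "datetime"],
)
df = pd.DataFrame({"NI": [np.nan] * len(index)}, index=index)

# Move YEAR and MONTH out of the index; datetime stays as the only index level.
flat = df.reset_index(level=["YEAR", "MONTH"])

print(flat.index.name)     # datetime
print(list(flat.columns))  # ['YEAR', 'MONTH', 'NI']
```

reset_index returns a new frame, so assign the result back (as in the first line of the answer) if you want to keep it.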

Starting with pandas 0.24.0, .to_flat_index() is the "official" pandas way to do what it says on the tin: flatten a MultiIndex.

It also has an advantage over existing answers such as .reset_index(level=[0,1]): it works on any MultiIndex, whether it sits on the rows or on the columns.

From pandas' own documentation:

MultiIndex.to_flat_index()

Convert a MultiIndex to an Index of Tuples containing the level values.

A simple example from its documentation:

 import pandas as pd

 print(pd.__version__)  # must be 0.24.0 or later for to_flat_index()

 index = pd.MultiIndex.from_product(
     [['foo', 'bar'], ['baz', 'qux']],
     names=['a', 'b'])

 print(index)
 # MultiIndex(levels=[['bar', 'foo'], ['baz', 'qux']],
 #            codes=[[1, 1, 0, 0], [0, 1, 0, 1]],
 #            names=['a', 'b'])

Applying to_flat_index():

 index.to_flat_index()
 # Index([('foo', 'baz'), ('foo', 'qux'), ('bar', 'baz'), ('bar', 'qux')], dtype='object')
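
Applied to a row index like the one in the question, to_flat_index() keeps every level in the index, packed into tuples, and adds no columns. The sketch below uses made-up data (three daily rows); only the level names and the "NI" column come from the question.

```python
import pandas as pd

# Hypothetical data shaped like the frame in the question.
dates = pd.date_range("2000-01-01", periods=3, freq="D")
index = pd.MultiIndex.from_arrays(
    [dates.year, dates.month, dates],
    names=["YEAR", "MONTH", "datetime"],
)
df = pd.DataFrame({"NI": [float("nan")] * 3}, index=index)

# Flatten the row index: one level whose entries are 3-tuples.
df.index = df.index.to_flat_index()

print(df.index.nlevels)    # 1
print(type(df.index[0]))   # <class 'tuple'>
print(list(df.columns))    # ['NI']
```

So if the goal is "YEAR" and "MONTH" as regular columns, reset_index(level=...) is likely the better fit; to_flat_index() suits cases where a single level of tuple labels is acceptable.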

Using it to flatten the MultiIndex columns of a DataFrame works the same way as for a row index:

 dat = df.loc[:,['name','workshop_period','class_size']].groupby(['name','workshop_period']).describe()

 print(dat.columns)
 # MultiIndex(levels=[['class_size'], ['count', 'mean', 'std', 'min', '25%', '50%', '75%', 'max']],
 #            codes=[[0, 0, 0, 0, 0, 0, 0, 0], [0, 1, 2, 3, 4, 5, 6, 7]])

 dat.columns = dat.columns.to_flat_index()

 print(dat.columns)
 # Index([('class_size', 'count'), ('class_size', 'mean'),
 #        ('class_size', 'std'), ('class_size', 'min'),
 #        ('class_size', '25%'), ('class_size', '50%'),
 #        ('class_size', '75%'), ('class_size', 'max')],
 #       dtype='object')
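
The snippet above relies on a df that is not shown. A self-contained version might look like the sketch below, where the rows are invented purely for illustration and only the column names are taken from the snippet.

```python
import pandas as pd

# Hypothetical data; the column names follow the snippet above.
df = pd.DataFrame({
    "name": ["a", "a", "b", "b"],
    "workshop_period": [1, 1, 2, 2],
    "class_size": [10, 12, 20, 22],
})

dat = (
    df.loc[:, ["name", "workshop_period", "class_size"]]
      .groupby(["name", "workshop_period"])
      .describe()
)
print(dat.columns.nlevels)  # 2

# Replace the two-level column index with a single level of tuples.
dat.columns = dat.columns.to_flat_index()
print(dat.columns.nlevels)  # 1
print(dat.columns[0])       # ('class_size', 'count')
```

The row index of dat is still a MultiIndex on (name, workshop_period); only the columns were flattened here.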