MultiIndex on only some of the columns in pandas

Question


I have a CSV that is generated in a format I cannot change. The file has a two-row header, which pandas reads as a MultiIndex on the columns. The file is as follows.

The ultimate goal is to turn the top header row (the hours) into an index level, alongside the "id" column, so that the data looks like this.

I imported the file into pandas:

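import pandas as pd

myfile = 'c:/temp/myfile.csv'
# header=[0, 1] reads the first two rows as a two-level column header
# (tupleize_cols is omitted: it was removed in pandas 1.0)
df = pd.read_csv(myfile, header=[0, 1])
pd.set_option('display.multi_sparse', False)
df.columns = pd.MultiIndex.from_tuples(df.columns, names=['hour', 'field'])
df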

But this gives me three unnamed fields:

My last step is to stack by hour:

df.stack(level=['hour'])

But I am missing the step before that, where I set the other columns as the index even though the multi-index header row sits above them.

+4

python pandas dataframe multi-index

Sir Larry Wildman Mar 11 '16 at 23:34


1 answer

Yakym Pirozhenko · Accepted Answer · 2016-03-11T23:58:00+0000

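Flatten the labels of the first three columns, move those columns into the index, and then rebuild the remaining columns as a MultiIndex: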

import pandas as pd

df = pd.read_csv('temp.csv', header=[0, 1])
# Keep only the second-level label (Date, id, zone) for the first three columns
df.columns = [c for _, c in df.columns[:3]] + [c for c in df.columns[3:]]
# Move those three columns into the index, next to the default integer index
df = df.set_index(list(df.columns[:3]), append=True)
df.columns = pd.MultiIndex.from_tuples(df.columns, names=['hour', 'field'])

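This assumes the first 3 col. hold the identifiers (Date, id, zone) rather than hourly values.

After that you can stack on hour and reset the index, if needed.

The frame as originally read, followed by the result of the steps above plus df.stack(level='hour'):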

  (Unnamed: 0_level_0, Date)  (Unnamed: 1_level_0, id)  \
0                  3/11/2016                         5   
1                  3/11/2016                         6   

  (Unnamed: 2_level_0, zone)  (100, p1)  (100, p2)  (200, p1)  (200, p2)  
0                        abc      0.678      0.787      0.337      0.979  
1                        abc      0.953      0.559      0.776      0.520

field                        p1     p2
  Date      id zone hour              
0 3/11/2016 5  abc  100   0.678  0.787
                    200   0.337  0.979
1 3/11/2016 6  abc  100   0.953  0.559
                    200   0.776  0.520
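
For anyone who wants to try this without the original file, here is a minimal end-to-end sketch. The CSV text is rebuilt from the output shown above, so the exact layout of the real file (in particular the blank cells in the first header row) is an assumption, and the names csv_text, id_cols, stacked and flat are only illustrative.

```python
import io

import pandas as pd

# Sample file rebuilt from the output above; the blank leading cells in the
# first header row are an assumption about the real CSV.
csv_text = """,,,100,100,200,200
Date,id,zone,p1,p2,p1,p2
3/11/2016,5,abc,0.678,0.787,0.337,0.979
3/11/2016,6,abc,0.953,0.559,0.776,0.520
"""

# Read both header rows as a two-level column header.
df = pd.read_csv(io.StringIO(csv_text), header=[0, 1])

# Second-level labels of the first three columns: Date, id, zone.
id_cols = [field for _, field in df.columns[:3]]

# Plain labels for the identifier columns, (hour, field) tuples for the rest.
df.columns = id_cols + list(df.columns[3:])

# Move the identifier columns into the index, keeping the integer index.
df = df.set_index(id_cols, append=True)

# Only (hour, field) tuples remain, so they convert cleanly to a MultiIndex.
df.columns = pd.MultiIndex.from_tuples(df.columns, names=["hour", "field"])

# Pivot the hour level from the columns into the index.
stacked = df.stack(level="hour")

# Optional: turn every index level back into an ordinary column.
flat = stacked.reset_index()

print(stacked)
```

Recent pandas releases may print a FutureWarning about the stack implementation here; the result should still have one row per (Date, id, zone, hour) combination, with p1 and p2 as columns.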
