How to unstack (or pivot?) in pandas

Question

I have a dataframe that looks like this:

    import pandas as pd

    datelisttemp = pd.date_range('1/1/2014', periods=3, freq='D')
    s = list(datelisttemp)*3
    s.sort()
    df = pd.DataFrame({'BORDER':['GERMANY','FRANCE','ITALY','GERMANY','FRANCE','ITALY','GERMANY','FRANCE','ITALY'],
                       'HOUR1':[2, 2, 2, 4, 4, 4, 6, 6, 6],
                       'HOUR2':[3, 3, 3, 5, 5, 5, 7, 7, 7],
                       'HOUR3':[8, 8, 8, 12, 12, 12, 99, 99, 99]}, index=s)

This gives me:

    Out[458]: df
                 BORDER  HOUR1  HOUR2  HOUR3
    2014-01-01  GERMANY      2      3      8
    2014-01-01   FRANCE      2      3      8
    2014-01-01    ITALY      2      3      8
    2014-01-02  GERMANY      4      5     12
    2014-01-02   FRANCE      4      5     12
    2014-01-02    ITALY      4      5     12
    2014-01-03  GERMANY      6      7     99
    2014-01-03   FRANCE      6      7     99
    2014-01-03    ITALY      6      7     99

I want the resulting DataFrame to look something like this:

                HOUR  GERMANY  FRANCE  ITALY
    2014-01-01     1        2       2      2
    2014-01-01     2        3       3      3
    2014-01-01     3        8       8      8
    2014-01-02     1        4       4      4
    2014-01-02     2        5       5      5
    2014-01-02     3       12      12     12
    2014-01-03     1        6       6      6
    2014-01-03     2        7       7      7
    2014-01-03     3       99      99     99

I did the following, but I'm not quite there:

    df['date_col'] = df.index
    df2 = pd.melt(df, id_vars=['date_col','BORDER'])
    #Can I keep the same index after melt or do I have to set an index like below?
    df2.set_index(['date_col', 'variable'], inplace=True, drop=True)
    df2 = df2.sort_index()

This gives:

    Out[465]: df2
                          BORDER  value
    date_col   variable
    2014-01-01 HOUR1     GERMANY      2
               HOUR1      FRANCE      2
               HOUR1       ITALY      2
               HOUR2     GERMANY      3
               HOUR2      FRANCE      3
               HOUR2       ITALY      3
               HOUR3     GERMANY      8
               HOUR3      FRANCE      8
               HOUR3       ITALY      8
    2014-01-02 HOUR1     GERMANY      4
               HOUR1      FRANCE      4
               HOUR1       ITALY      4
               HOUR2     GERMANY      5
               HOUR2      FRANCE      5
               HOUR2       ITALY      5
               HOUR3     GERMANY     12
               HOUR3      FRANCE     12
               HOUR3       ITALY     12
    2014-01-03 HOUR1     GERMANY      6
               HOUR1      FRANCE      6
               HOUR1       ITALY      6
               HOUR2     GERMANY      7
               HOUR2      FRANCE      7
               HOUR2       ITALY      7
               HOUR3     GERMANY     99
               HOUR3      FRANCE     99
               HOUR3       ITALY     99

I thought I could unstack df2 to get something similar to the DataFrame I want, but I get all kinds of errors. I also tried to pivot this DataFrame, but cannot get what I want.

+8

python stack pandas pivot

codingknob Jul 08 '14 at 19:40

3 answers

Using df2:

    >>> df2.pivot_table(values='value', index=['date_col', 'variable'], columns="BORDER")
    BORDER               FRANCE  GERMANY  ITALY
    date_col   variable
    2014-01-01 HOUR1          2        2      2
               HOUR2          3        3      3
               HOUR3          8        8      8
    2014-01-02 HOUR1          4        4      4
               HOUR2          5        5      5
               HOUR3         12       12     12
    2014-01-03 HOUR1          6        6      6
               HOUR2          7        7      7
               HOUR3         99       99     99

    [9 rows x 3 columns]

There is a little more cleanup to do if you want to turn the index level "variable" into an "HOUR" column and strip the text "HOUR" from its values, but I think this is the basic format you want.
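The remaining cleanup could be sketched as follows. This is only one possible way to do it, not part of the original answer: the name `wide` is made up for illustration, and the melted frame is rebuilt here with `date_col` left as an ordinary column so the snippet runs on its own.

```python
import pandas as pd

# Rebuild the question's frame and melt it into long form.
dates = sorted(list(pd.date_range('1/1/2014', periods=3, freq='D')) * 3)
df = pd.DataFrame({'BORDER': ['GERMANY', 'FRANCE', 'ITALY'] * 3,
                   'HOUR1': [2, 2, 2, 4, 4, 4, 6, 6, 6],
                   'HOUR2': [3, 3, 3, 5, 5, 5, 7, 7, 7],
                   'HOUR3': [8, 8, 8, 12, 12, 12, 99, 99, 99]}, index=dates)
df['date_col'] = df.index
df2 = pd.melt(df, id_vars=['date_col', 'BORDER'])

# Same pivot_table call as in the answer (pivot_table averages duplicates;
# there are none here, so the values are unchanged apart from becoming floats).
wide = df2.pivot_table(values='value', index=['date_col', 'variable'], columns='BORDER')

# Cleanup: move 'variable' out of the index, rename it to HOUR,
# and keep only the hour number as an integer.
wide = wide.reset_index('variable').rename(columns={'variable': 'HOUR'})
wide['HOUR'] = wide['HOUR'].str.replace('HOUR', '', regex=False).astype(int)
wide.columns.name = None  # drop the leftover 'BORDER' label on the columns
print(wide)
```

The date stays in the index, so each date appears three times, once per hour, as in the layout the question asks for.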

+3

Brenbarn Jul 08 '14 at 20:08

Try using pivot; it can do this in one line. For example:

 df.pivot(index='start_time', columns='venue_name', values='ocupation')
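The column names in that one-liner come from a different dataset. Applied to the question's data, a sketch might look like the following; it assumes pandas 1.1 or later (where `pivot` accepts a list for `index`), and the names `long_df` and `wide` are illustrative only.

```python
import pandas as pd

dates = sorted(list(pd.date_range('1/1/2014', periods=3, freq='D')) * 3)
df = pd.DataFrame({'BORDER': ['GERMANY', 'FRANCE', 'ITALY'] * 3,
                   'HOUR1': [2, 2, 2, 4, 4, 4, 6, 6, 6],
                   'HOUR2': [3, 3, 3, 5, 5, 5, 7, 7, 7],
                   'HOUR3': [8, 8, 8, 12, 12, 12, 99, 99, 99]}, index=dates)

# Long form: one row per (date, BORDER, hour column).
long_df = (df.rename_axis('date_col')
             .reset_index()
             .melt(id_vars=['date_col', 'BORDER'], var_name='HOUR'))

# pivot does no aggregation; it raises if an (index, columns) pair repeats.
wide = long_df.pivot(index=['date_col', 'HOUR'], columns='BORDER', values='value')
print(wide)
```

Unlike `pivot_table`, `pivot` never aggregates, so the values keep their integer dtype, but it only works when each date, hour and country combination occurs once.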

0

Rafaell Jun 06 '19 at 12:39

unutbu · Accepted Answer · 2014-07-08T20:09:52+0000

We want values (for example, 'GERMANY') to become column names, and column names (for example, 'HOUR1') to become values.

The stack method turns column names into index values and the unstack method turns index values into column names.

So by moving the BORDER values into the index, we can use stack and unstack to do the swap.
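To see the two methods in isolation, here is a small sketch on a made-up two-row frame (the names `small`, `stacked`, `round_trip` and `swapped` are illustrative and do not come from the answer):

```python
import pandas as pd

small = pd.DataFrame({'HOUR1': [2, 4], 'HOUR2': [3, 5]},
                     index=pd.Index(['GERMANY', 'FRANCE'], name='BORDER'))

# stack: the column names become the innermost index level.
stacked = small.stack()
print(stacked)

# unstack (default: innermost level): that level becomes column names again.
# Same values as `small`; the row order may differ.
round_trip = stacked.unstack()
print(round_trip)

# Unstacking the BORDER level instead performs the swap:
# BORDER values become columns, HOUR labels stay in the index.
swapped = stacked.unstack('BORDER')
print(swapped)
```

Which level gets unstacked decides which labels end up as column names, and this is what the accepted answer relies on.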

    import pandas as pd

    datelisttemp = pd.date_range('1/1/2014', periods=3, freq='D')
    s = list(datelisttemp)*3
    s.sort()
    df = pd.DataFrame({'BORDER':['GERMANY','FRANCE','ITALY','GERMANY','FRANCE','ITALY','GERMANY','FRANCE','ITALY'],
                       'HOUR1':[2, 2, 2, 4, 4, 4, 6, 6, 6],
                       'HOUR2':[3, 3, 3, 5, 5, 5, 7, 7, 7],
                       'HOUR3':[8, 8, 8, 12, 12, 12, 99, 99, 99]}, index=s)

    df = df.set_index(['BORDER'], append=True)
    df.columns.name = 'HOUR'
    df = df.unstack('BORDER')
    df = df.stack('HOUR')
    df = df.reset_index('HOUR')
    df['HOUR'] = df['HOUR'].str.replace('HOUR', '').astype('int')
    print(df)

gives

    BORDER      HOUR  FRANCE  GERMANY  ITALY
    2014-01-01     1       2        2      2
    2014-01-01     2       3        3      3
    2014-01-01     3       8        8      8
    2014-01-02     1       4        4      4
    2014-01-02     2       5        5      5
    2014-01-02     3      12       12     12
    2014-01-03     1       6        6      6
    2014-01-03     2       7        7      7
    2014-01-03     3      99       99     99
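This output lists the countries alphabetically, whereas the layout asked for in the question has GERMANY first. If the column order matters, selecting the columns explicitly is one way to get it. The sketch below condenses the same steps into a method chain and then reorders; the name `result` is illustrative, and on pandas 2.1 to 2.x the `stack` call may emit a FutureWarning about its upcoming implementation.

```python
import pandas as pd

dates = sorted(list(pd.date_range('1/1/2014', periods=3, freq='D')) * 3)
df = pd.DataFrame({'BORDER': ['GERMANY', 'FRANCE', 'ITALY'] * 3,
                   'HOUR1': [2, 2, 2, 4, 4, 4, 6, 6, 6],
                   'HOUR2': [3, 3, 3, 5, 5, 5, 7, 7, 7],
                   'HOUR3': [8, 8, 8, 12, 12, 12, 99, 99, 99]}, index=dates)

# Same steps as above, written as one chain.
result = (df.set_index('BORDER', append=True)   # BORDER joins the date in the index
            .rename_axis(columns='HOUR')        # label the column axis 'HOUR'
            .unstack('BORDER')                  # BORDER values -> column level
            .stack('HOUR')                      # HOUR labels -> index level
            .reset_index('HOUR'))               # HOUR -> ordinary column
result['HOUR'] = result['HOUR'].str.replace('HOUR', '', regex=False).astype(int)

# Put the columns in the order the question asked for and drop the axis label.
result = result[['HOUR', 'GERMANY', 'FRANCE', 'ITALY']]
result.columns.name = None
print(result)
```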
