Pandas: bar chart from data with a MultiIndex

I have a pandas DataFrame with a TIMESTAMP column (not an index), and the timestamp format looks like this:

 2015-03-31 22:56:45.510 

I also have columns called CLASS and AXLES. I would like to count the records in each month separately for each unique AXLES value (AXLES takes integer values from 3 to 12).

I came up with a combination of resample and groupby:

 resamp = dfWIM.set_index('TIMESTAMP').groupby('AXLES').resample('M', how='count').CLASS 

This seems to give me a Series with a MultiIndex, as shown below.

    In [72]: resamp
    Out[72]:
    AXLES  TIMESTAMP
    3      2014-07-31     5517
           2014-08-31    31553
           2014-09-30    42816
           2014-10-31    49308
           2014-11-30    44168
           2014-12-31    45518
           2015-01-31    54782
           2015-02-28    52166
           2015-03-31    47929
    4      2014-07-31     3147
           2014-08-31    24810
           2014-09-30    39075
           2014-10-31    46857
           2014-11-30    42651
           2014-12-31    48282
           2015-01-31    42708
           2015-02-28    43904
           2015-03-31    50033

How can I access the components of this MultiIndex object to create a bar chart that meets the following conditions?

  • show data when AXLES = 3
  • show x ticks in month-year format (without days, hours, minutes, etc.).

Thanks!

EDIT: The following code gives me the plot, but I could not change the x tick formatting to MM-YY.

 resamp[3].plot(kind='bar') 


EDIT 2: The code below generates a small sample of data similar to what I have:

    dftest = {'TIMESTAMP': ['2014-08-31', '2014-09-30', '2014-10-31'],
              'AXLES': [3, 3, 3],
              'CLASS': [5, 6, 7]}
    dfTest = pd.DataFrame(dftest)
    dfTest.TIMESTAMP = pd.to_datetime(pd.Series(dfTest.TIMESTAMP))
    resamp = dfTest.set_index('TIMESTAMP').groupby('AXLES').resample('M', how='count').CLASS
    resamp[3].plot(kind='bar')
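Note for readers on a recent pandas release: the how= keyword of resample was deprecated and later removed, so the lines above may not run as written. A minimal sketch of an equivalent monthly count follows, using the same sample data. It groups on a monthly Period instead of resampling, which is a swapped-in technique rather than the original approach, and the MONTH level name is made up for illustration. Unlike resample, it labels each group with a Period (such as 2014-08) rather than a month-end timestamp, and it leaves out months that contain no records.

```python
import pandas as pd

# Same sample data as in EDIT 2.
dfTest = pd.DataFrame({
    'TIMESTAMP': pd.to_datetime(['2014-08-31', '2014-09-30', '2014-10-31']),
    'AXLES': [3, 3, 3],
    'CLASS': [5, 6, 7],
})

# Collapse each timestamp to its calendar month.
# 'MONTH' is a hypothetical name chosen for this sketch.
month = dfTest['TIMESTAMP'].dt.to_period('M').rename('MONTH')

# Count records per (AXLES, MONTH); the result is a Series with a MultiIndex.
counts = dfTest.groupby(['AXLES', month])['CLASS'].count()
print(counts)
```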

EDIT 3: The following is the solution:

A. Plot the entire reformatted DataFrame (based on @Ako's suggestion):

    df = resamp.unstack(0)
    df.index = [ts.strftime('%b 20%y') for ts in df.index]
    df.plot(kind='bar', rot=0)
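To see why option A draws one group of bars per month, it can help to look at what unstack(0) returns. The sketch below uses a stand-in Series built from the first few counts in the output shown earlier; the name wide is made up for illustration.

```python
import pandas as pd

# Stand-in for `resamp`: counts indexed by (AXLES, TIMESTAMP).
index = pd.MultiIndex.from_product(
    [[3, 4], pd.to_datetime(['2014-07-31', '2014-08-31'])],
    names=['AXLES', 'TIMESTAMP'],
)
resamp = pd.Series([5517, 31553, 3147, 24810], index=index, name='CLASS')

# Move index level 0 (AXLES) into the columns: one row per month,
# one column per AXLES value. DataFrame.plot(kind='bar') then draws
# one bar per column within each row.
wide = resamp.unstack(0)
print(wide)
```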


B. Plot a single AXLES value from the reformatted data (based on @Alexander's suggestion):

    df = resamp[3]
    df.index = [ts.strftime('%b 20%y') for ts in df.index]
    df.plot(kind='bar', rot=0)
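A possible simplification of the relabeling step, as a sketch: DatetimeIndex has its own strftime method, so the list comprehension is not strictly needed, and %Y gives the four-digit year without hard-coding "20". The Series below is a stand-in built from counts in the output above, and the name counts is made up for illustration.

```python
import pandas as pd

# Stand-in for resamp[3]: monthly counts indexed by month-end timestamps.
counts = pd.Series(
    [5517, 31553, 42816],
    index=pd.to_datetime(['2014-07-31', '2014-08-31', '2014-09-30']),
)

# Format every index label in one call; %b is the abbreviated month name
# (locale dependent) and %Y is the four-digit year.
counts.index = counts.index.strftime('%b %Y')
print(counts)
```

The relabeled Series can then be passed to plot(kind='bar', rot=0) as in option B.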


python matplotlib pandas
Nov 11 '15 at 0:45
2 answers

The following should work, but it is difficult to verify without any data.

Start by resetting the index so that TIMESTAMP becomes a regular column. Then use strftime to format it as the desired text (e.g. mm-yy). Finally, set the index back to AXLES and TIMESTAMP.

    df = resamp.reset_index()
    df['TIMESTAMP'] = [ts.strftime('%m-%y') for ts in df.TIMESTAMP]
    df.set_index(['AXLES', 'TIMESTAMP'], inplace=True)
    df.xs(3, level=0).plot(kind='bar')
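The xs call selects every row that has one value in a given index level. Here is a short sketch of that and two related ways of reaching into a MultiIndex, using a stand-in Series built from the first counts in the question's output. The names axles3, months, and one_count are made up for illustration.

```python
import pandas as pd

# Stand-in for `resamp`: counts indexed by (AXLES, TIMESTAMP).
index = pd.MultiIndex.from_product(
    [[3, 4], pd.to_datetime(['2014-07-31', '2014-08-31'])],
    names=['AXLES', 'TIMESTAMP'],
)
resamp = pd.Series([5517, 31553, 3147, 24810], index=index, name='CLASS')

# All rows with AXLES == 3; the AXLES level is dropped from the result.
axles3 = resamp.xs(3, level='AXLES')

# The values of one index level, aligned with the rows.
months = resamp.index.get_level_values('TIMESTAMP')

# A single value addressed by the full (AXLES, TIMESTAMP) key.
one_count = resamp.loc[(4, pd.Timestamp('2014-08-31'))]
```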


Nov 11 '15 at 5:15

You can generate the tick labels yourself and set them with ax.xaxis.set_major_formatter and a ticker.FuncFormatter. This lets you keep the timestamps in the MultiIndex while displaying them in the desired %m-%Y format:

    import matplotlib.pyplot as plt
    import pandas as pd
    import matplotlib.ticker as ticker

    dftest = {'TIMESTAMP': ['2014-08-31', '2014-09-30', '2014-10-31'],
              'AXLES': [3, 3, 3],
              'CLASS': [5, 6, 7]}
    dfTest = pd.DataFrame(dftest)
    dfTest.TIMESTAMP = pd.to_datetime(pd.Series(dfTest.TIMESTAMP))
    resamp = dfTest.set_index('TIMESTAMP').groupby('AXLES').resample('M', how='count').CLASS

    ax = resamp[3].plot(kind='bar')

    # Build the labels from the same slice that is plotted, so that
    # labels and bars stay aligned when other AXLES values are present.
    ticklabels = [timestamp.strftime('%m-%Y') for timestamp in resamp[3].index]
    # Bars sit at integer positions 0..n-1; map each position to its label.
    ax.xaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: ticklabels[int(x)]))
    plt.gcf().autofmt_xdate()
    plt.show()
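The same labels can also be attached with ticker.FixedFormatter, which takes the list of strings directly instead of a lookup function. The following is a sketch under stated assumptions: the Series is a stand-in built from counts in the question's output, the Agg backend is chosen only so the example runs without a display, and the names counts and shown are made up for illustration.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; an assumption for this sketch
import matplotlib.ticker as ticker
import pandas as pd

# Stand-in for resamp[3]: monthly counts indexed by month-end timestamps.
counts = pd.Series(
    [5517, 31553, 42816],
    index=pd.to_datetime(['2014-07-31', '2014-08-31', '2014-09-30']),
)

ax = counts.plot(kind='bar')

# One label per bar, in plotting order.
ticklabels = [ts.strftime('%m-%Y') for ts in counts.index]
ax.xaxis.set_major_formatter(ticker.FixedFormatter(ticklabels))
ax.figure.autofmt_xdate()

# Render once so the tick label text is populated, then read it back.
ax.figure.canvas.draw()
shown = [label.get_text() for label in ax.get_xticklabels()]
```

FixedFormatter relies on the bar plot placing one tick per bar at fixed positions; if the tick positions change, the labels no longer line up, which is the case the FuncFormatter lookup handles more gracefully.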


Nov 11 '15 at 21:36


