How to group and count rows by month and year using Pandas?

Question

I have a data set with personal data such as name, height, weight and date of birth. I would like to build a graph of the number of people born in each month and year. I am using pandas for this, and my strategy was to group by year and month and count the rows. The closest I have got is counting the number of people by year or by month, but not by both.

df['birthdate'].groupby(df.birthdate.dt.year).agg('count')

Other questions on Stack Overflow point to a grouper called TimeGrouper, but I could not find anything about it in the pandas documentation. Any ideas?

+12

python pandas

nsbm Aug 05 '16 at 14:49


5 answers

You can also group by the "month" period, using to_period through the dt accessor:

 In [11]: df = pd.DataFrame({'birthdate': pd.date_range(start='20-12-2015', end='3-1-2016')})

 In [12]: df['birthdate'].groupby(df.birthdate.dt.to_period("M")).agg('count')
 Out[12]:
 birthdate
 2015-12    12
 2016-01    31
 2016-02    29
 2016-03     1
 Freq: M, Name: birthdate, dtype: int64
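As a self-contained variation on the same idea, the sketch below groups a small, made-up set of birthdates by monthly period and then turns the resulting PeriodIndex into plain strings, which can be convenient as axis labels when plotting. The sample dates and the variable name `counts` are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample data: five birthdates spread over three months.
df = pd.DataFrame({
    "birthdate": pd.to_datetime([
        "2015-12-20", "2015-12-31", "2016-01-05", "2016-01-17", "2016-03-01",
    ])
})

# Group on the monthly period of each date and count the rows per group.
counts = df.groupby(df["birthdate"].dt.to_period("M")).size()

# Convert the PeriodIndex to strings such as "2015-12" for use as labels.
counts.index = counts.index.astype(str)
print(counts)
```

Note that a month with no rows (February 2016 here) does not appear in the result, because groupby only reports groups that were observed.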

It is worth noting that if the datetime is the index (rather than a column), you can use resample:

 df.resample("M").count()

+11

Andy hayden Oct 7 '17 at 20:10


Another solution is to set birthdate as the index and resample:

 import pandas as pd

 df = pd.DataFrame({'birthdate': pd.date_range(start='20-12-2015', end='3-1-2016')})
 df.set_index('birthdate').resample('MS').size()

Output:

 birthdate
 2015-12-01    12
 2016-01-01    31
 2016-02-01    29
 2016-03-01     1
 Freq: MS, dtype: int64

+9

Alberto garcia-raboso Aug 05 '16 at 15:06


As of April 2019, this works with pandas 0.24.x:

df.groupby([df.dates.dt.year.rename('year'), df.dates.dt.month.rename('month')]).size()

0

saran3h Apr 22 '19 at 13:23


Replace the date and count fields with your own column names. This snippet groups by month, sums the count column, and sorts the result by date. You can also change the frequency to 1M, 2M, and so on.

 df[['date', 'count']].groupby(pd.Grouper(key='date', freq='1M')).sum().sort_values(by='date', ascending=True)['count']
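A minimal, self-contained sketch of the pd.Grouper approach follows. The data is made up, and the frequency 'MS' (month start) is swapped in for the '1M' used above, on the assumption that month-start labels are acceptable; the names `monthly_sum` and `monthly_rows` are hypothetical.

```python
import pandas as pd

# Hypothetical data: a date column plus a pre-computed 'count' column.
df = pd.DataFrame({
    "date": pd.to_datetime(["2019-01-03", "2019-01-20", "2019-02-11", "2019-04-02"]),
    "count": [1, 2, 3, 4],
})

# Bin the rows by calendar month (labelled by month start) and sum 'count'.
monthly_sum = df.groupby(pd.Grouper(key="date", freq="MS"))["count"].sum()

# To count rows per month instead of summing a column, use size().
monthly_rows = df.groupby(pd.Grouper(key="date", freq="MS")).size()

print(monthly_sum)
print(monthly_rows)
```

Unlike grouping on dt.year and dt.month, a frequency-based Grouper also emits the empty bins between the first and last date (March 2019 here, with a sum and a row count of 0), which may or may not be what you want in a chart.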

0

user1775015 Jun 14 '19 at 7:12


Edchum · Accepted Answer · 2016-08-05T14:52:27+0000

To group by multiple criteria, pass a list of columns or criteria:

 df['birthdate'].groupby([df.birthdate.dt.year, df.birthdate.dt.month]).agg('count')

Example:

 In [165]: df = pd.DataFrame({'birthdate': pd.date_range(start=dt.datetime(2015,12,20), end=dt.datetime(2016,3,1))})
           df.groupby([df['birthdate'].dt.year, df['birthdate'].dt.month]).agg({'count'})
 Out[165]:
                     birthdate
                         count
 birthdate birthdate
 2015      12               12
 2016      1                31
           2                29
           3                 1

UPDATE

Starting with version 0.23.0, the above code no longer works because of a new restriction that MultiIndex level names must be unique. You now need to rename the levels for this to work:

 In[107]: df.groupby([df['birthdate'].dt.year.rename('year'), df['birthdate'].dt.month.rename('month')]).agg({'count'})
 Out[107]:
            birthdate
                count
 year month
 2015 12           12
 2016 1            31
      2            29
      3             1
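For reference, here is a self-contained sketch of the renamed-levels approach using size() to count rows, followed by an optional unstack() that pivots the result into a year-by-month table. The sample dates and the names `counts` and `table` are assumptions for illustration; the unstack step is an addition that may help when preparing data for a chart and is not part of the answer above.

```python
import pandas as pd

# Hypothetical sample data: five birthdates across two years.
df = pd.DataFrame({
    "birthdate": pd.to_datetime([
        "2015-12-20", "2015-12-31", "2016-01-05", "2016-03-01", "2016-03-09",
    ])
})

# Give the two grouping keys distinct names so the index levels are unique.
year = df["birthdate"].dt.year.rename("year")
month = df["birthdate"].dt.month.rename("month")

# Count the rows in each (year, month) group.
counts = df.groupby([year, month]).size()

# Optional: pivot months into columns, filling unobserved months with 0.
table = counts.unstack(fill_value=0)

print(counts)
print(table)
```

Here `counts` is a Series indexed by (year, month), while `table` has one row per year and one column per month that occurs anywhere in the data.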
