Pandas dense rank

Question


I have a pandas DataFrame like this:

    Year  Value
    2012     10
    2013     20
    2013     25
    2014     30

I want the equivalent of SQL's DENSE_RANK() function over the Year column, adding an extra column as follows:

    Year  Value  Rank
    2012     10     1
    2013     20     2
    2013     25     2
    2014     30     3

How can I do this in pandas?

Thanks!

+5

python sql pandas

Halfpintboy Sep 06 '16 at 21:09


3 answers

You can convert the Year column to a categorical and take its codes, adding one because the codes are zero-indexed and your example starts at one.

    >>> df['Rank'] = df.Year.astype('category').cat.codes + 1
    >>> df
       Year  Value  Rank
    0  2012     10     1
    1  2013     20     2
    2  2013     25     2
    3  2014     30     3
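To see why the codes behave like a dense rank, here is a minimal self-contained sketch. It reuses the question's columns but shuffles the rows (the shuffled order is an assumption made for illustration). For orderable values such as integers, pandas sorts the categories, so the codes follow the sorted years rather than the row order.

```python
import pandas as pd

# Same columns as in the question, with rows deliberately out of order
# (illustrative data, not from the original post).
df = pd.DataFrame({"Year": [2013, 2012, 2014, 2013],
                   "Value": [20, 10, 30, 25]})

# Categories are sorted for orderable values, so code 0 is the smallest year.
df["Rank"] = df["Year"].astype("category").cat.codes + 1

print(df)
```

Note that a missing Year would get the code -1, and therefore a rank of 0 after adding one.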

+4

Alexander Sep 06 '16 at 21:14


The fastest solution uses factorize:

 df['Rank'] = pd.factorize(df.Year)[0] + 1

Timings

    # len(df) = 40k
    df = pd.concat([df]*10000).reset_index(drop=True)

    In [13]: %timeit df['Rank'] = df.Year.rank(method='dense').astype(int)
    1000 loops, best of 3: 1.55 ms per loop

    In [14]: %timeit df['Rank1'] = df.Year.astype('category').cat.codes + 1
    1000 loops, best of 3: 1.22 ms per loop

    In [15]: %timeit df['Rank2'] = pd.factorize(df.Year)[0] + 1
    1000 loops, best of 3: 737 µs per loop
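One caveat about the factorize approach: by default, pd.factorize numbers values in order of first appearance, so it matches a dense rank only when Year is already sorted, as it is in the question. The sketch below uses a shuffled series (made up for illustration) to show the difference, and uses the sort=True argument to get codes that follow the sorted values.

```python
import pandas as pd

# Illustrative, deliberately unsorted years (not from the original post).
years = pd.Series([2013, 2012, 2014, 2013], name="Year")

# Default behaviour: codes follow the order of first appearance.
by_appearance = pd.factorize(years)[0] + 1

# sort=True: codes follow the sorted unique values, like a dense rank.
by_value = pd.factorize(years, sort=True)[0] + 1

print(by_appearance)
print(by_value)
```

Sorting the unique values adds some work, so the timings above may not carry over unchanged to the sort=True variant.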

+4

jezrael Sep 7 '16 at 11:23


piRSquared · Accepted Answer · 2016-09-06T21:26:19+0000

Use pd.Series.rank with method='dense':

    df['Rank'] = df.Year.rank(method='dense').astype(int)
    df
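For completeness, here is a runnable version of this approach that builds the question's frame first. rank returns floats, which is why the result is cast with astype(int). The cast assumes Year has no missing values.

```python
import pandas as pd

# The frame from the question.
df = pd.DataFrame({"Year": [2012, 2013, 2013, 2014],
                   "Value": [10, 20, 25, 30]})

# method='dense' gives tied years the same rank and leaves no gaps
# between consecutive ranks; the result is float, so cast to int.
df["Rank"] = df["Year"].rank(method="dense").astype(int)

print(df)
```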
