What is as_index in groupby in pandas?

What exactly does the as_index parameter of groupby do in pandas?

+15
python pandas
2 answers

print() is your friend when you don't understand something. It often clears up the doubt.

Take a look:

    import pandas as pd

    df = pd.DataFrame(data={'books': ['bk1', 'bk1', 'bk1', 'bk2', 'bk2', 'bk3'],
                            'price': [12, 12, 12, 15, 15, 17]})

    print(df)
    print(df.groupby('books', as_index=True).sum())
    print(df.groupby('books', as_index=False).sum())

Output:

      books  price
    0   bk1     12
    1   bk1     12
    2   bk1     12
    3   bk2     15
    4   bk2     15
    5   bk3     17
           price
    books
    bk1       36
    bk2       30
    bk3       17
      books  price
    0   bk1     36
    1   bk2     30
    2   bk3     17

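With as_index=True (the default), the keys you pass to groupby() become the index of the resulting DataFrame. With as_index=False, they stay as ordinary columns and the result gets a plain integer index.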

The benefits you get by setting a column as an index:

  1. Speed. When you select rows by an index value, e.g. grouped.loc['bk1'] on the grouped result, the lookup is faster because the index is backed by a hash table. There is no need to scan the whole books column to find 'bk1'; pandas hashes 'bk1' and finds the row directly.

  2. Ease. With as_index=True you can write grouped.loc['bk1'], which is shorter and faster than the boolean-mask form grouped.loc[grouped.books == 'bk1'] needed when books is a regular column.
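To make the two lookup styles concrete, here is a minimal sketch that reuses the DataFrame from above. The variable names indexed, flat, price_by_label and price_by_mask are only illustrative and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'books': ['bk1', 'bk1', 'bk1', 'bk2', 'bk2', 'bk3'],
    'price': [12, 12, 12, 15, 15, 17],
})

# as_index=True (the default): 'books' becomes the index of the result
indexed = df.groupby('books', as_index=True).sum()

# as_index=False: 'books' stays an ordinary column
flat = df.groupby('books', as_index=False).sum()

# Label-based lookup on the index
price_by_label = indexed.loc['bk1', 'price']

# Boolean-mask lookup on a regular column
price_by_mask = flat.loc[flat['books'] == 'bk1', 'price'].iloc[0]

print(price_by_label, price_by_mask)
```

Both lookups return the same total for 'bk1'. On a frame this small the speed difference is not measurable; it only starts to matter with many groups.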

+36

When calling groupby, the as_index parameter can be set to True or False, depending on whether you want the column you grouped by to become the index of the output.

    import pandas as pd

    table_r = pd.DataFrame({
        'colors': ['orange', 'red', 'orange', 'red'],
        'price': [1000, 2000, 3000, 4000],
        'quantity': [500, 3000, 3000, 4000],
    })

    # DataFrame.sort() was removed from pandas; sort_values() replaces it
    new_group = (table_r.groupby('colors', as_index=True)
                        .count()
                        .sort_values('price', ascending=False))
    print(new_group)

Output:

            price  quantity
    colors
    orange      2         2
    red         2         2

Now with as_index=False:

       colors  price  quantity
    0  orange      2         2
    1     red      2         2

Note that colors is no longer the index once as_index=False is set; it is a regular column and the rows get a default integer index.
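One way to think about as_index=False, offered here as an illustration rather than something the answer states: for a simple aggregation like this it should give the same table as grouping with the default and then calling reset_index(). The names flat and via_reset are hypothetical.

```python
import pandas as pd

table_r = pd.DataFrame({
    'colors': ['orange', 'red', 'orange', 'red'],
    'price': [1000, 2000, 3000, 4000],
    'quantity': [500, 3000, 3000, 4000],
})

# Group key kept as a regular column
flat = table_r.groupby('colors', as_index=False).count()

# Group key becomes the index, then is moved back into a column
via_reset = table_r.groupby('colors').count().reset_index()

print(flat)
print(via_reset)
```

If you already have a result indexed by the group key, reset_index() is therefore a way to get the flat layout without regrouping.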

+5