Pandas groupby: consistent levels, even if empty

Question

I am trying to use groupby to create a new DataFrame, but I need the MultiIndex to be consistent: every subcategory should appear in the result, whether or not it exists in the data. For example:

    import pandas as pd

    df = pd.DataFrame(
        {'Cat 1': ['A','A','A','B','B','B','B','C','C','C','C','C','D'],
         'Cat 2': ['A','B','A','B','B','B','A','B','B','B','B','B','A'],
         'Num': [1,1,1,1,1,1,1,1,1,1,1,1,1]})

    # Sum Num for each (Cat 1, Cat 2) pair that occurs in the data
    print(df.groupby(['Cat 1','Cat 2']).sum())

This produces the following output:

                 Num
    Cat 1 Cat 2
    A     A        2
          B        1
    B     A        1
          B        3
    C     B        5
    D     A        1

But I would like it to look like this:

                 Num
    Cat 1 Cat 2
    A     A        2
          B        1
    B     A        1
          B        3
    C     A      NaN
          B        5
    D     A        1
          B      NaN

Later I will read in other data that adds a column in this same format, so the resulting DataFrame will look something like this:

                 Num  Num_added_later
    Cat 1 Cat 2
    A     A        2               12
          B        1                5
    B     A        1                5
          B        3                3
    C     A      NaN                5
          B        5                5
    D     A        1                1
          B      NaN                3

+7

python pandas group-by pandas-groupby

David folkner Feb 02 '17 at 19:57

2 answers

This is a hack! Please use @Psidom's answer.

    df.groupby(['Cat 1','Cat 2']).sum().unstack().stack(dropna=False)

                 Num
    Cat 1 Cat 2
    A     A      2.0
          B      1.0
    B     A      1.0
          B      3.0
    C     A      NaN
          B      5.0
    D     A      1.0
          B      NaN

Well, maybe less of a hack, but ...

+4

piRSquared Feb 02 '17 at 20:11

Psidom · Accepted Answer · 2017-02-02T20:06:43+0000

You can build a new index from the Cartesian product of the unique values in the two Cat columns and reindex your result against it:

    import pandas as pd

    new_index = pd.MultiIndex.from_product(
        [df["Cat 1"].unique(), df["Cat 2"].unique()],
        names=["Cat 1", "Cat 2"])

    df.groupby(['Cat 1','Cat 2']).sum().reindex(new_index)
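For reference, here is a self-contained sketch of the reindex approach applied to the data from the question, followed by attaching the column that arrives later. The names `full_index`, `result` and `later` are made up for illustration, and the `Num_added_later` values are simply copied from the question's desired output; how that data is really loaded is not shown in the question.

```python
import pandas as pd

# Data from the question
df = pd.DataFrame({
    "Cat 1": list("AAABBBBCCCCCD"),
    "Cat 2": list("ABABBBABBBBBA"),
    "Num": [1] * 13,
})

# Every (Cat 1, Cat 2) combination, whether or not it occurs in df
full_index = pd.MultiIndex.from_product(
    [df["Cat 1"].unique(), df["Cat 2"].unique()],
    names=["Cat 1", "Cat 2"],
)

# Missing combinations show up as NaN rows after reindexing
result = df.groupby(["Cat 1", "Cat 2"]).sum().reindex(full_index)

# Hypothetical later data (values taken from the question's desired
# output), indexed the same way so assignment aligns row by row
later = pd.Series(
    [12, 5, 5, 3, 5, 5, 1, 3],
    index=full_index,
    name="Num_added_later",
)
result["Num_added_later"] = later

print(result)
```

If zeros suit the missing combinations better than NaN, passing `fill_value=0` to `reindex` should do that.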
