Pandas groupby: consistent index levels, even if empty

I am trying to use a groupby to create a new DataFrame, but I need the MultiIndex to be consistent: every subcategory should appear in the result whether or not it exists in the data. For example:

    import pandas as pd

    df = pd.DataFrame(
        {'Cat 1': ['A','A','A','B','B','B','B','C','C','C','C','C','D'],
         'Cat 2': ['A','B','A','B','B','B','A','B','B','B','B','B','A'],
         'Num': [1,1,1,1,1,1,1,1,1,1,1,1,1]})

    # Sum Num within each (Cat 1, Cat 2) pair that occurs in the data
    print(df.groupby(['Cat 1','Cat 2']).sum())

This gives output that looks like this:

                 Num
    Cat 1 Cat 2
    A     A        2
          B        1
    B     A        1
          B        3
    C     B        5
    D     A        1

But I would like it to look like this:

                 Num
    Cat 1 Cat 2
    A     A        2
          B        1
    B     A        1
          B        3
    C     A      NaN
          B        5
    D     A        1
          B      NaN

I am also reading in other data that will later be added as a column in this same format, so the resulting DataFrame will look something like this:

                 Num  Num_added_later
    Cat 1 Cat 2
    A     A        2               12
          B        1                5
    B     A        1                5
          B        3                3
    C     A      NaN                5
          B        5                5
    D     A        1                1
          B      NaN                3
python pandas group-by pandas-groupby
2 answers

You can build a new MultiIndex from every combination of the unique values in the two Cat columns, then reindex the grouped result against it:

    import pandas as pd

    # Every (Cat 1, Cat 2) combination, whether or not it occurs in df
    new_index = pd.MultiIndex.from_product(
        [df["Cat 1"].unique(), df["Cat 2"].unique()],
        names=["Cat 1", "Cat 2"])

    df.groupby(['Cat 1','Cat 2']).sum().reindex(new_index)
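For completeness, here is a self-contained sketch of the same idea that also attaches the later column from the question. The names `grouped`, `full_index`, `filled` and `later` are made up for illustration, and the `Num_added_later` values are copied from the question's desired output; nothing here is claimed to be the only way to do it.

```python
import pandas as pd

# Same data as in the question
df = pd.DataFrame({
    'Cat 1': list('AAABBBBCCCCCD'),
    'Cat 2': list('ABABBBABBBBBA'),
    'Num': [1] * 13,
})

grouped = df.groupby(['Cat 1', 'Cat 2']).sum()

# All combinations of the observed category values
full_index = pd.MultiIndex.from_product(
    [df['Cat 1'].unique(), df['Cat 2'].unique()],
    names=['Cat 1', 'Cat 2'],
)

# Missing pairs become NaN rows (Num is upcast to float)
result = grouped.reindex(full_index)

# Variation: fill missing pairs with 0 instead of NaN
filled = grouped.reindex(full_index, fill_value=0)

# Hypothetical later column (values taken from the question's example),
# built on the same index so the assignment aligns row by row
later = pd.Series([12, 5, 5, 3, 5, 5, 1, 3], index=full_index)
result['Num_added_later'] = later

print(result)
```

Because both objects share the complete index, the assignment aligns on the `(Cat 1, Cat 2)` labels rather than on row position, so rows such as `(C, A)` keep a NaN in `Num` but still receive their later value.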


This is a hack! Please use @Psidom's answer.

    df.groupby(['Cat 1','Cat 2']).sum().unstack().stack(dropna=False)

                 Num
    Cat 1 Cat 2
    A     A      2.0
          B      1.0
    B     A      1.0
          B      3.0
    C     A      NaN
          B      5.0
    D     A      1.0
          B      NaN

Well, maybe it is less of a hack than I thought, but ...
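To see why the round trip works, it may help to look at the intermediate wide table. The sketch below stops after `unstack()`, since how `stack` treats NaN depends on the pandas version in use; the name `wide` is just for illustration.

```python
import pandas as pd

# Same data as in the question
df = pd.DataFrame({
    'Cat 1': list('AAABBBBCCCCCD'),
    'Cat 2': list('ABABBBABBBBBA'),
    'Num': [1] * 13,
})

grouped = df.groupby(['Cat 1', 'Cat 2']).sum()

# Pivot the inner level (Cat 2) into columns: one row per Cat 1 value,
# one column per Cat 2 value. Pairs that never occur become NaN cells.
wide = grouped['Num'].unstack()

print(wide)
```

The table is rectangular, so every `(Cat 1, Cat 2)` pair now has a cell. Stacking it back while keeping the NaN cells is what produces the complete index in the answer above.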
