I am trying to use groupby to create a new DataFrame, but I need the MultiIndex to be consistent: every subcategory should appear under every category, whether or not it exists in the data. Here is what I have:
    import pandas as pd

    df = pd.DataFrame(
        {'Cat 1': ['A','A','A','B','B','B','B','C','C','C','C','C','D'],
         'Cat 2': ['A','B','A','B','B','B','A','B','B','B','B','B','A'],
         'Num': [1,1,1,1,1,1,1,1,1,1,1,1,1]})

    # Sum 'Num' for each (Cat 1, Cat 2) pair that is present in the data
    print(df.groupby(['Cat 1', 'Cat 2']).sum())
This gives output that looks like this:

                 Num
    Cat 1 Cat 2
    A     A        2
          B        1
    B     A        1
          B        3
    C     B        5
    D     A        1
But I would like it to look like this:

                 Num
    Cat 1 Cat 2
    A     A        2
          B        1
    B     A        1
          B        3
    C     A      NaN
          B        5
    D     A        1
          B      NaN
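For reference, one possible way to get every combination is to reindex the grouped result against the full product of the two categories. This is only a sketch: `MultiIndex.from_product` and `reindex` are standard pandas calls, but the names `grouped`, `full_index`, and `result` are my own, and it assumes the complete set of categories is whatever appears in each column.

```python
import pandas as pd

df = pd.DataFrame({
    'Cat 1': ['A','A','A','B','B','B','B','C','C','C','C','C','D'],
    'Cat 2': ['A','B','A','B','B','B','A','B','B','B','B','B','A'],
    'Num': [1,1,1,1,1,1,1,1,1,1,1,1,1],
})

grouped = df.groupby(['Cat 1', 'Cat 2']).sum()

# Assumption: the full category sets are the unique values in each column.
full_index = pd.MultiIndex.from_product(
    [sorted(df['Cat 1'].unique()), sorted(df['Cat 2'].unique())],
    names=['Cat 1', 'Cat 2'],
)

# Pairs that never occur in the data (C/A and D/B here) come back as NaN.
result = grouped.reindex(full_index)
print(result)
```

Because NaN is a float, the `Num` column is upcast from integer to float after the reindex.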
I am then reading in other data that adds a column in this same format, so the resulting DataFrame will look something like this:

                 Num  Num_added_later
    Cat 1 Cat 2
    A     A        2               12
          B        1                5
    B     A        1                5
          B        3                3
    C     A      NaN                5
          B        5                5
    D     A        1                1
          B      NaN                3
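To illustrate why the consistent index matters, here is a rough sketch of attaching the later column. It assumes the other data arrives as a Series indexed by the same `(Cat 1, Cat 2)` pairs; the names `full_index`, `result`, and `added` are hypothetical, and the values are copied from the table above.

```python
import pandas as pd

# Assumption: both frames share the full 4 x 2 product index.
full_index = pd.MultiIndex.from_product(
    [['A', 'B', 'C', 'D'], ['A', 'B']],
    names=['Cat 1', 'Cat 2'],
)

result = pd.DataFrame(
    {'Num': [2, 1, 1, 3, float('nan'), 5, 1, float('nan')]},
    index=full_index,
)

# Hypothetical later data, keyed by the same index.
added = pd.Series([12, 5, 5, 3, 5, 5, 1, 3],
                  index=full_index, name='Num_added_later')

# Assignment aligns on the index labels, not on row position.
result['Num_added_later'] = added
print(result)
```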