description is the name of the columns. You can get rid of it:
    In [74]: dfUnstackedNoIndex.columns.name = None

    In [75]: dfUnstackedNoIndex
    Out[75]:
      state  year  thing1  thing2
    0     a     1       4     NaN
    1     a     2       3       1
    2     b     1       2       4
    3     b     2     NaN       6
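For a version you can run from scratch, here is a minimal sketch. The long-format frame `df` and its `value` column are assumptions made up for illustration (the original input is not shown here); only the last two steps mirror the session above.

```python
import pandas as pd

# Hypothetical long-format data, chosen to reproduce the values shown above.
df = pd.DataFrame({
    "state": ["a", "a", "a", "b", "b", "b"],
    "year": [1, 2, 2, 1, 1, 2],
    "description": ["thing1", "thing1", "thing2", "thing1", "thing2", "thing2"],
    "value": [4, 3, 1, 2, 4, 6],
})

# Moving the 'description' index level into the columns also carries
# its name over, so the columns end up named 'description'.
unstacked = df.set_index(["state", "year", "description"])["value"].unstack("description")
print(unstacked.columns.name)  # description

# reset_index() turns 'state' and 'year' back into ordinary columns,
# but the columns axis still keeps the name 'description'.
flat = unstacked.reset_index()

# Clearing the name removes the 'description' label from the display.
flat.columns.name = None
print(flat)
```

Setting `columns.name` only changes the label on the columns axis; the data and the column labels themselves are untouched.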
Where the column names come from may become clearer when you look at what happens when you unstack twice:
    In [107]: dfUnstacked2 = dfUnstacked.unstack('state')

    In [108]: dfUnstacked2
    Out[108]:
    description  thing1      thing2
    state             a   b       a   b
    year
    1                 4   2     NaN   4
    2                 3 NaN       1   6
Now dfUnstacked2.columns is a MultiIndex. Each level has a name that corresponds to the name of the index level that was moved into the columns.
    In [111]: dfUnstacked2.columns
    Out[111]:
    MultiIndex(levels=[[u'thing1', u'thing2'], [u'a', u'b']],
               labels=[[0, 0, 1, 1], [0, 1, 0, 1]],
               names=[u'description', u'state'])
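The same double unstack can be sketched as a standalone script. The Series `values` below is a hypothetical stand-in for the original data, built to match the numbers shown above. Note that the repr differs across pandas versions (for instance, newer releases use `codes` where older ones printed `labels`), so this sketch checks the level names rather than the printed form.

```python
import pandas as pd

# Hypothetical Series with a three-level index (state, year, description).
idx = pd.MultiIndex.from_tuples(
    [("a", 1, "thing1"), ("a", 2, "thing1"), ("a", 2, "thing2"),
     ("b", 1, "thing1"), ("b", 1, "thing2"), ("b", 2, "thing2")],
    names=["state", "year", "description"],
)
values = pd.Series([4, 3, 1, 2, 4, 6], index=idx)

# Each unstack moves one index level into the columns, keeping its name.
wide = values.unstack("description").unstack("state")

print(list(wide.columns.names))  # ['description', 'state']
print(list(wide.index.names))    # ['year']

# Selecting on the outer column level leaves columns keyed by 'state'.
thing1 = wide["thing1"]
print(list(thing1.columns))      # ['a', 'b']
```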
Column names and index names appear in the same place in the string representation of a DataFrame, so it can be difficult to tell which is which. You can figure this out by checking df.index.names and df.columns.names.
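As a small illustration of that check, the sketch below wraps both attributes in a helper. `axis_names` is a hypothetical convenience function, not part of pandas, and the frame is made-up example data.

```python
import pandas as pd

def axis_names(df):
    """Hypothetical helper: return the level names of both axes."""
    return {"index": list(df.index.names), "columns": list(df.columns.names)}

# Made-up frame with a named index and a named columns axis.
df = pd.DataFrame(
    [[4, None], [3, 1]],
    index=pd.Index([1, 2], name="year"),
    columns=pd.Index(["thing1", "thing2"], name="description"),
)

print(axis_names(df))
# {'index': ['year'], 'columns': ['description']}
```

An unnamed axis shows up as `[None]`, which makes it easy to see which of the two labels in the printed frame belongs to which axis.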
unutbu