I just updated pandas from 0.17.1 to 0.18.1, and while updating existing code I think I found a problem with the new resampling API described in the links below. According to that documentation, df3_resample and df4_resample in my example below should return the same DataFrame; however, df4_resample throws an exception:
Exception: Column(s) A already selected
http://pandas.pydata.org/pandas-docs/version/0.18.0/whatsnew.html#whatsnew-0180-breaking-resample
http://pandas.pydata.org/pandas-docs/version/0.18.1/whatsnew.html#groupby-syntax-with-window-and-resample-operations
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10, 4), columns=list('ABCD'),
                  index=pd.date_range('2010-01-01 09:00:00', periods=10, freq='s'))
df['item'] = 'item_a'  # add column for groupby

# THIS WORKS
df1_resample = df.groupby('item').resample('2s').agg({'A': np.mean, 'B': np.max}).reset_index()
print(df1_resample)

# THIS WORKS
df2_resample = df.resample('2s').agg({'A': {'A_mean': np.mean, 'A_max': np.max}}).reset_index()
print(df2_resample)

# THIS WORKS
df3_resample = df.groupby('item').apply(lambda x: x.resample('2s').agg({'A': {'A_mean': np.mean, 'A_max': np.max}})).reset_index()
print(df3_resample)

# THIS DOESN'T WORK
df4_resample = df.groupby('item').resample('2s').agg({'A': {'A_mean': np.mean, 'A_max': np.max}})
print(df4_resample)
Output:
     item             level_1         A         B
0  item_a 2010-01-01 09:00:00  0.611660  0.739640
1  item_a 2010-01-01 09:00:02  0.615876  0.880113
2  item_a 2010-01-01 09:00:04  0.218292  0.441504
3  item_a 2010-01-01 09:00:06  0.753698  0.637787
4  item_a 2010-01-01 09:00:08  0.471272  0.474738

                index         A
                         A_mean     A_max
0 2010-01-01 09:00:00  0.611660  0.813038
1 2010-01-01 09:00:02  0.615876  0.994657
2 2010-01-01 09:00:04  0.218292  0.233478
3 2010-01-01 09:00:06  0.753698  0.848107
4 2010-01-01 09:00:08  0.471272  0.610592

     item             level_1         A
                                 A_mean     A_max
0  item_a 2010-01-01 09:00:00  0.611660  0.813038
1  item_a 2010-01-01 09:00:02  0.615876  0.994657
2  item_a 2010-01-01 09:00:04  0.218292  0.233478
3  item_a 2010-01-01 09:00:06  0.753698  0.848107
4  item_a 2010-01-01 09:00:08  0.471272  0.610592

  File "<some_file.py>", line 29, in <module>
    df4_resample = df.groupby('item').resample('2s').agg({'A': {'A_mean': np.mean, 'A_max': np.max}})
  File "C:\Anaconda2\lib\site-packages\pandas\tseries\resample.py", line 293, in aggregate
    result, how = self._aggregate(arg, *args, **kwargs)
  File "C:\Anaconda2\lib\site-packages\pandas\core\base.py", line 505, in _aggregate
    result = list(_agg(arg, _agg_1dim).values())
  File "C:\Anaconda2\lib\site-packages\pandas\core\base.py", line 496, in _agg
    result[fname] = func(fname, agg_how)
  File "C:\Anaconda2\lib\site-packages\pandas\core\base.py", line 479, in _agg_1dim
    return colg.aggregate(how, _level=(_level or 0) + 1)
  File "C:\Anaconda2\lib\site-packages\pandas\tseries\resample.py", line 293, in aggregate
    result, how = self._aggregate(arg, *args, **kwargs)
  File "C:\Anaconda2\lib\site-packages\pandas\core\base.py", line 528, in _aggregate
    result = _agg(arg, lambda fname,
  File "C:\Anaconda2\lib\site-packages\pandas\core\base.py", line 496, in _agg
    result[fname] = func(fname, agg_how)
  File "C:\Anaconda2\lib\site-packages\pandas\core\base.py", line 529, in <lambda>
    agg_how: _agg_1dim(self._selection, agg_how))
  File "C:\Anaconda2\lib\site-packages\pandas\core\base.py", line 475, in _agg_1dim
    colg = self._gotitem(name, ndim=1, subset=subset)
  File "C:\Anaconda2\lib\site-packages\pandas\core\base.py", line 680, in _gotitem
    groupby=self._groupby[key],
  File "C:\Anaconda2\lib\site-packages\pandas\core\base.py", line 326, in __getitem__
    raise Exception('Column(s) %s already selected' % self._selection)
Exception: Column(s) A already selected
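For comparison, here is a minimal sketch of one possible way to get the same per-item mean and max of column A without chaining groupby and resample with a nested dict. It does not fix or explain the exception above, and it is not taken from the linked documentation. It assumes that grouping on `item` together with `pd.Grouper(freq='2s')` is an acceptable substitute for `resample('2s')`; as far as I can tell, the groupby form only returns time bins that actually contain rows, which makes no difference for this contiguous sample data. The helper name `resample_by_item` and the fixed seed are hypothetical and only there for illustration.

```python
import numpy as np
import pandas as pd


def resample_by_item(df, freq="2s"):
    """Hypothetical helper: mean and max of column A per item and time bin."""
    # Group on the key column plus fixed-width time bins over the DatetimeIndex.
    grouped = df.groupby(["item", pd.Grouper(freq=freq)])["A"]
    # A list of function names yields one output column per function.
    out = grouped.agg(["mean", "max"])
    # Rename afterwards instead of relying on nested-dict renaming inside agg.
    out = out.rename(columns={"mean": "A_mean", "max": "A_max"})
    return out.reset_index()


# Same shape of data as in the question, with a fixed seed (an assumption).
rng = np.random.RandomState(0)
df = pd.DataFrame(
    rng.rand(10, 4),
    columns=list("ABCD"),
    index=pd.date_range("2010-01-01 09:00:00", periods=10, freq="s"),
)
df["item"] = "item_a"  # add column for groupby

result = resample_by_item(df)
print(result)
```

The result has flat columns (`A_mean`, `A_max`) rather than the two-level column header that the nested-dict form produces, which may or may not be what you want.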