How to implement my own describe() function for use in resample()

I work with time series data that represent vectors (magnitude and direction). I want to resample my data and pass a describe function as the how parameter.

However, the built-in describe method uses the ordinary arithmetic mean, and I want to use a special function for the average direction. Because of this, I wrote my own describe function, based on the implementation of pandas.Series.describe():

    import numpy as np
    from pandas import Series

    def directionAverage(x):
        # circular mean of angles in radians, shifted into [0, 2*pi)
        result = np.arctan2(np.mean(np.sin(x)), np.mean(np.cos(x)))
        if result < 0:
            result += 2*np.pi
        return result

    def directionDescribe(x):
        # same statistics as describe(), but with the circular mean
        data = [directionAverage(x), x.std(), x.min(), x.quantile(0.25),
                x.median(), x.quantile(0.75), x.max()]
        names = ['mean', 'std', 'min', '25%', '50%', '75%', 'max']
        return Series(data, index=names)
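To illustrate why the ordinary mean is not suitable for directions, here is a small sketch with two made-up headings, 340 and 40 degrees. The function is the same circular mean as above, renamed `direction_average` only for this example; the sample values are assumptions, not my real data.

```python
import numpy as np

def direction_average(x):
    # circular mean of angles in radians, shifted into [0, 2*pi)
    result = np.arctan2(np.mean(np.sin(x)), np.mean(np.cos(x)))
    if result < 0:
        result += 2 * np.pi
    return result

# Hypothetical headings on either side of north, converted to radians
angles = np.radians([340.0, 40.0])

# The arithmetic mean lands near 190 degrees, roughly the opposite direction
arithmetic = np.degrees(np.mean(angles))

# The circular mean lands near 10 degrees, between the two headings
circular = np.degrees(direction_average(angles))

print(arithmetic, circular)
```

The two headings point roughly north, but the arithmetic mean points roughly south, which is why I need my own average.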

The problem is that when I do this:

    df['direction'].resample('10Min', how=directionDescribe)

I get this exception (showing the last few lines):

      File "C:\Python26\lib\site-packages\pandas\core\generic.py", line 234, in resample
        return sampler.resample(self)
      File "C:\Python26\lib\site-packages\pandas\tseries\resample.py", line 83, in resample
        rs = self._resample_timestamps(obj)
      File "C:\Python26\lib\site-packages\pandas\tseries\resample.py", line 217, in _resample_timestamps
        result = grouped.aggregate(self._agg_method)
      File "C:\Python26\lib\site-packages\pandas\core\groupby.py", line 1626, in aggregate
        result = self._aggregate_generic(arg, *args, **kwargs)
      File "C:\Python26\lib\site-packages\pandas\core\groupby.py", line 1681, in _aggregate_generic
        return self._aggregate_item_by_item(func, *args, **kwargs)
      File "C:\Python26\lib\site-packages\pandas\core\groupby.py", line 1706, in _aggregate_item_by_item
        result[item] = colg.aggregate(func, *args, **kwargs)
      File "C:\Python26\lib\site-packages\pandas\core\groupby.py", line 1357, in aggregate
        result = self._aggregate_named(func_or_funcs, *args, **kwargs)
      File "C:\Python26\lib\site-packages\pandas\core\groupby.py", line 1441, in _aggregate_named
        raise Exception('Must produce aggregated value')

So my question is: how can I implement my own describe function so that it works with resample?

1 answer

Okay, I think I get it. Instead of resample, you can use groupby, where each group is a unit of time. You can then apply a function of your choice to each group, for example your directionDescribe function.

Please note that I import TimeGrouper to allow grouping by time intervals.

    import pandas as pd
    import numpy as np
    from pandas.tseries.resample import TimeGrouper

    # group your data
    new_data = df['direction'].groupby(TimeGrouper('10min'))

    # apply your function to the grouped data
    new_data.apply(directionDescribe)
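In newer pandas releases the TimeGrouper import above may no longer work. As far as I know, `pd.Grouper(freq=...)` is the replacement. Below is a minimal sketch of the same idea under that assumption. The sample DataFrame, the random values, and the snake_case function names are made up for illustration.

```python
import numpy as np
import pandas as pd

def direction_average(x):
    # circular mean of angles in radians, shifted into [0, 2*pi)
    result = np.arctan2(np.mean(np.sin(x)), np.mean(np.cos(x)))
    if result < 0:
        result += 2 * np.pi
    return result

def direction_describe(x):
    # describe()-style summary that uses the circular mean
    data = [direction_average(x), x.std(), x.min(), x.quantile(0.25),
            x.median(), x.quantile(0.75), x.max()]
    names = ['mean', 'std', 'min', '25%', '50%', '75%', 'max']
    return pd.Series(data, index=names)

# Hypothetical data: one direction reading (radians) per minute for 30 minutes
index = pd.date_range('2013-01-01', periods=30, freq='min')
rng = np.random.default_rng(0)
df = pd.DataFrame({'direction': rng.uniform(0, 2 * np.pi, size=30)},
                  index=index)

# Group into 10-minute bins and apply the custom describe to each bin
grouped = df['direction'].groupby(pd.Grouper(freq='10min'))
summary = grouped.apply(direction_describe)

# summary is indexed by (bin start, statistic); unstack gives one row per bin
wide = summary.unstack()
print(wide)
```

Because the function returns a Series rather than a single value, apply is used here instead of an aggregation; the unstack step is only there to get one column per statistic.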