I have a DataFrame sorted by date:
    import pandas as pd

    df = pd.DataFrame({'idx': [1, 1, 1, 2, 2, 2],
                       'date': ['2016-04-30', '2016-05-31', '2016-06-31',
                                '2016-04-30', '2016-05-31', '2016-06-31'],
                       'val': [10, 0, 5, 10, 0, 0],
                       'pct_val': [None, -10, None, None, -10, -10]})
    df = df.sort_values('date')
    print(df)

             date  idx  pct_val  val
    3  2016-04-30    2      NaN   10
    0  2016-04-30    1      NaN   10
    4  2016-05-31    2      -10    0
    1  2016-05-31    1      -10    0
    5  2016-06-31    2      -10    0
    2  2016-06-31    1      NaN    5
I want to group by idx, then apply a cumulative function with some simple logic: if pct_val is null, add val to the running total; otherwise multiply the total by 1 + pct_val/100. In the table below, 'cumsum' shows the result of df.groupby('idx').val.cumsum(), and 'cumulative_func' shows the result I want.
             date  idx  pct_val  val  cumsum  cumulative_func
    3  2016-04-30    2      NaN   10      10               10
    0  2016-04-30    1      NaN   10      10               10
    4  2016-05-31    2      -10    0      10                9
    1  2016-05-31    1      -10    0      10                9
    5  2016-06-31    2      -10    0      10                8
    2  2016-06-31    1      NaN    5      15               14
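To make the rule concrete, here is a minimal sketch of it as a plain Python loop run once per idx group. The helper name `running_total` is hypothetical and only for illustration, and this is just one naive way to spell out the logic, not necessarily an efficient or idiomatic one. Note that it yields 8.1 for the last idx 2 row, where the table above shows 8, presumably rounded.

```python
import pandas as pd

def running_total(vals, pcts):
    """Hypothetical helper: add val when pct_val is null,
    otherwise scale the running total by 1 + pct_val/100."""
    total = 0.0
    out = []
    for val, pct in zip(vals, pcts):
        if pd.isnull(pct):
            total += val
        else:
            total *= 1 + pct / 100.0
        out.append(total)
    return out

df = pd.DataFrame({'idx': [1, 1, 1, 2, 2, 2],
                   'date': ['2016-04-30', '2016-05-31', '2016-06-31',
                            '2016-04-30', '2016-05-31', '2016-06-31'],
                   'val': [10, 0, 5, 10, 0, 0],
                   'pct_val': [None, -10, None, None, -10, -10]})
df = df.sort_values('date')

# Compute the running total separately for each idx group,
# writing the values back at the rows' original index labels.
result = pd.Series(index=df.index, dtype=float)
for _, group in df.groupby('idx'):
    result.loc[group.index] = running_total(group['val'], group['pct_val'])

df['cumulative_func'] = result
print(df)
```

Because each group is processed in the order its rows appear in the sorted frame, the sort by date has to happen before the loop.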
Is there a way to apply a custom cumulative function like this to a DataFrame, or is there a better way to achieve this?