In general, if the dates are completely arbitrary, I think you have to use a Python for-loop over the rows or use df.apply (which also runs a Python loop under the hood).
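For concreteness, here is a minimal sketch of that brute-force row-by-row approach; it builds the sample frame from the values shown in the output further down, and uses a window of (t - 5 minutes, t] (left endpoint excluded), which matches the results below:

import pandas as pd

# sample data matching the output shown later in this answer
df = pd.DataFrame({
    'Date': pd.to_datetime(['2014-11-21 11:00:00', '2014-11-21 11:03:00',
                            '2014-11-21 11:04:00', '2014-11-21 11:05:00',
                            '2014-11-21 11:07:00', '2014-11-21 11:08:00',
                            '2014-11-21 11:12:00', '2014-11-21 11:13:00']),
    'A': [1, 4, 1, 2, 4, 1, 1, 2]})

# for each row, sum A over the trailing 5-minute window (t - 5min, t]
sums = []
for ts in df['Date']:
    mask = (df['Date'] > ts - pd.Timedelta(minutes=5)) & (df['Date'] <= ts)
    sums.append(df.loc[mask, 'A'].sum())
df['rolling_sum'] = sums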
However, if your dates have a common frequency, as in the case above, then there is a trick that should be much faster than using df.apply: upsample the DataFrame to the common frequency - in this case 1 minute - fill the resulting NaNs with zeros, and then call rolling_sum:
In [279]: pd.rolling_sum(df.set_index(['Date']).asfreq('1T').fillna(0),
                         window=5, min_periods=1).reindex(df['Date'])
Out[279]:
                      A
Date
2014-11-21 11:00:00   1
2014-11-21 11:03:00   5
2014-11-21 11:04:00   6
2014-11-21 11:05:00   7
2014-11-21 11:07:00  11
2014-11-21 11:08:00   8
2014-11-21 11:12:00   2
2014-11-21 11:13:00   3
Of course, any time series has a common frequency if you are willing to accept a fine enough granularity, but the required size of df.asfreq(...) can make this trick impractical.
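As a rough feasibility check (a sketch, not part of the original approach), you can estimate how many rows the upsampled frame would have before committing to it:

# number of rows df.asfreq('1T') would create over the full date span
span = df['Date'].max() - df['Date'].min()
n_rows = int(span / pd.Timedelta('1T')) + 1   # 14 rows for the sample data,
                                              # but can explode for long spans
                                              # or very fine frequencies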
Here is an example of a more general approach using df.apply. Note that calling searchsorted depends on df['Date'] being in sorted order.
import numpy as np
import pandas as pd

df = pd.read_csv('data', parse_dates=[0], sep=r',\s*')

# start_index: first row whose Date falls inside the 5-minute window ending at each Date
start_dates = df['Date'] - pd.Timedelta(minutes=5)
df['start_index'] = df['Date'].values.searchsorted(start_dates, side='right')
# end_index: the row itself
df['end_index'] = np.arange(len(df))

def sum_window(row):
    return df['A'].iloc[row['start_index']:row['end_index']+1].sum()

df['rolling_sum'] = df.apply(sum_window, axis=1)

print(df[['Date', 'A', 'rolling_sum']])
gives
                 Date  A  rolling_sum
0 2014-11-21 11:00:00  1            1
1 2014-11-21 11:03:00  4            5
2 2014-11-21 11:04:00  1            6
3 2014-11-21 11:05:00  2            7
4 2014-11-21 11:07:00  4           11
5 2014-11-21 11:08:00  1            8
6 2014-11-21 11:12:00  1            2
7 2014-11-21 11:13:00  2            3
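If the per-row apply turns out to be the bottleneck, the same start_index/end_index columns can also feed a vectorized cumulative-sum difference. This is only a sketch that continues from the snippet above (np, df, start_index and end_index already defined), not part of the original approach:

# the sum over rows start..end equals cumsum[end] - cumsum[start - 1]
csum = df['A'].cumsum().values
start = df['start_index'].values
end = df['end_index'].values
df['rolling_sum_vec'] = csum[end] - np.where(start > 0, csum[start - 1], 0)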
Here is a benchmark comparing the df.asfreq trick and the df.apply call:
import numpy as np
import pandas as pd

df = pd.read_csv('data', parse_dates=[0], sep=r',\s*')

def big_df(df):
    # enlarge the sample data by repeatedly shifting it forward in time
    # and appending it to itself (2**7 copies in total)
    df = df.copy()
    for i in range(7):
        dates = (df['Date'] + pd.Timedelta(df.iloc[-1]['Date'] - df.iloc[0]['Date'])
                 + pd.Timedelta('1 minute'))
        df2 = pd.DataFrame({'Date': dates, 'A': df['A']})
        df = pd.concat([df, df2])
    df = df.reset_index(drop=True)
    return df

def using_apply():
    start_dates = df['Date'] - pd.Timedelta(minutes=5)
    df['start_index'] = df['Date'].values.searchsorted(start_dates, side='right')
    df['end_index'] = np.arange(len(df))

    def sum_window(row):
        return df['A'].iloc[row['start_index']:row['end_index']+1].sum()

    df['rolling_sum'] = df.apply(sum_window, axis=1)
    return df[['Date', 'rolling_sum']]

def using_asfreq():
    result = (pd.rolling_sum(
        df.set_index(['Date']).asfreq('1T').fillna(0),
        window=5, min_periods=1).reindex(df['Date']))
    return result
In [364]: df = big_df(df)

In [367]: %timeit using_asfreq()
1000 loops, best of 3: 1.21 ms per loop

In [368]: %timeit using_apply()
1 loops, best of 3: 208 ms per loop
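The gap comes mainly from where the looping happens: using_apply calls a Python function once per row, while the asfreq/rolling_sum version stays in pandas' compiled code, so the difference should only grow with larger frames.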