Pandas business day offset performance

For a large number of dates, I need to calculate the next business day, taking holidays into account.

I am currently using something like the code below, which I pasted from an IPython notebook:

    import pandas as pd
    from pandas.tseries.holiday import USFederalHolidayCalendar

    cal = USFederalHolidayCalendar()
    bday_offset = lambda n: pd.datetools.offsets.CustomBusinessDay(n, calendar=cal)

    mydate = pd.to_datetime("12/24/2014")

    %timeit with_holiday = mydate + bday_offset(1)
    %timeit without_holiday = mydate + pd.datetools.offsets.BDay(1)

On my computer, the with_holiday line runs in ~12 milliseconds, and the without_holiday line runs in ~15 microseconds.

Is there a way to speed up bday_offset?

1 answer

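I think building the offset inside a lambda is what slows it down: every call to bday_offset constructs a new CustomBusinessDay, and that construction is the expensive step. Consider this approach instead (taken more or less directly from the documentation):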

    from pandas.tseries.offsets import CustomBusinessDay

    bday_us = CustomBusinessDay(calendar=USFederalHolidayCalendar())
    mydate + bday_us

    Out[13]: Timestamp('2014-12-26 00:00:00')

The first part (creating the offset) is slow, but you only need to do it once. The second part (adding the offset to a date) is very fast.

    %timeit bday_us = CustomBusinessDay(calendar=USFederalHolidayCalendar())
    10 loops, best of 3: 66.5 ms per loop

    %timeit mydate + bday_us
    10000 loops, best of 3: 44 µs per loop
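Since the question is about many dates, the same idea carries over: build the offset once and reuse it for every date. Here is a minimal sketch, assuming a small made-up list of sample dates (`dates` and `next_days` are names chosen only for illustration, and the imports use `pandas.tseries` rather than the older `pd.datetools` path):

```python
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay

# Build the holiday-aware offset a single time (the slow step).
bday_us = CustomBusinessDay(calendar=USFederalHolidayCalendar())

# Hypothetical sample dates, not taken from the question.
dates = pd.to_datetime(["2014-12-24", "2014-12-31", "2015-01-16"])

# Reuse the same offset object for every date (the fast step).
next_days = [d + bday_us for d in dates]

print(next_days)
```

The point of the design is simply that the offset object is created outside the loop, so its construction cost is paid once rather than once per date.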

For an apples-to-apples comparison, here are the timings of the original code on my machine:

    %timeit with_holiday = mydate + bday_offset(1)
    10 loops, best of 3: 23.1 ms per loop

    %timeit without_holiday = mydate + pd.datetools.offsets.BDay(1)
    10000 loops, best of 3: 36.6 µs per loop
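If adding the offset date by date is still too slow for a very large set, one alternative that is not part of the approach above is NumPy's `np.busday_offset`, which works on whole arrays at once. This is only a rough sketch under a few assumptions: the sample dates are made up, the holiday range passed to `cal.holidays` has to cover your data, and `roll="backward"` is my choice so that a date falling on a weekend or holiday moves to the following business day, which is how adding one business day behaves in pandas.

```python
import numpy as np
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar

# Hypothetical sample dates, not taken from the question.
dates = pd.to_datetime(["2014-12-24", "2014-12-31", "2015-01-16"])

# Compute the holidays once, for a range assumed to cover all the dates.
cal = USFederalHolidayCalendar()
holidays = cal.holidays(start=pd.Timestamp("2014-01-01"),
                        end=pd.Timestamp("2015-12-31"))

# np.busday_offset expects day-resolution datetime64 arrays.
holidays_d = holidays.to_numpy().astype("datetime64[D]")
dates_d = dates.to_numpy().astype("datetime64[D]")

# Shift every date by one business day in a single vectorized call.
shifted = np.busday_offset(dates_d, 1, roll="backward", holidays=holidays_d)
next_bdays = pd.DatetimeIndex(shifted)

print(next_bdays)
```

I have not timed this against the CustomBusinessDay version, so treat it as something to benchmark on your own data rather than a guaranteed speedup.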