Pandas: checking whether a date is a holiday and assigning a boolean

I have a pandas data frame with a date column, and I'm trying to add a new boolean column indicating whether this date is a holiday or not.

Below is the code, but it doesn't work (all values are False), because the types seem to be different, and I can't figure out how to get the "date" column in the pandas data frame to be of the same type as the holidays:

    from pandas.tseries.holiday import USFederalHolidayCalendar

    cal = USFederalHolidayCalendar()
    holidays = cal.holidays(start=train_df['date'].min(), end=train_df['date'].max()).to_pydatetime()
    train_df['holiday'] = train_df['date'].isin(holidays)
    print(type(train_df['date'][1]))
    print(type(holidays[0]))
2 answers

You do not need to convert anything. Just compare directly. pandas is smart enough to compare many different date and time types with each other. You would need a fairly unusual format to run into date/time compatibility problems.

    import pandas as pd
    from pandas.tseries.holiday import USFederalHolidayCalendar as calendar

    # Build a frame with one row per day in July 2015
    dr = pd.date_range(start='2015-07-01', end='2015-07-31')
    df = pd.DataFrame()
    df['Date'] = dr

    # Look up the federal holidays in that range and flag matching rows
    cal = calendar()
    holidays = cal.holidays(start=dr.min(), end=dr.max())
    df['Holiday'] = df['Date'].isin(holidays)
    print(df)

Result:

             Date  Holiday
    0  2015-07-01    False
    1  2015-07-02    False
    2  2015-07-03     True
    3  2015-07-04    False
    4  2015-07-05    False
    5  2015-07-06    False
    6  2015-07-07    False
    7  2015-07-08    False
    8  2015-07-09    False
    9  2015-07-10    False
    10 2015-07-11    False
    11 2015-07-12    False
    12 2015-07-13    False
    13 2015-07-14    False
    14 2015-07-15    False
    15 2015-07-16    False
    16 2015-07-17    False
    17 2015-07-18    False
    18 2015-07-19    False
    19 2015-07-20    False
    20 2015-07-21    False
    21 2015-07-22    False
    22 2015-07-23    False
    23 2015-07-24    False
    24 2015-07-25    False
    25 2015-07-26    False
    26 2015-07-27    False
    27 2015-07-28    False
    28 2015-07-29    False
    29 2015-07-30    False
    30 2015-07-31    False

Note that July 4, 2015 fell on a Saturday, so the federal holiday was observed on Friday, July 3. That is why July 3 is the row flagged True.
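To see which dates the calendar marks as holidays, and under what name, one option is the return_name=True argument of holidays(). The snippet below is only a sketch for the same July 2015 range; the variable names are illustrative, and the holiday label in the printed output may differ between pandas versions.

```python
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar

cal = USFederalHolidayCalendar()

# With return_name=True the result is a Series: index = holiday date,
# value = holiday name.
named = cal.holidays(start="2015-07-01", end="2015-07-31", return_name=True)
print(named)

observed = pd.Timestamp("2015-07-03")  # Friday, the observed date
actual = pd.Timestamp("2015-07-04")    # Saturday, the calendar date

# Only the observed date appears in the calendar's output.
print(observed in named.index, actual in named.index)
```

If your data should flag the calendar date (July 4) rather than the observed date, you would need a custom calendar; the built-in federal calendar applies the nearest-workday rule.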


I had the same problem as the author, and the fix in the other answer did not work for me. Here is what worked:

    train_df['holiday'] = train_df['date'].dt.date.astype('datetime64[ns]').isin(holidays)
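One plausible reason the direct comparison returns all False is that the date column carries a time of day, while the calendar returns holidays at midnight. This is an assumption about the data, not something stated in the question. Under that assumption, a minimal sketch using .dt.normalize() (which resets each timestamp to midnight) would look like this; the sample rows are made up for illustration.

```python
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar

# Hypothetical data: timestamps that include a time of day.
train_df = pd.DataFrame({
    "date": pd.to_datetime([
        "2015-07-02 09:30:00",
        "2015-07-03 14:15:00",  # observed Independence Day
        "2015-07-06 08:00:00",
    ])
})

cal = USFederalHolidayCalendar()
holidays = cal.holidays(
    start=train_df["date"].min().normalize(),
    end=train_df["date"].max(),
)

# Direct comparison misses: 2015-07-03 14:15 is not equal to 2015-07-03 00:00.
direct = train_df["date"].isin(holidays)

# Normalizing drops the time component while keeping the datetime64 dtype.
train_df["holiday"] = train_df["date"].dt.normalize().isin(holidays)

print(direct.tolist())
print(train_df["holiday"].tolist())
```

Compared with going through .dt.date, this avoids the round trip through Python date objects, which may matter on large frames.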