I have a Pandas DataFrame as shown below.
    df
                                      A         B
    date_time
    2014-07-01 06:03:59.614000  62.1250       NaN
    2014-07-01 06:03:59.692000  62.2500       NaN
    2014-07-01 06:13:34.524000  62.2500  241.0625
    2014-07-01 06:13:34.602000  62.2500  241.5000
    2014-07-01 06:15:05.399000  62.2500  241.3750
    2014-07-01 06:15:05.399000  62.2500  241.2500
    2014-07-01 06:15:42.004000  62.2375  241.2500
    2014-07-01 06:15:42.082000  62.2375  241.3750
    2014-07-01 06:15:42.082000  62.2375  240.2500
I want to change the frequency of this to regular 1-minute intervals, but I get the error below:
    new = df.asfreq('1Min')
    >>error: cannot reindex from a duplicate axis
Now I understand why this is happening. My timestamps have fine (millisecond) granularity but are irregular, so I get several readings per minute, sometimes several per second, and some timestamps repeat exactly. So I tried to truncate the timestamps to minute resolution and drop the duplicates, as shown below.
    # try to convert the index to minutes and drop duplicates
    df['index'] = df.index
    df['minute_index'] = df['index'].apply(lambda x: x.strftime('%Y-%m-%d %H:%M'))
    # keep only the last reading in each minute
    # (older pandas versions spelled these arguments cols= and take_last=True)
    df.drop_duplicates(subset='minute_index', inplace=True, keep='last')
    df_by_minute = df.set_index('minute_index')

    df_by_minute
                            A       B                       index
    minute_index
    2014-07-01 06:03  62.2500     NaN  2014-07-01 06:03:59.692000
    2014-07-01 06:13  62.2500  241.50  2014-07-01 06:13:34.602000
    2014-07-01 06:15  62.2375  240.25  2014-07-01 06:15:42.082000
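As you can see, this does not work: only the minutes that contain readings appear, so the result is still not on a regular 1-minute grid. Can someone help? I am trying to get a function that returns A or B as of a given DateTime, where DateTime will be in 1Min increments.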
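To make the goal concrete, here is a minimal sketch of one possible approach, not necessarily the best one. It assumes a reasonably recent pandas and that "as of" means "the last reading at or before that minute". The frame is rebuilt from the sample above, and `value_as_of` is a hypothetical helper name used only for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame from the post (includes the duplicated timestamps).
stamps = pd.to_datetime([
    "2014-07-01 06:03:59.614", "2014-07-01 06:03:59.692",
    "2014-07-01 06:13:34.524", "2014-07-01 06:13:34.602",
    "2014-07-01 06:15:05.399", "2014-07-01 06:15:05.399",
    "2014-07-01 06:15:42.004", "2014-07-01 06:15:42.082",
    "2014-07-01 06:15:42.082",
])
df = pd.DataFrame(
    {
        "A": [62.125, 62.25, 62.25, 62.25, 62.25, 62.25,
              62.2375, 62.2375, 62.2375],
        "B": [np.nan, np.nan, 241.0625, 241.5, 241.375, 241.25,
              241.25, 241.375, 240.25],
    },
    index=pd.DatetimeIndex(stamps, name="date_time"),
)

# Bin into 1-minute intervals that are closed and labelled on the right edge,
# so the row labelled 06:04 only uses readings from (06:03, 06:04].
# .last() takes the last non-null value of each column within a bin;
# .ffill() carries values forward across minutes that had no readings.
by_minute = df.resample("1min", label="right", closed="right").last().ffill()


def value_as_of(ts, column):
    """Hypothetical helper: last known value of `column` at or before `ts`."""
    return by_minute[column].asof(pd.Timestamp(ts))


print(by_minute)
print(value_as_of("2014-07-01 06:14", "B"))
```

Unlike `asfreq`, `resample` aggregates each bin, so duplicate timestamps do not trigger the reindexing error. The right-closed, right-labelled bins are a design choice to avoid reading values from later in the same minute; the default left-labelled bins would attach the 06:03:59 readings to 06:03 instead.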
python pandas time-series time-frequency
Rhubarb