Changing the time frequency in a pandas DataFrame

I have a Pandas DataFrame as shown below.

    df
                                      A         B
    date_time
    2014-07-01 06:03:59.614000  62.1250       NaN
    2014-07-01 06:03:59.692000  62.2500       NaN
    2014-07-01 06:13:34.524000  62.2500  241.0625
    2014-07-01 06:13:34.602000  62.2500  241.5000
    2014-07-01 06:15:05.399000  62.2500  241.3750
    2014-07-01 06:15:05.399000  62.2500  241.2500
    2014-07-01 06:15:42.004000  62.2375  241.2500
    2014-07-01 06:15:42.082000  62.2375  241.3750
    2014-07-01 06:15:42.082000  62.2375  240.2500

I want to change its frequency to regular 1-minute intervals, but I get the error below:

    new = df.asfreq('1Min')
    >>error: cannot reindex from a duplicate axis

I understand why this is happening. My timestamps are fine-grained (milliseconds) but irregular, so I get several readings per minute, sometimes several per second, and some of them share the same timestamp. So I tried truncating the timestamps to the minute and dropping the duplicates, as shown below.

    # try to convert the index to minutes and drop duplicates
    df['index'] = df.index
    df['minute_index'] = df['index'].apply(lambda x: x.strftime('%Y-%m-%d %H:%M'))
    df.drop_duplicates(cols='minute_index', inplace=True, take_last=True)
    df_by_minute = df.set_index('minute_index')
    df_by_minute

                            A       B                      index
    minute_index
    2014-07-01 06:03  62.2500     NaN 2014-07-01 06:03:59.692000
    2014-07-01 06:13  62.2500  241.50 2014-07-01 06:13:34.602000
    2014-07-01 06:15  62.2375  240.25 2014-07-01 06:15:42.082000

    # now change the frequency to 1 minute but I just get NaNs (!)
    df_by_minute.asfreq('1Min')

                          A   B index
    2014-07-01 06:03:00 NaN NaN   NaT
    2014-07-01 06:04:00 NaN NaN   NaT
    2014-07-01 06:05:00 NaN NaN   NaT
    2014-07-01 06:06:00 NaN NaN   NaT
    2014-07-01 06:07:00 NaN NaN   NaT
    2014-07-01 06:08:00 NaN NaN   NaT
    2014-07-01 06:09:00 NaN NaN   NaT
    2014-07-01 06:10:00 NaN NaN   NaT
    2014-07-01 06:11:00 NaN NaN   NaT
    2014-07-01 06:12:00 NaN NaN   NaT
    2014-07-01 06:13:00 NaN NaN   NaT
    2014-07-01 06:14:00 NaN NaN   NaT
    2014-07-01 06:15:00 NaN NaN   NaT

As you can see, this does not work. Can someone help? I am trying to get a function that returns A or B as of a given DateTime, where DateTime is in 1-minute increments.

python pandas time-series time-frequency
1 answer

I think resample, rather than asfreq, is what you need:

 new = df.resample('T', how='mean') 

The how parameter also accepts "last" or "first".
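Newer pandas releases dropped the how argument in favour of calling the aggregation on the resampler itself, so the line above may not run as written on a current install. Below is a minimal sketch of the same idea in that style, using the sample data from the question. The names by_minute_mean, by_minute_last and as_of are just illustrative, and the forward fill is an assumption about what "as of" should mean for minutes that have no readings.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question: irregular, with duplicate timestamps.
idx = pd.to_datetime([
    "2014-07-01 06:03:59.614",
    "2014-07-01 06:03:59.692",
    "2014-07-01 06:13:34.524",
    "2014-07-01 06:13:34.602",
    "2014-07-01 06:15:05.399",
    "2014-07-01 06:15:05.399",
    "2014-07-01 06:15:42.004",
    "2014-07-01 06:15:42.082",
    "2014-07-01 06:15:42.082",
])
df = pd.DataFrame(
    {
        "A": [62.125, 62.25, 62.25, 62.25, 62.25, 62.25,
              62.2375, 62.2375, 62.2375],
        "B": [np.nan, np.nan, 241.0625, 241.5, 241.375, 241.25,
              241.25, 241.375, 240.25],
    },
    index=idx,
)
df.index.name = "date_time"

# Method-style equivalent of resample(..., how='mean'):
# one row per minute, each holding the mean of the readings in that minute.
by_minute_mean = df.resample("1min").mean()

# Equivalent of how='last': the last non-NaN reading within each minute.
by_minute_last = df.resample("1min").last()

# Assumption: for minutes without readings, carry the previous value forward.
as_of = by_minute_last.ffill()

print(as_of)
```

Each row is labelled with the start of its minute, so the 06:15:00 row holds the last reading taken anywhere between 06:15:00 and 06:15:59. Unlike asfreq, resample groups the rows into bins before aggregating, which is why the duplicate timestamps are not a problem here.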
