Python pandas: subtracting datetime.time from datetime.time

I have a DataFrame that contains two columns of datetime.time elements, something like:

    col1             col2
    02:10:00.008209  02:08:38.053145
    02:10:00.567054  02:08:38.053145
    02:10:00.609842  02:08:38.053145
    02:10:00.728153  02:08:38.053145
    02:10:02.394408  02:08:38.053145

How can I create col3, which is the difference between col1 and col2 (preferably in microseconds)?

I searched, but I cannot find a solution here. Does anybody know?

Thanks!

+7
python pandas datetime
3 answers

Do not use datetime.time; use timedelta:

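    import io
    import pandas as pd

    data = """col1 col2
    02:10:00.008209 02:08:38.053145
    02:10:00.567054 02:08:38.053145
    02:10:00.609842 02:08:38.053145
    02:10:00.728153 02:08:38.053145
    02:10:02.394408 02:08:38.053145"""

    # Read the whitespace-separated columns as strings
    # (io.StringIO, because data is a str in Python 3)
    df = pd.read_csv(io.StringIO(data), sep=r"\s+")
    # Parse each column of "HH:MM:SS.ffffff" strings into timedeltas
    df2 = df.apply(pd.to_timedelta)
    diff = df2.col1 - df2.col2
    # Difference as float seconds
    diff.dt.total_seconds()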

The output is the difference in seconds:

    0    81.955064
    1    82.513909
    2    82.556697
    3    82.675008
    4    84.341263
    dtype: float64
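Since the question asks for microseconds, here is a minimal sketch of one way to get integer microseconds from a timedelta difference: floor-divide by a one-microsecond `pd.Timedelta`. The variable names and the two sample rows (copied from the question) are only for illustration.

```python
import pandas as pd

# First two rows of the question's data, parsed straight into timedeltas.
col1 = pd.to_timedelta(pd.Series(["02:10:00.008209", "02:10:00.567054"]))
col2 = pd.to_timedelta(pd.Series(["02:08:38.053145", "02:08:38.053145"]))

diff = col1 - col2

# Floor division by a one-microsecond Timedelta gives whole microseconds as integers.
diff_us = diff // pd.Timedelta(microseconds=1)
print(diff_us.tolist())
```

Dividing with `/` instead of `//` would give floats, which may be preferable if sub-microsecond precision matters.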

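To convert a DataFrame of datetime.time objects to a DataFrame of timedeltas:

    from datetime import time

    # On pandas >= 2.1, applymap is deprecated in favor of DataFrame.map
    df.applymap(time.isoformat).apply(pd.to_timedelta)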
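As a self-contained illustration of that conversion, the sketch below starts from actual `datetime.time` objects rather than strings. The sample values are taken from the question; converting column by column with `Series.map` is just one possible approach and avoids depending on `applymap`.

```python
from datetime import time

import pandas as pd

# A small frame of datetime.time objects (values from the question).
df = pd.DataFrame({
    "col1": [time(2, 10, 0, 8209), time(2, 10, 0, 567054)],
    "col2": [time(2, 8, 38, 53145), time(2, 8, 38, 53145)],
})

# Per column: time -> "HH:MM:SS.ffffff" string -> timedelta.
td = df.apply(lambda col: pd.to_timedelta(col.map(time.isoformat)))

# The new column holds timedeltas, which support further arithmetic.
df["col3"] = td["col1"] - td["col2"]
print(df["col3"])
```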
+3

Are you sure you want to use a DataFrame for datetime.time objects? There is hardly any operation you can conveniently perform on them, especially when they are wrapped in a DataFrame.

It might be better for each column to hold ints representing the total number of microseconds.

You can convert df to a DataFrame that stores microseconds, for example:

    In [71]: df2 = df.applymap(lambda x: ((x.hour*60+x.minute)*60+x.second)*10**6+x.microsecond)

    In [72]: df2
    Out[72]:
             col1        col2
    0  7800008209  7718053145
    1  7800567054  7718053145

From there it is easy to get the desired result:

    In [73]: df2['col1']-df2['col2']
    Out[73]:
    0    81955064
    1    82513909
    dtype: int64
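The arithmetic in the lambda can be checked without pandas. The sketch below wraps it in a helper (the name `time_to_microseconds` is made up for this example) and cross-checks the result with the standard-library idiom of anchoring both times to the same arbitrary date and subtracting.

```python
from datetime import date, datetime, time, timedelta


def time_to_microseconds(t: time) -> int:
    """Microseconds since midnight for a naive datetime.time."""
    return ((t.hour * 60 + t.minute) * 60 + t.second) * 10**6 + t.microsecond


a = time(2, 10, 0, 8209)    # 02:10:00.008209
b = time(2, 8, 38, 53145)   # 02:08:38.053145

diff_us = time_to_microseconds(a) - time_to_microseconds(b)

# Cross-check: combine both times with the same date, subtract,
# and floor-divide the timedelta by one microsecond.
via_timedelta = (
    datetime.combine(date.min, a) - datetime.combine(date.min, b)
) // timedelta(microseconds=1)

print(diff_us, via_timedelta)
```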
+1

pandas converts datetime objects to np.datetime64 values, and the differences between those are np.timedelta64 values.

Consider this:

    In [30]: df
    Out[30]:
                               0                          1
    0 2014-02-28 13:30:19.926778 2014-02-28 13:30:47.178474
    1 2014-02-28 13:30:29.814575 2014-02-28 13:30:51.183349

I can take the difference of the columns with

    In [31]: df[0] - df[1]
    Out[31]:
    0   -00:00:27.251696
    1   -00:00:21.368774
    dtype: timedelta64[ns]

and can therefore apply the timedelta64 conversions. For microseconds:

    (df[0] - df[1]).apply(lambda x: x.astype('timedelta64[us]'))  # no actual difference when displayed

or, for microseconds as integers:

    (df[0] - df[1]).apply(lambda x: x.astype('timedelta64[us]').astype('int'))
    0   -27251696000
    1   -21368774000
    dtype: int64

EDIT: As suggested by @Jeff, the last two expressions can be shortened to

 (df[0] - df[1]).astype('timedelta64[us]') 

and

 (df[0] - df[1]).astype('timedelta64[us]').astype('int') 

for pandas >= 0.13.
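The behavior of `astype` on timedelta columns has changed between pandas versions, so a unit-independent alternative may be useful: dividing the difference by `np.timedelta64(1, 'us')`. The sketch below rebuilds the two-row frame from this answer and is an illustration, not necessarily what the answer's author had in mind. Note that -27.251696 seconds corresponds to -27251696 microseconds.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame with integer column labels 0 and 1.
df = pd.DataFrame({
    0: pd.to_datetime(["2014-02-28 13:30:19.926778", "2014-02-28 13:30:29.814575"]),
    1: pd.to_datetime(["2014-02-28 13:30:47.178474", "2014-02-28 13:30:51.183349"]),
})

delta = df[0] - df[1]  # timedelta Series

# Dividing by a one-microsecond timedelta64 converts to float microseconds.
micro = delta / np.timedelta64(1, "us")
print(micro.tolist())
```

Using `//` instead of `/` returns integers rather than floats.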

+1
