Elegant way to convert a numpy array of datetime.timedelta objects to seconds in Python 2.7

I have a numpy array called dt. Each element is of type datetime.timedelta. For example:

    >>> dt[0]
    datetime.timedelta(0, 1, 36000)

How can I convert dt to an array dt_sec that contains only the seconds, without a loop? My current solution (which works, but I don't like it):

    dt_sec = zeros((len(dt), 1))
    for i in range(0, len(dt), 1):
        dt_sec[i] = dt[i].total_seconds()

I tried to use dt.total_seconds(), but of course this did not work. Any idea how to avoid this loop?

thanks

+8
python arrays loops numpy datetime
4 answers
    import numpy as np

    helper = np.vectorize(lambda x: x.total_seconds())
    dt_sec = helper(dt)
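A minimal sketch of how this behaves on sample data. The two-element array below is a hypothetical stand-in for the question's dt, and passing otypes is an optional addition (not part of the answer) that fixes the output dtype, so an empty input does not raise.

```python
import datetime
import numpy as np

# Hypothetical sample data standing in for the question's `dt` array.
dt = np.array([datetime.timedelta(0, 1, 36000),
               datetime.timedelta(minutes=2)])

# np.vectorize still calls the lambda once per element in Python;
# it is a convenience wrapper, not a performance optimization.
to_seconds = np.vectorize(lambda x: x.total_seconds(), otypes=[float])

dt_sec = to_seconds(dt)
print(dt_sec)
```

The result is a one-dimensional float array with the same shape as dt, unlike the (len(dt), 1) column produced by the loop in the question.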
+9

numpy has its own datetime and timedelta types (datetime64 and timedelta64). Just use them ;).

Setup, for example:

    import datetime
    import numpy

    times = numpy.array([datetime.timedelta(0, 1, 36000)])

The code:

    # Dividing by 1000.0 (not 1000) keeps float division under Python 2 as well.
    times.astype("timedelta64[ms]").astype(int) / 1000.0
    #>>> array([ 1.036])
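A variation on the same idea, sketched here as an alternative rather than as the answer's own method: dividing a timedelta64 array by a one-second numpy.timedelta64 yields floating-point seconds directly, without the integer round trip. The sample array and the choice of microsecond resolution are assumptions made for illustration.

```python
import datetime
import numpy as np

# Hypothetical sample data: an object array of datetime.timedelta.
times = np.array([datetime.timedelta(0, 1, 36000),
                  datetime.timedelta(hours=1)])

# Convert to numpy's native timedelta type, keeping microsecond
# resolution (the resolution of datetime.timedelta itself).
native = times.astype("timedelta64[us]")

# Dividing by a one-second timedelta64 gives float seconds.
seconds = native / np.timedelta64(1, "s")
print(seconds)
```

Using microseconds instead of milliseconds avoids silently dropping sub-millisecond precision when the deltas carry it.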

Since people do not seem to realize that this is the best solution, here are some timings of a timedelta64 array versus an array of datetime.timedelta objects:

    SETUP="
    import datetime
    import numpy

    times = numpy.array([datetime.timedelta(0, 1, 36000)] * 100000)
    numpy_times = times.astype('timedelta64[ms]')
    "

    python -m timeit -s "$SETUP" "numpy_times.astype(int) / 1000"
    python -m timeit -s "$SETUP" "numpy.vectorize(lambda x: x.total_seconds())(times)"
    python -m timeit -s "$SETUP" "[delta.total_seconds() for delta in times]"

Results:

    100 loops, best of 3: 4.54 msec per loop
    10 loops, best of 3: 99.5 msec per loop
    10 loops, best of 3: 67.1 msec per loop

The initial conversion to timedelta64 takes about twice as long as the vectorized expression, but every subsequent operation on the timedelta64 array will be about 20 times faster.

If you will never use these timedeltas again, ask yourself why you created datetime.timedelta objects (as opposed to timedelta64s) in the first place, and then use the numpy.vectorize expression. It is less native but, for some reason, faster than the one-off conversion.

+10

I like using np.vectorize, as suggested by prgao. If you just want a Python list, you can also do the following:

 dt_sec = map(datetime.timedelta.total_seconds, dt) 
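Under Python 3, map returns a lazy iterator rather than a list, so the line above needs a wrapper there. A minimal sketch, assuming a hypothetical sample array; numpy.fromiter is shown as one possible way to go straight to a float array.

```python
import datetime
import numpy as np

# Hypothetical sample data standing in for the question's `dt` array.
dt = np.array([datetime.timedelta(0, 1, 36000),
               datetime.timedelta(seconds=5)])

# list(...) forces evaluation on Python 3 and is harmless on Python 2.
dt_sec_list = list(map(datetime.timedelta.total_seconds, dt))

# np.fromiter builds a 1-D float array directly from the iterator.
dt_sec = np.fromiter(map(datetime.timedelta.total_seconds, dt), dtype=float)
print(dt_sec)
```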
0

You can use a list comprehension:

 dt_sec = [delta.total_seconds() for delta in dt] 

This still iterates in Python rather than inside numpy, but it is a fairly quick operation. Wrap the result in numpy.array(...) if you need an array instead of a list.

-2