How to filter a numpy.ndarray by date?

I have a 2d numpy.array where the first column contains datetime.datetime objects and the second column contains integers:

    A = array([[2002-03-14 19:57:38, 197],
               [2002-03-17 16:31:33, 237],
               [2002-03-17 16:47:18, 238],
               [2002-03-17 18:29:31, 239],
               [2002-03-17 20:10:11, 240],
               [2002-03-18 16:18:08, 252],
               [2002-03-23 23:44:38, 327],
               [2002-03-24 09:52:26, 334],
               [2002-03-25 16:04:21, 352],
               [2002-03-25 18:53:48, 353]], dtype=object)

What I would like to do is select all rows for a specific date, something like

    A[first_column.date()==datetime.date(2002,3,17)]
    array([[2002-03-17 16:31:33, 237],
           [2002-03-17 16:47:18, 238],
           [2002-03-17 18:29:31, 239],
           [2002-03-17 20:10:11, 240]], dtype=object)

How can I achieve this?

Thank you for understanding:)

python numpy scipy datetime
2 answers

You can do it like this:

    import datetime

    # Bounds of the day of interest
    from_date = datetime.datetime(2002, 3, 17, 0, 0, 0)
    to_date = from_date + datetime.timedelta(days=1)

    # Boolean mask over the first column
    idx = (A[:,0] > from_date) & (A[:,0] <= to_date)
    print(A[idx])
    # array([[2002-03-17 16:31:33, 237],
    #        [2002-03-17 16:47:18, 238],
    #        [2002-03-17 18:29:31, 239],
    #        [2002-03-17 20:10:11, 240]], dtype=object)

A[:,0] is the first column of A.

Unfortunately, comparing A[:,0] with a datetime.date object raises a TypeError. However, comparing it with a datetime.datetime object works:

    In [63]: A[:,0]>datetime.datetime(2002,3,17,0,0,0)
    Out[63]: array([False, True, True, True, True, True, True, True, True, True], dtype=bool)

Unfortunately, the chained comparison

 datetime.datetime(2002,3,17,0,0,0)<A[:,0]<=datetime.datetime(2002,3,18,0,0,0) 

also raises a TypeError, since it calls the datetime.datetime __lt__ method instead of the numpy array's __lt__. Perhaps this is a bug.
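For context, Python evaluates a chained comparison a < x <= b as (a < x) and (x <= b), so even where each half works on its own, the implicit "and" cannot combine two boolean arrays. The sketch below assumes Python 3 and a recent NumPy, where the exception may be a ValueError rather than the TypeError reported above; the small column col and the bounds lo and hi are made up for illustration.

```python
import datetime
import numpy as np

# Hypothetical three-element column of datetime objects (illustration only)
col = np.array([datetime.datetime(2002, 3, 14, 19, 57, 38),
                datetime.datetime(2002, 3, 17, 16, 31, 33),
                datetime.datetime(2002, 3, 18, 16, 18, 8)], dtype=object)
lo = datetime.datetime(2002, 3, 17)
hi = datetime.datetime(2002, 3, 18)

# The chained form expands to (lo < col) and (col <= hi); the exact
# exception type depends on the Python/NumPy versions, so catch both.
try:
    lo < col <= hi
    chained_failed = False
except (TypeError, ValueError) as exc:
    chained_failed = True
    print("chained comparison failed with", type(exc).__name__)

# Each half on its own is an ordinary element-wise comparison
left = col > lo
right = col <= hi
print(left, right)
```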

In any case, this is not difficult to work around; you can write

    In [69]: (A[:,0]>datetime.datetime(2002,3,17,0,0,0)) & (A[:,0]<=datetime.datetime(2002,3,18,0,0,0))
    Out[69]: array([False, True, True, True, True, False, False, False, False, False], dtype=bool)

Since this gives you a boolean array, you can use it as a "fancy index" (a boolean mask) for A, which gives the desired result.
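Putting the pieces together, here is a minimal sketch of a helper that takes a datetime.date directly. The function name rows_on_date is hypothetical, and it uses a half-open range (start inclusive, end exclusive) instead of the bounds above, on the assumption that a row stamped exactly at midnight should count toward the day that is starting.

```python
import datetime
import numpy as np

# The array from the question, rebuilt with real datetime objects
A = np.array([
    [datetime.datetime(2002, 3, 14, 19, 57, 38), 197],
    [datetime.datetime(2002, 3, 17, 16, 31, 33), 237],
    [datetime.datetime(2002, 3, 17, 16, 47, 18), 238],
    [datetime.datetime(2002, 3, 17, 18, 29, 31), 239],
    [datetime.datetime(2002, 3, 17, 20, 10, 11), 240],
    [datetime.datetime(2002, 3, 18, 16, 18, 8), 252],
    [datetime.datetime(2002, 3, 23, 23, 44, 38), 327],
    [datetime.datetime(2002, 3, 24, 9, 52, 26), 334],
    [datetime.datetime(2002, 3, 25, 16, 4, 21), 352],
    [datetime.datetime(2002, 3, 25, 18, 53, 48), 353],
], dtype=object)


def rows_on_date(arr, day):
    """Return the rows of arr whose first column falls on the given date.

    Hypothetical helper: converts the date into datetime bounds so the
    comparison is datetime-to-datetime, which works element-wise.
    """
    start = datetime.datetime.combine(day, datetime.time.min)  # 00:00:00
    end = start + datetime.timedelta(days=1)                   # next midnight
    col = arr[:, 0]
    mask = (col >= start) & (col < end)                        # [start, end)
    return arr[mask]


selected = rows_on_date(A, datetime.date(2002, 3, 17))
print(selected)
```

The helper only relies on datetime.datetime.combine and on boolean-mask indexing, so it should behave the same for any object array laid out like A.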

    from datetime import datetime as dt, timedelta as td
    import numpy as np

    # Create a 2-d object array of [datetime, value] rows
    d1 = dt.now()
    d2 = dt.now()
    d3 = dt.now() - td(1)
    d4 = dt.now() - td(1)
    d5 = d1 + td(1)
    arr = np.array([[d1, 1], [d2, 2], [d3, 3], [d4, 4], [d5, 5]], dtype=object)

    # Here we will extract all the data for today, so get the date range in datetime
    dtx = d1.replace(hour=0, minute=0, second=0, microsecond=0)
    dty = dtx + td(hours=24)

    # Condition
    cond = np.logical_and(arr[:, 0] >= dtx, arr[:, 0] < dty)

    # Full array
    print(arr)

    # Extracted array for the range
    print(arr[cond, :])
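As an alternative that neither answer uses, the date column can be converted to NumPy's own datetime64 type and truncated to whole days, which allows a direct equality test against a single date. This is a sketch under the assumption that all timestamps are naive (no time zone); the names stamps, days and mask are illustrative.

```python
import datetime
import numpy as np

# A shortened version of the question's array (illustration only)
A = np.array([
    [datetime.datetime(2002, 3, 14, 19, 57, 38), 197],
    [datetime.datetime(2002, 3, 17, 16, 31, 33), 237],
    [datetime.datetime(2002, 3, 17, 16, 47, 18), 238],
    [datetime.datetime(2002, 3, 17, 18, 29, 31), 239],
    [datetime.datetime(2002, 3, 17, 20, 10, 11), 240],
    [datetime.datetime(2002, 3, 18, 16, 18, 8), 252],
], dtype=object)

# Convert the datetime objects to datetime64 at microsecond resolution,
# then truncate to day resolution so only the calendar date remains.
stamps = np.array(list(A[:, 0]), dtype="datetime64[us]")
days = stamps.astype("datetime64[D]")

# Element-wise equality against one date gives a boolean mask
mask = days == np.datetime64("2002-03-17")
print(A[mask])
```

The original object array is still the one being indexed, so the selected rows keep their datetime.datetime values.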