Error loading dates with numpy loadtxt

I am trying to use numpy loadtxt to load a csv file into an array. But it looks like I can't load the date correctly.

The following shows what is happening. Did I do something wrong?

    >>> s = StringIO("05/21/2007,03:27")
    >>> np.loadtxt(s, delimiter=",", dtype={'names':('date','time'), 'formats':('datetime64[D]', 'datetime64[m]')})
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/lib/npyio.py", line 796, in loadtxt
        items = [conv(val) for (conv, val) in zip(converters, vals)]
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/lib/npyio.py", line 573, in <lambda>
        return lambda x: int(float(x))
    ValueError: invalid literal for float(): 05/21/2007
2 answers

You also need to add converters, for example:

    from matplotlib.dates import strpdate2num
    ...
    np.loadtxt(s, delimiter=",", converters={0:strpdate2num('%m/%d/%Y'), 1:...}, dtype= ...

When numpy sees your datetime64 dtype, it prepares an output column of type numpy.datetime64. In this version of numpy, numpy.datetime64 is a subclass of numpy.integer, so loadtxt prepares to parse the column as an integer using the following:

    def _getconv(dtype):
        typ = dtype.type
        if issubclass(typ, np.bool_):
            return lambda x: bool(int(x))
        if issubclass(typ, np.uint64):
            return np.uint64
        if issubclass(typ, np.int64):
            return np.int64
        if issubclass(typ, np.integer):
            return lambda x: int(float(x))
        ...

When it reaches the conversion step at line 796 of npyio.py:

 items = [conv(val) for (conv, val) in zip(converters, vals)] 

it tries to use lambda x: int(float(x)) to handle the input. In doing so, it tries to cast the date string (05/21/2007) to float and fails. The strpdate2num converter above turns the date into a numeric representation instead.
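The failure can be reproduced without numpy at all. The sketch below applies the same expression as the fallback lambda quoted from _getconv to a numeric string and to the date string from the question. The function name default_integer_converter is made up for illustration and is not part of numpy.

```python
def default_integer_converter(text):
    # Same expression as the fallback lambda quoted from _getconv above.
    return int(float(text))

# A plain numeric string converts without trouble.
numeric_result = default_integer_converter("42.0")

# The date string is not a valid float literal, so float() raises ValueError.
try:
    default_integer_converter("05/21/2007")
    date_failed = False
except ValueError:
    date_failed = True

print(numeric_result, date_failed)
```

Any entry in the converters dict replaces this fallback for its column, which is why supplying one avoids the error.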


MichealJCox's solution did not work for me. My version of numpy (1.8) did not accept the time number returned by strpdate2num('%m/%d/%Y'); it would only accept a date string or a datetime object. So I used a more involved converter that turns the date string into a time number and then into a datetime object that numpy can use:

    from matplotlib.dates import strpdate2num, num2date
    ...
    convert = lambda x: num2date(strpdate2num('%m/%d/%Y')(x))
    np.loadtxt(s, delimiter=",", converters={0:convert}, dtype= ...

This seems like a cumbersome solution.
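A less roundabout converter might parse the string directly with the standard library's datetime.strptime and skip matplotlib entirely. This is only a sketch, and it assumes Python 3 and a numpy recent enough to have loadtxt's encoding parameter, not the numpy 1.8 setup described above. The helper name parse_date and the second data row are made up for illustration, and only the date column is read.

```python
from datetime import datetime
from io import StringIO

import numpy as np


def parse_date(text):
    # Hypothetical helper: "05/21/2007" -> datetime.date(2007, 5, 21)
    return datetime.strptime(text, "%m/%d/%Y").date()


# First row is from the question; the second row is made up for illustration.
s = StringIO("05/21/2007,03:27\n05/22/2007,04:15")

dates = np.loadtxt(
    s,
    delimiter=",",
    usecols=(0,),                # read only the date column
    converters={0: parse_date},  # replaces the default int(float(x)) fallback
    dtype="datetime64[D]",
    encoding="utf-8",            # hand str, not bytes, to the converter
)
print(dates)
```

The time column could be handled the same way with its own converter, although a bare "03:27" has no date attached, so what it should map to is a separate decision.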
