How to specify the index dtype when reading a CSV file into a DataFrame?

In Python 3.4.3 and pandas 0.16, how do I specify the index dtype as str? Here is what I tried:

    In [1]: from io import StringIO
    In [2]: import pandas as pd
    In [3]: import numpy as np
    In [4]: fra = pd.read_csv(StringIO('date,close\n20140101,10.2\n20140102,10.5'), index_col=0, dtype={'date': np.str_, 'close': np.float})
    In [5]: fra.index
    Out[5]: Int64Index([20140101, 20140102], dtype='int64')
1 answer

It looks like the index_col=0 parameter takes precedence over the dtype parameter. If you drop index_col, you can call set_index afterwards:

    In [235]: fra = pd.read_csv(io.StringIO('date,close\n20140101,10.2\n20140102,10.5'), dtype={'date': np.str_, 'close': np.float})
              fra
    Out[235]:
           date  close
    0  20140101   10.2
    1  20140102   10.5

    In [236]: fra = fra.set_index('date')
              fra.index
    Out[236]: Index(['20140101', '20140102'], dtype='object')

Alternatively, omit the index_col parameter and chain set_index directly onto the DataFrame returned by read_csv, so it becomes a one-liner:

    In [237]: fra = pd.read_csv(io.StringIO('date,close\n20140101,10.2\n20140102,10.5'), dtype={'date': np.str_, 'close': np.float}).set_index('date')
              fra.index
    Out[237]: Index(['20140101', '20140102'], dtype='object')
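For readers on a current installation, the same read-then-set_index approach can be written as a standalone script. This is only a minimal sketch, assuming a recent pandas and NumPy. It uses the builtin `str` and `float` in place of `np.str_` and `np.float`, since the `np.float` alias was removed in NumPy 1.24. The variable name `csv_text` is introduced here for illustration.

```python
from io import StringIO

import pandas as pd

# Same sample data as in the question.
csv_text = "date,close\n20140101,10.2\n20140102,10.5"

# Read 'date' as text first (no index_col), then promote it to the index.
fra = pd.read_csv(
    StringIO(csv_text),
    dtype={"date": str, "close": float},
).set_index("date")

print(fra.index.tolist())  # ['20140101', '20140102']
```

Because the column is parsed as text before it becomes the index, values with leading zeros would also be kept intact.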

Update

This is a bug, and the fix is targeted for version 0.17.0.
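Since the behaviour depends on the installed version, one defensive option is to pass both index_col and dtype and then check what came back. The sketch below makes no claim about which pandas releases honor dtype for the index column. The helper name `read_csv_str_index` is hypothetical and not part of pandas.

```python
from io import StringIO

import pandas as pd


def read_csv_str_index(buf, index_name):
    """Hypothetical helper: read a CSV and make sure the index holds strings."""
    df = pd.read_csv(buf, index_col=index_name, dtype={index_name: str})
    # If this pandas version ignored dtype for the index column,
    # convert afterwards. Leading zeros are already lost in that case.
    if not all(isinstance(v, str) for v in df.index):
        df.index = df.index.astype(str)
    return df


fra = read_csv_str_index(
    StringIO("date,close\n20140101,10.2\n20140102,10.5"), "date"
)
print(fra.index.tolist())  # ['20140101', '20140102']
```

The fallback conversion is only safe for values like these dates. For identifiers that may start with zeros, reading the column as text and calling set_index afterwards is the more reliable route.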
