In Pandas, how do I use fillna to fill an entire column with a string if the column is initially empty?

My table:

    In [15]: csv = u"""a,a,,a
       ....: b,b,,b
       ....: c,c,,c
       ....: """

    In [18]: df = pd.read_csv(io.StringIO(csv), header=None)

I want to fill the blank column with "UNKNOWN":

    In [19]: df
    Out[19]:
       0  1   2  3
    0  a  a NaN  a
    1  b  b NaN  b
    2  c  c NaN  c

    In [20]: df.fillna({2: 'UNKNOWN'})

Error received

 ValueError: could not convert string to float: UNKNOWN 
2 answers

Column 2 probably has a float type:

    >>> df
       0  1   2  3
    0  a  a NaN  a
    1  b  b NaN  b
    2  c  c NaN  c
    >>> df.dtypes
    0     object
    1     object
    2    float64
    3     object
    dtype: object

Hence the problem. If you don't mind converting the entire frame to object, you can:

    >>> df.astype(object).fillna("UNKNOWN")
       0  1        2  3
    0  a  a  UNKNOWN  a
    1  b  b  UNKNOWN  b
    2  c  c  UNKNOWN  c

Depending on whether there is non-string data, you may want to be more selective about which columns you convert, and/or specify the dtypes when reading, but the above should work anyway.
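The option of specifying dtypes when reading can be sketched as follows. This is a minimal example, assuming the same four-column CSV as in the question and that only column 2 needs to hold strings; the `dtype={2: object}` mapping is an illustration, not something taken from the original answer.

```python
import io

import pandas as pd

csv = u"""a,a,,a
b,b,,b
c,c,,c
"""

# Read column 2 as object so it can hold strings later on;
# the other columns keep whatever dtype pandas infers.
df = pd.read_csv(io.StringIO(csv), header=None, dtype={2: object})

# The column is still all NaN, but it is no longer float64,
# so filling it with a string needs no float conversion.
filled = df.fillna({2: "UNKNOWN"})
print(filled)
```

Because only column 2 is mapped, any numeric columns elsewhere in the file would keep their inferred dtypes.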


Update: if you have dtype information that you want to preserve, then rather than converting everything to object and switching it back, I would go the other way and fill only the columns you want, for example using a loop with fillna:

    >>> df
       0  1  2   3  4   5
    0  0  a  a NaN  a NaN
    1  1  b  b NaN  b NaN
    2  2  c  c NaN  c NaN
    >>> df.dtypes
    0      int64
    1     object
    2     object
    3    float64
    4     object
    5    float64
    dtype: object
    >>> for col in df.columns[pd.isnull(df).all()]:
    ...     df[col] = df[col].astype(object).fillna("UNKNOWN")
    ...
    >>> df
       0  1  2        3  4        5
    0  0  a  a  UNKNOWN  a  UNKNOWN
    1  1  b  b  UNKNOWN  b  UNKNOWN
    2  2  c  c  UNKNOWN  c  UNKNOWN
    >>> df.dtypes
    0     int64
    1    object
    2    object
    3    object
    4    object
    5    object
    dtype: object

Or (if you are using all anyway), you might not even need fillna:

    >>> df
       0  1  2   3  4   5
    0  0  a  a NaN  a NaN
    1  1  b  b NaN  b NaN
    2  2  c  c NaN  c NaN
    >>> # .ix was removed in pandas 1.0; .loc takes the same boolean column mask
    >>> df.loc[:, pd.isnull(df).all()] = "UNKNOWN"
    >>> df
       0  1  2        3  4        5
    0  0  a  a  UNKNOWN  a  UNKNOWN
    1  1  b  b  UNKNOWN  b  UNKNOWN
    2  2  c  c  UNKNOWN  c  UNKNOWN
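The idea of filling only the all-empty columns can also be wrapped in a small helper. This is a sketch under assumptions: `fill_empty_columns` is a hypothetical name, and it uses plain column assignment (`out[col] = value`) instead of `fillna`, so each empty column is replaced outright rather than converted in place.

```python
import io

import pandas as pd


def fill_empty_columns(df, value="UNKNOWN"):
    """Return a copy of df in which every all-NaN column holds `value`.

    Hypothetical helper for illustration. Columns that contain any
    non-missing entry are left untouched, so their dtypes are preserved.
    """
    out = df.copy()
    empty = out.columns[out.isnull().all()]
    for col in empty:
        # Assigning a scalar replaces the whole column, dtype included.
        out[col] = value
    return out


# Six-field rows: an integer column, string columns, and two empty columns.
csv = u"""0,a,a,,a,
1,b,b,,b,
2,c,c,,c,
"""
df = pd.read_csv(io.StringIO(csv), header=None)
result = fill_empty_columns(df)
print(result)
print(result.dtypes)
```

Working on a copy keeps the original frame unchanged, which makes it easy to compare dtypes before and after.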

As a workaround, just set the column directly. The upcast in fillna should work; that it does not is a bug:

    In [8]: df = pd.read_csv(io.StringIO(csv), header=None)

    In [9]: df
    Out[9]:
       0  1   2  3
    0  a  a NaN  a
    1  b  b NaN  b
    2  c  c NaN  c

    In [10]: df.loc[:,2] = 'foo'

    In [11]: df
    Out[11]:
       0  1    2  3
    0  a  a  foo  a
    1  b  b  foo  b
    2  c  c  foo  c

    In [12]: df.dtypes
    Out[12]:
    0    object
    1    object
    2    object
    3    object
    dtype: object