Efficiently check if an arbitrary object is NaN in Python / numpy / pandas?

My numpy arrays use np.nan to indicate missing values. As I iterate over the data set, I need to detect such missing values and handle them specially.

Naively, I used numpy.isnan(val), which works well as long as val is one of the types supported by numpy.isnan(). But missing data can also occur in string fields, in which case I get:

 >>> np.isnan('some_string')
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 TypeError: Not implemented for this type

Besides writing an expensive wrapper that catches an exception and returns False , is there a way to handle this gracefully and efficiently?
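For reference, the exception-catching wrapper mentioned above might look like this (a minimal sketch; the function name is mine):

```python
import numpy as np

def is_nan(val):
    """Return True if val is a NaN, False for any non-numeric value.

    Catches the TypeError that numpy.isnan raises for unsupported
    types (e.g. strings), at the cost of exception-handling overhead.
    """
    try:
        return bool(np.isnan(val))
    except TypeError:
        return False
```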

+76
python numpy pandas
Sep 08 '13 at 23:05
2 answers

pandas.isnull() (also pd.isna() in newer versions) checks for missing values in both numeric and string/object arrays. From the documentation, it detects:

NaN in numeric arrays, None/NaN in object arrays

Quick example:

 import pandas as pd
 import numpy as np

 s = pd.Series(['apple', np.nan, 'banana'])
 pd.isnull(s)
 Out[9]:
 0    False
 1     True
 2    False
 dtype: bool

The idea of using numpy.nan to represent missing values is something pandas adopted from the start, so pandas has the tools to handle it.
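Since the question iterates over individual values, it is worth noting that pd.isnull() also accepts scalars, so it can be called on one value at a time (a quick sketch):

```python
import numpy as np
import pandas as pd

# pd.isnull handles scalars of any type, not just arrays/Series
print(pd.isnull(np.nan))         # True
print(pd.isnull(None))           # True
print(pd.isnull('some_string'))  # False
print(pd.isnull(1.5))            # False
```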

Datetime example (if you use pd.NaT you do not need to specify the dtype):

 In [24]: s = Series([Timestamp('20130101'), np.nan, Timestamp('20130102 9:30')], dtype='M8[ns]')

 In [25]: s
 Out[25]:
 0   2013-01-01 00:00:00
 1                   NaT
 2   2013-01-02 09:30:00
 dtype: datetime64[ns]

 In [26]: pd.isnull(s)
 Out[26]:
 0    False
 1     True
 2    False
 dtype: bool
+130
Sep 08 '13 at 23:33

Is your type really arbitrary? If you know it is just an int, float, or string, you could just do:

 if val.dtype == float and np.isnan(val):

Assuming it is wrapped in numpy, it will always have a dtype, and only the float and complex dtypes can be NaN.
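Under that assumption, the check can be packaged as a small helper (the function name is hypothetical; dtype.kind is 'f' for floats and 'c' for complex):

```python
import numpy as np

def is_missing(val):
    """True only for NaN values in numpy float/complex scalars.

    The dtype check short-circuits before np.isnan is called,
    so string and integer scalars never raise TypeError.
    """
    return val.dtype.kind in 'fc' and bool(np.isnan(val))
```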

+14
Sep 08 '13 at 23:15


