Select rows from a DataFrame based on the presence of a null value in a specific column or columns

I have an xls file imported as a pandas DataFrame. Two of its columns contain the coordinates that I will use to merge the DataFrame with others that have geolocation data. df.info() shows 8859 records, but for the coordinate columns it reports "8835 non-null float64".
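
A quick way to confirm how many values are missing per column is to sum the null flags. This is a minimal sketch; the file name is hypothetical, and 'Northing' is an assumed name for the second coordinate column:

import pandas as pd

df = pd.read_excel('data.xls')           # hypothetical file name
print(df['Easting'].isnull().sum())      # 8859 - 8835 = 24 missing
print(df['Northing'].isnull().sum())     # 'Northing' is an assumed column name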

I want to view those 24 rows (which I assume are the nulls: 8859 - 8835 = 24), with all of their column entries, to see whether one of the other columns (street address, city) can be used to manually add coordinates for these 24 entries. That is, return the rows of df where df['Easting'] is null/NaN.

I adapted the method given here as shown below:

df.loc[df['Easting'] == NaN]

But this returns an empty DataFrame (0 rows × 24 columns), which does not make sense to me. Trying to compare against Null or NotNull does not work either, because those names are undefined. What am I missing?

1 answer

I think you need isnull to check for the NaN values, with boolean indexing:

df[df['Easting'].isnull()]

Docs:

Warning

One has to be mindful that in Python (and NumPy), the nan's don't compare equal, but None's do. Note that pandas/NumPy uses the fact that np.nan != np.nan, and treats None like np.nan.

In [11]: None == None
Out[11]: True

In [12]: np.nan == np.nan
Out[12]: False

So, as compared to above, a scalar equality comparison versus a None/np.nan doesn't provide useful information.

In [13]: df2['one'] == np.nan
Out[13]: 
a    False
b    False
c    False
d    False
e    False
f    False
g    False
h    False
Name: one, dtype: bool
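
To pull just the incomplete rows with the columns useful for manually filling in coordinates, something like the following should work. A sketch only: the address column names and 'Northing' are assumptions, not from the question:

missing = df[df['Easting'].isnull()]              # all 24 rows, every column
print(missing[['Street', 'City']])                # hypothetical address columns
# If either coordinate can be missing independently of the other:
df[df['Easting'].isnull() | df['Northing'].isnull()]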