Replacing missing values encoded with "." in an R data frame

I have a data frame with missing values encoded as ".", and I want to recode those values as NA:

df <- data.frame("h"=c(1,1,"."))

I tried the following:

df$h[df$h == "."] <- NA

But the NA is displayed as <NA>, and I cannot run commands like mean(df$h,rm.na=TRUE).

Does anyone know what the problem is? When I recode numbers as NA, there is no problem.

Thanks!

+2
3 answers

Use the assignment form of the is.na function. There is no need to convert the column first, although the presence of the character value "." is what kept the column from being numeric.

 > df <- data.frame("h"=c(1,1,"."))
 > is.na(df) <- df=="."
 > df
      h
 1    1
 2    1
 3 <NA>

I'm not sure why @TylerRinker deleted his answer regarding the use of "na.strings" since I thought this was the correct answer.
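For data that is read from a file, the "na.strings" approach mentioned above might look like the sketch below. It assumes the values arrive as CSV text; the inline text and the name raw are made up for illustration, while na.strings itself is a documented argument of read.table and read.csv.

```r
# Hypothetical input: a one-column CSV in which "." marks a missing value.
raw <- "h\n1\n1\n.\n"

# na.strings tells read.csv which strings to treat as NA while parsing,
# so the column is read as numeric rather than as text.
df <- read.csv(text = raw, na.strings = ".")

str(df)
mean(df$h, na.rm = TRUE)
```

Handling the "." at import time avoids any recoding afterwards, which is probably why it was suggested as an answer.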

Comment: looking at this a year later, I realized that a) the OP misunderstood how missing values are displayed when they are in factors or character vectors, and b) the main problem was not an error in recoding to R's missing value, which the OP's code had already done correctly, but rather the error that @joran identified.
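Both points can be checked in a few lines. This is only a sketch: the names as_text and as_number are made up, and whether the misspelled argument in the question (rm.na instead of na.rm) is the error @joran pointed out is an assumption, since it is not stated here.

```r
# The same three values stored as text and as numbers.
as_text   <- data.frame(h = c("1", "1", NA))
as_number <- data.frame(h = c(1, 1, NA))

print(as_text)    # the missing entry prints as <NA>
print(as_number)  # the missing entry prints as NA

# Both are genuine missing values as far as is.na() is concerned.
is.na(as_text$h)
is.na(as_number$h)

# mean() takes na.rm; a misspelled rm.na is swallowed by `...` and ignored.
mean(as_number$h, rm.na = TRUE)  # still NA
mean(as_number$h, na.rm = TRUE)  # NA is dropped before averaging
```

So <NA> is only a display convention for non-numeric columns, not a sign that the recoding failed.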

+6

The problem is that your df$h column is a factor. First convert it to character, and then replace the "." values:

 df$h <- as.character(df$h)
 df$h[df$h == "."] <- NA

Here you see the result:

 df[is.na(df$h),] 

Of course, once you have gotten rid of the dots, you can convert the column to a numeric variable if you want to calculate with it:

 df$h <- as.numeric(df$h) 
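The detour through as.character matters when the column really is a factor. A minimal sketch of the difference, assuming a factor column (stringsAsFactors = TRUE is set explicitly, because newer R versions create character columns by default); the names codes and values are made up for illustration:

```r
# A factor column like the one discussed in this answer.
df <- data.frame(h = c(1, 1, "."), stringsAsFactors = TRUE)

# as.numeric() on a factor returns the internal integer level codes,
# so "." silently becomes a number instead of a missing value.
codes <- as.numeric(df$h)

# Going through character converts the labels themselves; "." cannot be
# parsed and becomes NA (the coercion warning is suppressed here).
values <- suppressWarnings(as.numeric(as.character(df$h)))

codes
values
mean(values, na.rm = TRUE)
```

With the labels converted rather than the codes, numeric summaries such as mean() behave as expected once na.rm = TRUE is set.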
+3

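Yes, that's right, the column is a factor. Convert it to numeric first, going through character so that the labels rather than the internal level codes are converted (the "." entries become NA, with a coercion warning):

 df <- transform(df, h=as.numeric(as.character(h)))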

Then replace the missing values with zero:

 df$h[is.na(df$h)] <- 0

and then view the data:

 View(df)
0