Replace blanks in a dataset with a value in R

Apologies, I thought there would be a very obvious answer to this, but I cannot find anything online.

I often get very large datasets in which missing values are simply left blank, for example (abbreviated):

    #Some description of the dataset
    #cover x number of lines
    31   3213 313   64    63
    31   3213 313   64    63
    31   3213 313   64    63
    31   3213 313   64    63
    31   3213 313   64    63
    12   178        190   865
    532  31   6164  68
    614       131   864   808

I would like to replace all the blanks with a value, e.g. -999. If I use read.table like this:

 dat = read.table('file.txt',skip=2) 

I get an error

 Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : line 6 did not have 5 elements 

I could open the file as a data frame and do

    dat = data.frame('file.txt',skip=2)
    is.na(rad1) = which(rad1 == '')

but I don’t know if this will work, because I don’t know how to skip the top 2 lines when reading the data frame (for example, the equivalent of “skip”), and I could not find the answer anywhere. Can anyone help?

Thanks.

+4
1 answer

If you know the width of each column, you can use read.fwf, which reads fixed-width files and turns blank fields into NA. For example:

    > dat <- read.fwf('temp.txt', skip=2, widths=c(5,5,6,6,6))
    > dat
       V1   V2   V3  V4  V5
    1  31 3213  313  64  63
    2  31 3213  313  64  63
    3  31 3213  313  64  63
    4  31 3213  313  64  63
    5  31 3213  313  64  63
    6  12  178   NA 190 865
    7 532   31 6164  68  NA
    8 614   NA  131 864 808
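To try this without a file on disk, the following sketch writes a few sample rows to a temporary file and reads them back. The row values and the left-aligned 5/5/6/6/6 column layout are assumptions reconstructed from the output above, and the names `rows`, `fmt` and `tmp` are only illustrative; adjust `widths` to match your real file.

```r
# Sample rows; "" marks a blank (missing) field. Values are illustrative.
rows <- list(
  c("31",  "3213", "313",  "64",  "63"),
  c("12",  "178",  "",     "190", "865"),
  c("532", "31",   "6164", "68",  ""),
  c("614", "",     "131",  "864", "808")
)

# Pad every field to its assumed column width (5, 5, 6, 6, 6 characters).
fmt <- "%-5s%-5s%-6s%-6s%-6s"
body <- vapply(
  rows,
  function(r) do.call(sprintf, c(list(fmt), as.list(r))),
  character(1)
)

# Two header lines, as in the question, followed by the data lines.
tmp <- tempfile(fileext = ".txt")
writeLines(c("#Some description of the dataset",
             "#cover x number of lines",
             body), tmp)

# skip = 2 drops the header lines; blank fields come back as NA.
dat <- read.fwf(tmp, skip = 2, widths = c(5, 5, 6, 6, 6))
print(dat)
```

Because the fields are cut by character position rather than by whitespace, a blank field stays in its own column instead of shifting the remaining values to the left, which is what caused the read.table error.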

Although it is easy to replace NA values with whatever value you want, it is generally a bad idea, because R has many good ways of handling NA values.

For example, to take the average of column two, use:

    > mean(dat$V2, na.rm=TRUE)
    [1] 2324.857

R has other functions for working with missing data. For example, you can use na.omit() to completely delete rows with missing data.

    > na.omit(dat)
      V1   V2  V3 V4 V5
    1 31 3213 313 64 63
    2 31 3213 313 64 63
    3 31 3213 313 64 63
    4 31 3213 313 64 63
    5 31 3213 313 64 63
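If a sentinel such as -999 is still required, as the question asks (for instance because a downstream program expects it), one common idiom is to index with `is.na()`. This is a minimal sketch on a small hypothetical data frame, and the name `filled` is only illustrative.

```r
# Small hypothetical data frame with missing values.
dat <- data.frame(
  V1 = c(31, 12, 532),
  V2 = c(3213, 178, NA),
  V3 = c(313, NA, 6164)
)

# Work on a copy so the NA-aware original stays available for analysis.
filled <- dat
filled[is.na(filled)] <- -999

print(filled)
```

Keep in mind that after this step functions such as `mean()` will treat -999 as real data, so it is usually safer to do the replacement only when exporting and to keep the `NA` version for calculations.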
+14