Removing both rows and columns that contain only NA and 0 values

I have the following data frame (s):

    s <- read.table(text = "V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
    1 0 62 64 44 NA 55 81 66 57 53
    2 0 0 65 50 NA 56 79 69 52 55
    3 0 0 0 57 NA 62 84 76 65 59
    4 0 0 0 0 NA 30 70 61 41 36
    5 0 0 0 0 NA NA NA NA NA NA
    6 0 0 0 0 0 0 66 63 51 44
    7 0 0 0 0 0 0 0 80 72 72
    8 0 0 0 0 0 0 0 0 68 64
    9 0 0 0 0 0 0 0 0 0 47
    10 0 0 0 0 0 0 0 0 0 0
    ", header = TRUE)

As you can see, row 5 and column 5 contain only NA and 0 values. I would like to drop them while keeping the order of the remaining rows and columns. There may be more than one such row and column in the same data, and I would like them handled the same way. The size of the data block can also vary. The end result should be:

       V1 V2 V3 V4 V6 V7 V8 V9 V10
    1   0 62 64 44 55 81 66 57  53
    2   0  0 65 50 56 79 69 52  55
    3   0  0  0 57 62 84 76 65  59
    4   0  0  0  0 30 70 61 41  36
    6   0  0  0  0  0 66 63 51  44
    7   0  0  0  0  0  0 80 72  72
    8   0  0  0  0  0  0  0 68  64
    9   0  0  0  0  0  0  0  0  47
    10  0  0  0  0  0  0  0  0   0

Is there also a way to get the index of the removed row and column (in this case 5)?

4 answers

We can try

    # Columns: count the NAs and the zeros
    v1 <- colSums(is.na(s))
    v2 <- colSums(s == 0, na.rm = TRUE)
    # Keep a column unless it holds only NAs and zeros, with at least one of each
    j1 <- !(v1 > 0 & (v1 + v2) == nrow(s) & v2 > 0)

    # Rows: same test
    v3 <- rowSums(is.na(s))
    v4 <- rowSums(s == 0, na.rm = TRUE)
    i1 <- !(v3 > 0 & (v3 + v4) == ncol(s) & v4 > 0)

    s[i1, j1]
    #    V1 V2 V3 V4 V6 V7 V8 V9 V10
    # 1   0 62 64 44 55 81 66 57  53
    # 2   0  0 65 50 56 79 69 52  55
    # 3   0  0  0 57 62 84 76 65  59
    # 4   0  0  0  0 30 70 61 41  36
    # 6   0  0  0  0  0 66 63 51  44
    # 7   0  0  0  0  0  0 80 72  72
    # 8   0  0  0  0  0  0  0 68  64
    # 9   0  0  0  0  0  0  0  0  47
    # 10  0  0  0  0  0  0  0  0   0

Suppose we change one of the values in 's':

    s$V7[3] <- NA

Running the code above again then gives:

    #    V1 V2 V3 V4 V6 V7 V8 V9 V10
    # 1   0 62 64 44 55 81 66 57  53
    # 2   0  0 65 50 56 79 69 52  55
    # 3   0  0  0 57 62 NA 76 65  59
    # 4   0  0  0  0 30 70 61 41  36
    # 6   0  0  0  0  0 66 63 51  44
    # 7   0  0  0  0  0  0 80 72  72
    # 8   0  0  0  0  0  0  0 68  64
    # 9   0  0  0  0  0  0  0  0  47
    # 10  0  0  0  0  0  0  0  0   0

NOTE: This follows the OP's condition, "include only the values NA and 0. I would like to omit them", so a column such as V7, which now has an NA next to non-zero values, is kept.
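The question also asks for the position of the removed row and column. A minimal sketch of one way to get it, run on the original s (before the V7 change): pass the same kind of test to which(). The names na_or_zero, has_na, drop_rows, drop_cols and s_clean are introduced here for illustration only, and the rule assumed is "every cell is NA or 0, and at least one cell is NA".

```r
# Rebuild the example data from the question
s <- read.table(text = "V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
1 0 62 64 44 NA 55 81 66 57 53
2 0 0 65 50 NA 56 79 69 52 55
3 0 0 0 57 NA 62 84 76 65 59
4 0 0 0 0 NA 30 70 61 41 36
5 0 0 0 0 NA NA NA NA NA NA
6 0 0 0 0 0 0 66 63 51 44
7 0 0 0 0 0 0 0 80 72 72
8 0 0 0 0 0 0 0 0 68 64
9 0 0 0 0 0 0 0 0 0 47
10 0 0 0 0 0 0 0 0 0 0
", header = TRUE)

na_or_zero <- is.na(s) | s == 0   # TRUE where a cell is NA or 0
has_na     <- is.na(s)            # TRUE where a cell is NA

# Assumed rule: drop when every cell is NA or 0 and at least one is NA
drop_rows <- which(rowSums(na_or_zero) == ncol(s) & rowSums(has_na) > 0)
drop_cols <- which(colSums(na_or_zero) == nrow(s) & colSums(has_na) > 0)

drop_rows   # positions of the rows to remove (named by row name)
drop_cols   # positions of the columns to remove (named by column name)

# setdiff() is used instead of negative indexing so that an empty
# drop_rows or drop_cols keeps everything rather than selecting nothing
s_clean <- s[setdiff(seq_len(nrow(s)), drop_rows),
             setdiff(seq_len(ncol(s)), drop_cols),
             drop = FALSE]
```

Because row 10 and column V1 contain no NA, they are not flagged, which matches the result requested in the question.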


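You need to define more precisely when a row or column should be dropped. In this case, the data looks like a triangular matrix whose diagonal is always 0.

In general, though, this is what I use:

    # drop rows and columns that contain more than one NA
    s[!(rowSums(is.na(s)) > 1), !(colSums(is.na(s)) > 1)]

Taking the zeros into account as well:

    # the threshold 9 is specific to this 10 x 10 example;
    # note that this also drops rows and columns that are entirely 0 (row 10 and V1 here)
    s[!(rowSums(is.na(s) | s == 0) > 9), !(colSums(is.na(s) | s == 0) > 9)]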

I was going to offer:

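    # keep a row/column unless all of its cells are NA or 0;
    # rows/columns that are entirely 0 (no NA) are kept as well
    sclean <- s[
      rowSums(s == 0 | is.na(s)) != ncol(s) | rowSums(s == 0, na.rm = TRUE) == ncol(s),
      colSums(s == 0 | is.na(s)) != nrow(s) | colSums(s == 0, na.rm = TRUE) == nrow(s)
    ]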

You can try the following:

    # number of cells per row / per column that are NA or 0
    myRowSums <- rowSums(is.na(s) | s == 0)
    myColSums <- colSums(is.na(s) | s == 0)
    # keep only rows and columns that are not entirely NA or 0
    sSmall <- s[which(myRowSums != ncol(s)), which(myColSums != nrow(s))]

It works on the following data set, removing all rows and columns that consist entirely of 0 and NA values.

    s <- data.frame(a = c(0, rnorm(5), 0),
                    b = c(0, rnorm(2), NA, NA, 1, NA),
                    c = c(rep(c(0, NA), 3), 0))
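To see what that does on this data set, here is a small self-contained check. The set.seed() call is an addition made here only so the random values are reproducible; everything else repeats the code above.

```r
set.seed(1)  # added only for reproducibility, not part of the original example
s <- data.frame(a = c(0, rnorm(5), 0),
                b = c(0, rnorm(2), NA, NA, 1, NA),
                c = c(rep(c(0, NA), 3), 0))

myRowSums <- rowSums(is.na(s) | s == 0)
myColSums <- colSums(is.na(s) | s == 0)
sSmall <- s[which(myRowSums != ncol(s)), which(myColSums != nrow(s))]

rownames(sSmall)  # rows that survive
names(sSmall)     # columns that survive
```

Rows 1 and 7 and column c are dropped, since each of them holds nothing but 0 and NA. Note that row 1 is all zeros with no NA, so with this test an all-zero row or column is removed too.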