What is the best way to collapse two factors with NA into one variable?

I have many sets of variables like this:

 Var1    Var2
 "Asian" NA
 NA      "Black"
 "White" NA

I would like to conveniently get them in this form:

 Race
 "Asian"
 "Black"
 "White"

I am trying something like:

 Race <- ifelse(is.na(Var1), Var2, Var1) 

But this converts the values into the integer codes of the factor levels, and the codes from the two variables do not correspond (for example, giving 1, 1, 2). Is there a convenient way to do this (ideally with short, clear code)? (You can work around it with as.character, but there should be a better way.)

3 answers

With an intermediate conversion via as.character, assuming this is your data:

 dat <- data.frame(Var1=c("Asian",NA,"White"),Var2=c(NA,"Black",NA))
 do.call(pmax,c(lapply(dat,as.character),na.rm=TRUE))
 #[1] "Asian" "Black" "White"

If you need to work with a specific subset, you can do:

 do.call(pmax,c(lapply(dat[c("Var1","Var2")],as.character),na.rm=TRUE)) 

An alternative not requiring as.character would be:

 dat[cbind(1:nrow(dat),max.col(!is.na(dat)))]
 #[1] "Asian" "Black" "White"
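
If the result should end up as a factor again, the as.character route can be wrapped in a small helper. The following is only a sketch: `coalesce_factors` is a hypothetical name, not something from the question or from base R, and it assumes each row has at most one non-NA value that matters (the first non-NA wins). It also reproduces the level-code problem from the question for comparison.

```r
# Hypothetical helper (an assumption for illustration, not from the answers):
# take the first non-NA value per position across several factor/character vectors.
coalesce_factors <- function(...) {
  cols <- lapply(list(...), as.character)   # work on labels, not integer level codes
  out <- Reduce(function(a, b) ifelse(is.na(a), b, a), cols)
  factor(out)                               # levels are rebuilt from the merged labels
}

dat <- data.frame(Var1 = factor(c("Asian", NA, "White")),
                  Var2 = factor(c(NA, "Black", NA)))

# The attempt from the question: ifelse() drops the factor attributes, so the
# result holds level codes (Var1: Asian=1, White=2; Var2: Black=1).
codes <- ifelse(is.na(dat$Var1), dat$Var2, dat$Var1)
codes
#[1] 1 1 2

Race <- coalesce_factors(dat$Var1, dat$Var2)
as.character(Race)
#[1] "Asian" "Black" "White"
```

Because the helper accepts any number of vectors, it can be reused for each of the "many sets of variables" mentioned in the question.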

How about this solution?

 ind <- apply(df, 1, function(x) which(!is.na(x)))
 df[cbind(seq_along(ind), ind)]
 #[1] "Asian" "Black" "White"

Another solution (a rather odd one, I admit, but quite short; your columns need to be character, as they appear to be in your example):

 > library(tidyr)
 > unite(replace(df, is.na(df), ""), V, c(Var1, Var2), sep='')$V
 #[1] "Asian" "Black" "White"

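Or use gsub, although this may be risky: here unite pastes each NA into the result as the literal string "NA", which gsub then strips out, so a real value containing "NA" would be altered as well: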

 > gsub("NA", "", unite(df, V, c(Var1, Var2), sep='')$V)
 #[1] "Asian" "Black" "White"
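
To make the gsub risk concrete, here is a small base-R sketch. It uses paste0 as a stand-in for unite (both turn NA into the text "NA" when pasting), and the value "NATO" is made-up illustration data, not from the question.

```r
# Made-up data: one real value happens to contain the letters "NA".
v1 <- c("NATO", NA)
v2 <- c(NA, "Black")

# Pasting turns NA into the string "NA"; gsub then removes every "NA",
# including the one inside the real value.
risky <- gsub("NA", "", paste0(v1, v2))
risky
#[1] "TO"    "Black"

# Replacing NA with "" before pasting leaves real values untouched.
safe <- paste0(ifelse(is.na(v1), "", v1), ifelse(is.na(v2), "", v2))
safe
#[1] "NATO"  "Black"
```

This is the same idea as the replace(df, is.na(df), "") variant above, just without tidyr.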

Source: https://habr.com/ru/post/1210876/

