Fill a column using if statements in R

I have a pretty simple question I'm struggling with right now. If I have an example dataframe:

    a <- c(1:5)
    b <- c(1,3,5,9,11)
    df1 <- data.frame(a,b)

How can I create a new column ('c') that is populated using if statements on column b? For example:

- "cat" for those values in b that are equal to 1 or 2
- "dog" for those values in b that are between 3 and 5
- "rabbit" for those values in b that are greater than 6

So column "c" using dataframe df1 will read: cat, dog, dog, rabbit, rabbit.

Thank you very much in advance.

3 answers
    df1$c <- c("cat", "dog", "rabbit")[findInterval(df1$b, c(1, 2.5, 5.5, Inf))]

The findInterval approach will be much faster than nested ifelse strategies, and I expect it to be much faster than a function that loops over a series of unnecessary if statements. Those of us who work with large data sets notice the difference when we choose inefficient algorithms.
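To see why indexing with findInterval works, it can help to look at the intermediate result. The following is a minimal sketch using the question's data; the names `breaks`, `idx`, `animal_labels` and `result` are illustrative and not part of the original answer.

```r
# Example data from the question
b <- c(1, 3, 5, 9, 11)

# Break points define the intervals [1, 2.5), [2.5, 5.5) and [5.5, Inf)
breaks <- c(1, 2.5, 5.5, Inf)

# findInterval() returns, for each value, the number of the interval it falls in
idx <- findInterval(b, breaks)
print(idx)

# The interval numbers are then used to index a vector of labels
animal_labels <- c("cat", "dog", "rabbit")
result <- animal_labels[idx]
print(result)
```

One caveat worth checking on real data: a value below the first break gets interval number 0, and indexing with 0 silently drops that element, so the result would come out shorter than the input.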

This does not literally address the request, but I don't think new R users always know the most expressive or efficient approach to a problem. The request to “use IF” sounded like an attempt to translate coding approaches specific to the two main macro-statistical processors, SPSS and SAS. The R if control structure is usually not an effective way to recode a column, because its condition is not vectorized: older versions of R evaluated only the first element of the condition, and R 4.2.0 and later raise an error when the condition has length greater than one. By itself, if does not process the column, while the ifelse function does. The cut function could be used here (with appropriate breaks and labels arguments), although it would return a factor rather than a character vector. The findInterval approach was chosen for its ability to return multiple levels, which a single ifelse cannot do. I think ifelse chaining or nesting quickly gets ugly and confusing after about two or three levels.
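The cut alternative mentioned above could look roughly like this. It is a sketch only: the break points are borrowed from the findInterval answer, and `right = FALSE` is an assumption made here so that the intervals are closed on the left, matching findInterval's behavior.

```r
# Example data from the question
b <- c(1, 3, 5, 9, 11)

# right = FALSE gives the intervals [1, 2.5), [2.5, 5.5) and [5.5, Inf)
f <- cut(b,
         breaks = c(1, 2.5, 5.5, Inf),
         labels = c("cat", "dog", "rabbit"),
         right = FALSE)

# cut() returns a factor, not a character vector
print(class(f))

# Convert explicitly if plain strings are wanted
ch <- as.character(f)
print(ch)
```

Whether the factor or the character version is preferable depends on what the column is used for afterwards. Factors are convenient for modeling and grouping, while plain strings are easier to edit.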

 df1 <- transform( df1 , c = ifelse( b %in% 1:2 , 'cat' , ifelse( b %in% 3:5 , 'dog' , 'rabbit' ) ) ) 

Although ifelse() is useful, it sometimes does not return what one would intuitively expect. Therefore, I prefer to write the conditions out in a function:

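    a <- c(1:5)
    b <- c(1,3,5,9,11)
    df1 <- data.frame(a,b)

    # Classify a single value; if() tests one value at a time, so use || and &&
    species <- function(x) {
      y <- NA_character_   # fallback for values that no rule covers (e.g. 6)
      if (x == 1 || x == 2) y <- "cat"
      if (x > 2 && x < 6) y <- "dog"
      if (x > 6) y <- "rabbit"
      return(y)
    }

    # Apply the function to each element of column b
    df1$c <- sapply(df1$b, species)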
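The answer does not say which ifelse() behavior it finds unintuitive. As an assumption about what may be meant, here is a small sketch of two commonly cited surprises: the result takes its length from the test rather than from the values, and attributes such as the Date class are dropped. The variable names are illustrative.

```r
# The result has the length of the test (1 here), so "dog" is never returned
first_only <- ifelse(TRUE, c("cat", "dog"), "rabbit")
print(first_only)

# Attributes are not carried over: a Date comes back as a plain number
d <- as.Date("2020-01-01")
x <- ifelse(TRUE, d, d + 1)
print(class(x))
```

Neither case affects the recoding in the question, where the test, the values and the result are all plain vectors of matching length.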
