Fill a column using if statements in R

Question

I have a pretty simple question I'm struggling with right now. If I have an example dataframe:

 a <- c(1:5)
 b <- c(1,3,5,9,11)
 df1 <- data.frame(a,b)

How can I create a new column ('c') that is populated by applying if statements to column b? For example:

- "cat" for those values in b that are equal to 1 or 2
- "dog" for those values in b that are between 3 and 5
- "rabbit" for those values in b that are greater than 6

So column "c" using dataframe df1 will read: cat, dog, dog, rabbit, rabbit.

Thank you very much in advance.

+4

r

KT_1 Dec 02 '12 at 19:17

3 answers

42- · Answer 1 · 2012-12-02T21:54:35+0000

 df1$c <- c("cat", "dog", "rabbit")[ findInterval(df1$b, c(1, 2.5, 5.5, Inf)) ]

The findInterval approach will be much faster than nested ifelse strategies, and I assume it is also much faster than a function that steps through unnecessary if statements for every element. Those of us who work with big data notice the difference when we choose inefficient algorithms.
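To see how the findInterval approach recodes the column, it can help to look at the intermediate indices. This is a minimal sketch using the df1 from the question; the names breaks, labels and idx are only illustrative and do not appear in the original answer.

```r
# Data frame from the question
df1 <- data.frame(a = 1:5, b = c(1, 3, 5, 9, 11))

# Interval boundaries: [1, 2.5) -> 1, [2.5, 5.5) -> 2, [5.5, Inf) -> 3
breaks <- c(1, 2.5, 5.5, Inf)
labels <- c("cat", "dog", "rabbit")

# findInterval() returns, for each value of b, the index of its interval
idx <- findInterval(df1$b, breaks)

# Indexing the label vector by idx recodes the whole column in one step
df1$c <- labels[idx]
print(df1)
```

Values below the first break would get index 0, which selects no label, so data containing such values would need separate handling.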

This did not strictly address the request to use if statements, but I don't think new R users always know the most expressive or efficient approach to a problem. The request to "use IF" sounded like an attempt to translate coding approaches from the two major statistical macro processors, SPSS and SAS. The R if control structure is usually not an effective way to recode a column, because its condition is evaluated only for the first element (recent versions of R raise an error for a condition of length greater than one). By itself it does not process a column, whereas the ifelse function does. The cut function could also be used here (with appropriate breaks and labels arguments), although it returns a factor rather than a character vector. I chose findInterval for its ability to return multiple levels, which a single ifelse cannot do. Chained or nested ifelse calls quickly get ugly and confusing after about two or three levels of nesting.
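The cut alternative mentioned above might look like the following. It is only a sketch: the lower break of 0 and the column name c_chr are assumptions made for illustration, not part of the original answer.

```r
# Data frame from the question
df1 <- data.frame(a = 1:5, b = c(1, 3, 5, 9, 11))

# cut() bins b into (0, 2.5], (2.5, 5.5], (5.5, Inf] and labels each bin
df1$c <- cut(df1$b,
             breaks = c(0, 2.5, 5.5, Inf),
             labels = c("cat", "dog", "rabbit"))

# The result is a factor; convert it if a character column is wanted
df1$c_chr <- as.character(df1$c)
print(df1)
```

Unlike findInterval, cut closes its intervals on the right by default, which makes no difference for the break points chosen here.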

Anthony Damico · Answer 2 · 2012-12-02T19:27:34+0000

 df1 <- transform( df1 , c = ifelse( b %in% 1:2 , 'cat' , ifelse( b %in% 3:5 , 'dog' , 'rabbit' ) ) )

Brandon Bertelsen · Answer 3 · 2012-12-02T19:32:15+0000

Although ifelse() is useful, it sometimes does not return what you would intuitively expect. Therefore, I like to write the conditions out explicitly.

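 a <- c(1:5)
 b <- c(1,3,5,9,11)
 df1 <- data.frame(a,b)
 # Map a single value to a label; NA when no rule matches (e.g. x == 6)
 species <- function(x) {
   y <- NA_character_
   if (x == 1 || x == 2) y <- "cat"
   if (x > 2 && x < 6) y <- "dog"
   if (x > 6) y <- "rabbit"
   return(y)
 }
 # Apply the function to each element of b
 df1$c <- sapply(df1$b, species)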
