Assign a value to a group depending on the condition in the column

I have a data frame that looks like this:

    > df = data.frame(group = c(1,1,1,2,2,2,3,3,3), date = c(1,2,3,4,5,6,7,8,9), value = c(3,4,3,4,5,6,6,4,9))
    > df
      group date value
    1     1    1     3
    2     1    2     4
    3     1    3     3
    4     2    4     4
    5     2    5     5
    6     2    6     6
    7     3    7     6
    8     3    8     4
    9     3    9     9

I want to create a new column that holds, for each group, the date of the row whose value is 4.

The next data frame shows what I hope to accomplish.

      group date value newValue
    1     1    1     3        2
    2     1    2     4        2
    3     1    3     3        2
    4     2    4     4        4
    5     2    5     5        4
    6     2    6     6        4
    7     3    7     6        8
    8     3    8     4        8
    9     3    9     9        8

As we can see, group 1 has newValue 2 because that is the date associated with the value 4. Similarly, group 2 has newValue 4 and group 3 has newValue 8.

I assume there is an easy way to do this with ave() or a series of dplyr / data.table functions, but none of my many attempts has worked.

3 answers

Here's a quick data.table solution:

    library(data.table)
    setDT(df)[, newValue := date[value == 4L], by = group]
    df
    #    group date value newValue
    # 1:     1    1     3        2
    # 2:     1    2     4        2
    # 3:     1    3     3        2
    # 4:     2    4     4        4
    # 5:     2    5     5        4
    # 6:     2    6     6        4
    # 7:     3    7     6        8
    # 8:     3    8     4        8
    # 9:     3    9     9        8

Here is a similar dplyr version:

    library(dplyr)
    df %>%
      group_by(group) %>%
      mutate(newValue = date[value == 4L])
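Both versions above rely on each group having exactly one row with value 4. As a hedged sketch of one way to relax that, `match()` returns the position of the first 4 in a group, or `NA` when there is none, so the result is always a single value per group. The data below is hypothetical: it is the question's data frame with group 3 altered so that it contains no 4.

```r
library(dplyr)

# Hypothetical variant of the question's data: group 3 has no value equal to 4
df <- data.frame(group = c(1,1,1,2,2,2,3,3,3),
                 date  = c(1,2,3,4,5,6,7,8,9),
                 value = c(3,4,3,4,5,6,6,5,9))

res <- df %>%
  group_by(group) %>%
  # match(4, value) is the index of the first 4 in the group, or NA if absent,
  # so date[...] always has length 1 and is recycled across the group
  mutate(newValue = date[match(4, value)]) %>%
  ungroup()
```

Here `res$newValue` is `NA` for every row of group 3 rather than an error. Whether `NA` is the right placeholder depends on the use case.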

Or a possible base R solution using merge after filtering the data (some renaming required):

    merge(df, df[df$value == 4, c("group", "date")], by = "group")
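The renaming mentioned above could look roughly like this. It is a minimal sketch: the intermediate name `lookup` is only illustrative and not part of the original answer.

```r
df <- data.frame(group = c(1,1,1,2,2,2,3,3,3),
                 date  = c(1,2,3,4,5,6,7,8,9),
                 value = c(3,4,3,4,5,6,6,4,9))

# One row per group: the date at which value == 4
lookup <- df[df$value == 4, c("group", "date")]

# Rename before merging so the result gets a clean newValue column
# instead of the default date.x / date.y suffixes
names(lookup)[names(lookup) == "date"] <- "newValue"

res <- merge(df, lookup, by = "group")
```

By default `merge` performs an inner join, so a group with no row equal to 4 would be dropped from the result; passing `all.x = TRUE` keeps such groups with `NA` in `newValue`.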

Here is a base R option:

    df$newValue = rep(df$date[which(df$value == 4)], table(df$group))
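The `rep()` one-liner depends on the rows being sorted by group and on each group containing exactly one 4. As a sketch of a variant that does not depend on row order, the matching rows can be used as a lookup table and indexed with `match()`. The shuffled row order and the name `hits` below are hypothetical, chosen only to illustrate the point.

```r
df <- data.frame(group = c(1,1,1,2,2,2,3,3,3),
                 date  = c(1,2,3,4,5,6,7,8,9),
                 value = c(3,4,3,4,5,6,6,4,9))

# Hypothetical: shuffle the rows so that groups are no longer contiguous
df <- df[c(9, 1, 5, 2, 7, 3, 4, 8, 6), ]

# Rows where value == 4 act as a per-group lookup table
hits <- df[df$value == 4, ]

# For every row, find its group in the lookup table and take that date
df$newValue <- hits$date[match(df$group, hits$group)]
```

If a group has several rows equal to 4, `match()` picks the first one in `hits`; a group with none gets `NA`.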

Another alternative using lapply:

    do.call(rbind, lapply(split(df, df$group), function(x) {
      x$newValue = rep(x$date[which(x$value == 4)], each = length(x$group))
      x
    }))
    #     group date value newValue
    # 1.1     1    1     3        2
    # 1.2     1    2     4        2
    # 1.3     1    3     3        2
    # 2.4     2    4     4        4
    # 2.5     2    5     5        4
    # 2.6     2    6     6        4
    # 3.7     3    7     6        8
    # 3.8     3    8     4        8
    # 3.9     3    9     9        8

Another base R approach:

    df$newValue <- ave(`names<-`(df$value==4,df$date), df$group, FUN=function(x) as.numeric(names(x)[x]))
    df
    #    group date value newValue
    # 1      1    1     3        2
    # 2      1    2     4        2
    # 3      1    3     3        2
    # 4      2    4     4        4
    # 5      2    5     5        4
    # 6      2    6     6        4
    # 7      3    7     6        8
    # 8      3    8     4        8
    # 9      3    9     9        8
    # 10     3   11     7        8

I tested this with groups of different lengths. The date column is assigned as the names of the logical vector value == 4; then, within each group, the name of the TRUE element is extracted and converted back to a number.
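To make the mechanics easier to follow, the same idea can be unpacked into separate steps. This is only a sketch of the one-liner above; the intermediate names `is_four` and `picked` are illustrative.

```r
df <- data.frame(group = c(1,1,1,2,2,2,3,3,3,3),
                 date  = c(1,2,3,4,5,6,7,8,9,11),
                 value = c(3,4,3,4,5,6,6,4,9,7))

# Step 1: logical vector marking the rows where value is 4
is_four <- df$value == 4

# Step 2: attach the dates as names, so each TRUE/FALSE carries its date
names(is_four) <- df$date

# Step 3: within each group, keep the name (date) of the TRUE element;
# ave() recycles that single number across the whole group
picked <- ave(is_four, df$group,
              FUN = function(x) as.numeric(names(x)[x]))

# The result still carries the dates as names, so drop them
df$newValue <- unname(picked)
```

Like the other one-liners, this assumes exactly one row per group has value 4.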

Data

    df = data.frame(group = c(1,1,1,2,2,2,3,3,3,3),
                    date = c(1,2,3,4,5,6,7,8,9,11),
                    value = c(3,4,3,4,5,6,6,4,9,7))