Calculate a conditional running sum in R for each row of a data frame

I would like to create a column equal to the running sum of data$Rating, subject to two conditions on columns 3 and 4: data$Year is less than the current row's year, and data$ID equals the current row's ID.

In words, this should calculate the total of the ratings for each ID up to the previous year, and it has to do so for every row in the data frame (about 50,000 rows). Given the size of the data frame, I would prefer not to use a loop, if at all possible.

I have given a short example of how it should look below:

    > head(data[,c(3,4,13)])
      Year    ID Rating CumSum
    1 2010 13578      2      0
    2 2010 13579      1      0
    3 2010 13575      3      0
    4 2011 13575      4      3
    5 2012 13578      3      2
    6 2012 13579      2      1
    7 2012 13579      4      1

I come from a spreadsheet background, so I still think in terms of SUMIFS and the like (which would solve my problem perfectly in Excel), so I apologize if my terminology is not accurate.

1 answer
    data <- data.frame(Year = c(rep(2010, 3), 2011, rep(2012, 3)),
                       ID = c(13578, 13579, 13575, 13575, 13578, 13579, 13579),
                       Rating = c(2, 1, 3, 4, 3, 2, 4))
    data
    #   Year    ID Rating
    # 1 2010 13578      2
    # 2 2010 13579      1
    # 3 2010 13575      3
    # 4 2011 13575      4
    # 5 2012 13578      3
    # 6 2012 13579      2
    # 7 2012 13579      4

  • Create a column equal to the running sum of data$Rating, where
    • data$Year < Year
    • data$ID == ID
  • This should calculate the total of the ratings for each ID up to the previous year.

The desired result will be

    data
    #   Year    ID Rating CumSum
    # 1 2010 13578      2      2
    # 2 2010 13579      1      1
    # 3 2010 13575      3      3
    # 4 2011 13575      4      7
    # 5 2012 13578      3      5
    # 6 2012 13579      2      3
    # 7 2012 13579      4      7

This can be done like this:

    year <- 2014  # maximum year to include in cumsum
    ID.values <- sort(unique(data$ID))  # unique values of data$ID, sorted numerically

    # cumsum for 13575 rows, followed by cumsum for 13578 rows, ...
    Rating.cumsum <- unlist(lapply(ID.values, function(x) cumsum(data$Rating[data$ID == x])))

    # assign cumsum output to appropriate rows (order() keeps ties in row order)
    data$CumSum[order(data$ID)] <- Rating.cumsum
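
Note that the running sum above includes the current row, whereas the sample output in the question counts only ratings from strictly earlier years for the same ID. The following is a minimal sketch of both variants in base R, not a definitive implementation. The column name `PriorSum` and the helper `key` are hypothetical names introduced here for illustration, and the inclusive variant assumes, like the code above, that rows are already in chronological order within each ID.

```r
# Sample data from the answer above.
data <- data.frame(
  Year   = c(rep(2010, 3), 2011, rep(2012, 3)),
  ID     = c(13578, 13579, 13575, 13575, 13578, 13579, 13579),
  Rating = c(2, 1, 3, 4, 3, 2, 4)
)

# Inclusive running sum per ID, computed group-wise in row order.
data$CumSum <- ave(data$Rating, data$ID, FUN = cumsum)

# Strict variant: sum of Rating for the same ID in earlier years only.
# 1. Total rating per (ID, Year) pair.
totals <- aggregate(Rating ~ ID + Year, data = data, FUN = sum)

# 2. Sort by Year within ID, then subtract each year's own total from the
#    running sum so that only earlier years remain.
totals <- totals[order(totals$ID, totals$Year), ]
totals$PriorSum <- ave(totals$Rating, totals$ID, FUN = cumsum) - totals$Rating

# 3. Look up each original row's (ID, Year) pair in the totals table.
key <- function(d) paste(d$ID, d$Year)  # hypothetical helper: composite key
data$PriorSum <- totals$PriorSum[match(key(data), key(totals))]

data
```

Aggregating per (ID, Year) first means that several rows sharing the same ID and year (rows 6 and 7 here) all receive the same prior-year total, which is what a SUMIFS with a `Year < current year` condition would give. No explicit loop over rows is needed, which should matter at around 50,000 rows.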