This is related to this question here, but I decided to ask a separate question for clarity, since the "new" question is not directly related to the original. In short, I use ddply to cumulatively sum a value within each of three years. My code takes the data from the first year and repeats it in the second- and third-year rows of the column. I assume that the chunk for year 1 is being copied to the entire column, but I don't understand why.
Q. How can I get a cumulatively summed value for each year in the correct rows of a specified column?
[Edit: a for loop - or something similar - is important, because in the end I want to calculate the new columns automatically from a list of column names, rather than compute each new column manually. The loop iterates over the list of column names.]
I often use a combination of ddply and cumsum, so it's pretty annoying to suddenly have problems with it.
[Edit: this code has been updated to the solution I settled on, based on @Chase's answer below]
    require(lubridate)
    require(plyr)
    require(xts)
    require(reshape)
    require(reshape2)

    set.seed(12345)

    # create dummy time series data
    monthsback <- 24
    startdate <- as.Date(paste(year(now()), month(now()), "1", sep = "-")) - months(monthsback)
    mydf <- data.frame(mydate = seq(as.Date(startdate), by = "month", length.out = monthsback),
                       myvalue1 = runif(monthsback, min = 600, max = 800),
                       myvalue2 = runif(monthsback, min = 1900, max = 2400),
                       myvalue3 = runif(monthsback, min = 50, max = 80),
                       myvalue4 = runif(monthsback, min = 200, max = 300))
    mydf$year <- as.numeric(format(as.Date(mydf$mydate), format = "%Y"))
    mydf$month <- as.numeric(format(as.Date(mydf$mydate), format = "%m"))

    # Select columns to process
    newcolnames <- c('myvalue1', 'myvalue4', 'myvalue2')

    # melt n' cast
    mydf.m <- mydf[, c('mydate', 'year', newcolnames)]
    mydf.m <- melt(mydf.m, measure.vars = newcolnames)
    mydf.m <- ddply(mydf.m, c("year", "variable"), transform, newcol = cumsum(value))
    mydf.m <- dcast(mydate ~ variable, data = mydf.m, value.var = "newcol")
    colnames(mydf.m) <- c('mydate', paste(newcolnames, "_cum", sep = ""))
    mydf <- merge(mydf, mydf.m, by = 'mydate', all = FALSE)
    mydf
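For comparison, here is a minimal sketch of the loop-over-column-names idea using only base R. It is not the solution from the answer referred to above. The data frame `toy` and the vector `cols` are made up for illustration, and it assumes the rows are already in date order within each year.

```r
# Hypothetical toy data: two years, three months each
toy <- data.frame(
  year     = c(2011, 2011, 2011, 2012, 2012, 2012),
  myvalue1 = c(1, 2, 3, 10, 20, 30),
  myvalue2 = c(5, 5, 5, 1, 1, 1)
)

# Columns to process (plays the role of newcolnames in the question)
cols <- c("myvalue1", "myvalue2")

# ave() applies FUN separately within each group of toy$year and
# returns the results in the original row order, so the cumulative
# sum restarts at the first row of every year.
for (col in cols) {
  toy[[paste(col, "_cum", sep = "")]] <- ave(toy[[col]], toy$year, FUN = cumsum)
}

toy
```

Because `ave()` keeps the original row order, each `_cum` column can be assigned straight back into the data frame without a melt, cast, or merge step. Whether that is preferable to the ddply approach depends on how the rest of the pipeline is organised.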