The average value of n rows

I have a data frame with three columns, Id, Date and Value, and I want to downsample it by averaging: take each block of 20 consecutive rows, compute the mean of Value over those 20 rows, and add it to a new data frame with the same structure. Date should be the first value in each block of 20 rows.

I tried it like this (maybe awful :):

    resample.downsample <- function(data, by=20) {
      i <- 0
      nmax <- nrow(data)
      means <- c()
      while(i < nmax) {
        means <- c(means, mean(subset(data, Id > i & Id <= i+by)$Value))
        i <- i+by
      }
      return (
        data.frame(
          Id = seq(1, length.out=(nmax/by), by=1),
          Date = seq(startDate, length.out=(nmax/by), by=(1/by)),
          Value = means
        )
      )
    }

This works for small data sets, but runs forever on my real data sets (~4,000,000 rows). Any ideas on how to optimize this function?

Sample data (input; the output should have the same structure, with classes integer, numeric, POSIXct/POSIXt):

      Value Id                Date
    1   125  1 2011-06-30 22:41:50
    2   127  2 2011-06-30 22:41:50
    3   126  3 2011-06-30 22:41:50
    4   123  4 2011-06-30 22:41:50
    5   130  5 2011-06-30 22:41:50
    6   131  6 2011-06-30 22:41:50
    7   128  7 2011-06-30 22:41:50
1 answer

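See the answer to "How to get the sum of each of the four rows of a matrix in R" for a method that should work for you. Reshaping Value into a matrix with 20 rows puts each block of 20 values in its own column, so the column means are the block means. In your case, it will be: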

 colMeans(matrix(data$Value, nrow=20)) 

Your current method for getting the first date should be fine.
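Putting the pieces together, here is a minimal sketch of the whole function. The name `downsample_blocks` is hypothetical, and the sketch makes two assumptions that are not in the question: the rows are already in the order you want to group them, and an incomplete trailing block (when the row count is not a multiple of `by`) can simply be dropped. It also takes the first Date of each block directly from the data instead of generating a sequence from `startDate`.

```r
# Hypothetical helper (not from the original post): average Value over
# consecutive blocks of `by` rows and keep the first Date of each block.
downsample_blocks <- function(data, by = 20) {
  n_blocks <- nrow(data) %/% by                     # number of complete blocks
  keep  <- seq_len(n_blocks * by)                   # rows in complete blocks only
  first <- seq(1, by = by, length.out = n_blocks)   # first row of each block
  data.frame(
    Id    = seq_len(n_blocks),
    Date  = data$Date[first],
    # each matrix column holds one block, so colMeans gives the block means
    Value = colMeans(matrix(data$Value[keep], nrow = by))
  )
}

# Usage with the sample data from the question, in blocks of 2 for brevity
demo <- data.frame(
  Value = c(125, 127, 126, 123, 130, 131, 128),
  Id    = 1:7,
  Date  = as.POSIXct(rep("2011-06-30 22:41:50", 7), tz = "UTC")
)
result <- downsample_blocks(demo, by = 2)
```

With `by = 2` the seventh row is dropped and the three block means are 126, 124.5 and 130.5. Because everything is vectorized, there is no repeated `subset()` over the full data frame and no growing of `means` inside a loop, which are the two likely reasons the original version is slow on millions of rows.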
