Here is the MWE.
    library(data.table)

    dta <- data.table(id = rep(1:2, each = 5), seq = rep(1:5, 2), val = 1:10)
    dtb <- data.table(id = c(1, 1, 2, 2), fil = c(2, 3, 3, 4))
    dtc <- data.table(id = c(1, 1, 2, 2), mval = rep(0, 4))

    # For each row of dtb, average dta$val over matching id and seq < fil
    for (ind in 1:4) {
      dtc$mval[ind] <- mean(dta$val[dta$id == dtb$id[ind] & dta$seq < dtb$fil[ind]])
    }
    dtc
dtc should have the same number of rows as dtb. For each row ind in dtc, dtc$id[ind] = dtb$id[ind] and dtc$mval[ind] = mean(dta$val[x]), where x is dta$id == dtb$id[ind] & dta$seq < dtb$fil[ind].
My data.tables are extremely large, so I am looking for a way to achieve the above with as little memory as possible. I was thinking about a non-equi join followed by an aggregation, but I can't get it to work. Hence the title of the question.
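To make the idea concrete, here is a rough sketch of what a non-equi join with per-row aggregation might look like on the MWE. It assumes a data.table version that supports non-equi joins in `on=` (1.9.8 or later), and it declares `dtb` with integer columns so the join types match `dta`; the name `res` is only illustrative. Using `by = .EACHI` should aggregate once per row of `dtb` rather than materializing the full joined table first, which is the memory-saving part, though this has not been checked on large data.

```r
library(data.table)

dta <- data.table(id = rep(1:2, each = 5), seq = rep(1:5, 2), val = 1:10)
# Assumption: integer columns, so the join columns have the same type as in dta
dtb <- data.table(id = c(1L, 1L, 2L, 2L), fil = c(2L, 3L, 3L, 4L))

# Non-equi join: for each row of dtb, match rows of dta with the same id
# and seq < fil, then take the mean of val within that group (by = .EACHI)
res <- dta[dtb, on = .(id, seq < fil), .(mval = mean(val)), by = .EACHI]

# In the result, the join column named "seq" holds the fil bound from dtb
setnames(res, "seq", "fil")
res
```

On the MWE this should give one row per row of `dtb`, with `mval` equal to 1, 1.5, 6.5 and 7.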
Thanks so much for any help!