Non-equi join, then summing over groups

Question

Here is the MWE.

    library(data.table)

    dta <- data.table(id = rep(1:2, each = 5), seq = rep(1:5, 2), val = 1:10)
    dtb <- data.table(id = c(1, 1, 2, 2), fil = c(2, 3, 3, 4))
    dtc <- data.table(id = c(1, 1, 2, 2), mval = rep(0, 4))

    # For each row of dtb, average val over the rows of dta that have
    # the same id and a seq strictly below fil.
    for (ind in 1:4)
      dtc$mval[ind] <- mean(dta$val[dta$id == dtb$id[ind] & dta$seq < dtb$fil[ind]])

    dtc
    #    id mval
    # 1:  1  1.0
    # 2:  1  1.5
    # 3:  2  6.5
    # 4:  2  7.0

dtc should have the same number of rows as dtb. For each row ind in dtc:

dtc$id[ind] = dtb$id[ind].
dtc$mval[ind] = mean(dta$val[x]), where x is the condition dta$id == dtb$id[ind] & dta$seq < dtb$fil[ind].

My data.tables are extremely large, so I am looking for a way to do this with as little memory as possible. I was thinking of a non-equi join followed by an aggregation per group, but I can't get it to work. Hence the title of the question.

Thanks so much for any help!


r data.table

Anirban mukherjee 18 sept. '16 at 7:54


1 answer

akrun · Accepted Answer · 2016-09-18T08:34:31+0000

This may help:

    dtc[, mval := dta[dtb, mean(val), on = .(id, seq < fil), by = .EACHI]$V1]
    dtc
    #    id mval
    # 1:  1  1.0
    # 2:  1  1.5
    # 3:  2  6.5
    # 4:  2  7.0
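For anyone who wants to check the one-liner above, here is a minimal self-contained sketch that rebuilds the question's toy data, runs the same non-equi join, and compares the result with the original loop. The names res and loop_mval are only illustrative and do not appear in the question. It assumes a data.table version with non-equi join support (1.9.8 or later, as far as I know).

```r
library(data.table)

# Toy data from the question
dta <- data.table(id = rep(1:2, each = 5), seq = rep(1:5, 2), val = 1:10)
dtb <- data.table(id = c(1, 1, 2, 2), fil = c(2, 3, 3, 4))

# Non-equi join: each row of dtb is matched to the rows of dta with the
# same id and seq < fil. by = .EACHI evaluates mean(val) once per row of
# dtb, so the result has one row per row of dtb, in dtb's order.
res <- dta[dtb, .(mval = mean(val)), on = .(id, seq < fil), by = .EACHI]

# Reference result from the loop in the question (loop_mval is illustrative)
loop_mval <- sapply(seq_len(nrow(dtb)), function(i) {
  mean(dta$val[dta$id == dtb$id[i] & dta$seq < dtb$fil[i]])
})

# Compare the two approaches
print(all.equal(res$mval, loop_mval))
```

Because of by = .EACHI, the mean is computed group by group while joining, so the full set of matched rows should not need to be materialized first. That is the part that matters for memory on large tables, though it is worth measuring on your own data.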
