One-step fusible melt / dcast

Question

I have the following data.table

 library(data.table)
 testdt <- data.table(var1 = rep(c("a", "b"), each = 3), p1 = 1:6, p2 = 11:16)
 #    var1 p1 p2
 # 1:    a  1 11
 # 2:    a  2 12
 # 3:    a  3 13
 # 4:    b  4 14
 # 5:    b  5 15
 # 6:    b  6 16

I need the median of each p* column for each value of var1, with the p* columns in the rows and the unique values of var1 in the columns.
So I am looking for this output:

   variable  a  b
 1       p1  2  5
 2       p2 12 15

The easiest way I have found to get this:

 dcast(melt(testdt, id.vars = "var1", measure.vars = c("p1", "p2")), variable ~ var1, value.var = "value", fun.aggregate = median)

But I have the feeling that I am missing something here (a more suitable function), so I would like to know whether there is a direct way (a single function) to do the same.

I know that recast() from the reshape2 package can do the trick with recast(testdt, variable~var1, fun=median, id.var="var1"), but I would like to avoid loading another package.

Edit:

I am looking for a solution that is both simple and efficient. It will be applied to a list of ~40 tables, each with ~300 columns and ~80 rows.

r data.table

Cath Jan 22 '16 at 10:28

1 answer

manotheshark · Answer 1 · 2016-12-21T05:28:01+0000

If speed is a priority, computing the medians by group first and reshaping afterwards gives a speedup of about 23% on the median timing (albeit only about a millisecond here). The gap is likely to widen as your dataset grows, because melt then only has to handle one row per group instead of every row.

 library(data.table)
 # aggregate by var1 first, then melt and dcast the much smaller result
 dcast(melt(testdt[, lapply(.SD, median), by=var1], id.vars="var1"), variable ~ var1)
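As a quick sanity check, the sketch below compares the aggregate-inside-dcast call from the question with the aggregate-first variant on the toy table. The names res_cast and res_pre are only illustrative, and the comparison looks at cell values only, not at attributes such as keys.

```r
library(data.table)

testdt <- data.table(var1 = rep(c("a", "b"), each = 3), p1 = 1:6, p2 = 11:16)

# Approach from the question: melt every row, aggregate inside dcast
res_cast <- dcast(melt(testdt, id.vars = "var1"),
                  variable ~ var1, value.var = "value", fun.aggregate = median)

# Aggregate-first approach: one row per group goes into melt
res_pre <- dcast(melt(testdt[, lapply(.SD, median), by = var1], id.vars = "var1"),
                 variable ~ var1, value.var = "value")

# Compare the cell values of the two results
same_values <- all(as.matrix(res_cast[, .(a, b)]) == as.matrix(res_pre[, .(a, b)]))
same_values
```

On this toy data same_values should come out TRUE; it is worth repeating the check on one of the real tables before relying on the faster variant.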

Benchmark:

 Unit: milliseconds
                    expr      min       lq    mean   median       uq      max neval
  fun.aggregate = median 4.221654 4.453063 4.87418 4.510775 4.579718 35.28569  1000
     lapply(.SD, median) 3.196289 3.410711 3.77483 3.461073 3.523096 22.78637  1000
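Since the question mentions a list of tables, here is a minimal sketch of applying the aggregate-first idea to each element with lapply. The helper name median_wide and the toy list dt_list are made up for illustration, and the sketch assumes every table has a var1 column and otherwise only numeric columns. The as.numeric() wrapper is a defensive choice: median() can return an integer for an odd-sized group and a double for an even-sized one, so this keeps the column type the same for every group.

```r
library(data.table)

# Hypothetical helper: aggregate by var1 first, then reshape the small result
median_wide <- function(dt) {
  # as.numeric() gives every group the same (double) result type
  agg <- dt[, lapply(.SD, function(x) as.numeric(median(x))), by = var1]
  dcast(melt(agg, id.vars = "var1"), variable ~ var1, value.var = "value")
}

# Toy stand-in for the real list of ~40 tables
dt_list <- list(
  t1 = data.table(var1 = rep(c("a", "b"), each = 3), p1 = 1:6, p2 = 11:16),
  t2 = data.table(var1 = rep(c("a", "b"), c(2, 4)), p1 = 1:6, p2 = 11:16)
)

# One reshaped table per input table, names preserved
res_list <- lapply(dt_list, median_wide)
res_list$t1
```

The second toy table has groups of different sizes (2 and 4 rows) so the mixed integer/double case is exercised.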
