Transposed vector by group in data.table

What is the idiomatic data.table way to build a table with one column per element of a vector returned by a function that is computed by group?

Consider the data table:

    library(data.table)
    data(iris)
    setDT(iris)

If the function is range(), I need a result similar to:

    iris[, .(min_petal_width = min(Petal.Width),
             max_petal_width = max(Petal.Width)),
         keyby = Species]  # produces desired output

but using the range() function.

I can use dcast, but this is ugly:

    dcast(
      iris[, .(petal_width = range(Petal.Width),
               value = c("min_petal_width", "max_petal_width")),
           keyby = Species],
      Species ~ value,
      value.var = "petal_width")

I was hoping for a simpler expression, something like:

 iris[, (c("min_petal_width","max_petal_width")) = range(Petal.Width), keyby = Species] # doesn't work 
3 answers

Your approach was very close. Just remember that you need to supply a list to data.table, and it will gladly accept it. Therefore, you can use:

 iris[, c("min_petal_width","max_petal_width") := as.list(range(Petal.Width)), by = Species] 
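To illustrate why the list matters, here is a minimal sketch. It works on a copy named `iris_dt` (a hypothetical name, not from the original post) so that the built-in data is left untouched. An atomic vector in `j` is stacked as rows within each group, whereas a list spreads its elements across columns.

```r
library(data.table)

# iris_dt is an illustrative copy; the original post converts iris in place
iris_dt <- as.data.table(iris)

# Atomic vector in j: each element becomes a row, so two rows per Species
long <- iris_dt[, range(Petal.Width), keyby = Species]

# List in j: each element becomes a column, so one row per Species
wide <- iris_dt[, as.list(range(Petal.Width)), keyby = Species]

dim(long)    # 6 rows, 2 columns (Species, V1)
dim(wide)    # 3 rows, 3 columns (Species, V1, V2)
names(wide)  # columns get default names V1, V2 until you name the list
```

The default `V1`, `V2` names are the reason the list has to be named, or the target column names given on the left of `:=`, to get meaningful headers.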

Edit: I misunderstood the question. Since you want to aggregate the result rather than add new columns, you can use:

    cols <- c("min_petal_width", "max_petal_width")
    iris[, setNames(as.list(range(Petal.Width)), cols), keyby = Species]
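As a quick sanity check, the sketch below compares this aggregation with the explicit min()/max() version from the question. The names `iris_dt`, `explicit` and `via_range` are illustrative only and do not come from the original post.

```r
library(data.table)

# iris_dt is an illustrative copy so the built-in iris stays a data.frame
iris_dt <- as.data.table(iris)
cols <- c("min_petal_width", "max_petal_width")

# Explicit version from the question
explicit <- iris_dt[, .(min_petal_width = min(Petal.Width),
                        max_petal_width = max(Petal.Width)),
                    keyby = Species]

# Version built from range() wrapped in a named list
via_range <- iris_dt[, setNames(as.list(range(Petal.Width)), cols),
                     keyby = Species]

# Both tables are keyed by Species and should hold identical columns
all.equal(explicit, via_range)
```

If the two tables agree, `all.equal()` returns `TRUE`; otherwise it returns a description of the differences.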

But I'm sure there are other data.table approaches.


You can also do:

    iris[, lapply(list(min = min, max = max), function(f) f(Petal.Width)), by = Species]
    #       Species min max
    # 1:     setosa 0.1 0.6
    # 2: versicolor 1.0 1.8
    # 3:  virginica 1.4 2.5

If readability and brevity really matter to you, I would define a custom function or a binary operator that can then be used directly inside your data.table expression, for example:

    # custom function
    .nm <- function(v, vnames) {
      `names<-`(as.list(v), vnames)
    }

    # custom binary operator
    `%=%` <- function(vnames, v) {
      `names<-`(as.list(v), vnames)
    }

    # using custom function
    iris[, .nm(range(Petal.Width), c("min_petal_width", "max_petal_width")), keyby = Species]

    # using custom binary operator
    iris[, c("min_petal_width", "max_petal_width") %=% range(Petal.Width), keyby = Species]