data.table `:=` assignment expressions with dynamic inputs (existing columns) and outputs (new column names)

Note: The exact problem I ran into in this question no longer occurs in recent versions of data.table. If you want to do something like what the title describes, see the relevant question in the package FAQ, 1.6 "OK, but I don't know the expressions in advance. How do I programmatically pass them in?".
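As a rough illustration of passing an expression in programmatically, here is a minimal sketch. The variable name `expr_text` and the sample table are assumptions made for this example only; they do not come from the FAQ.

```r
library(data.table)

DT <- CJ(a = 0:1, b = 0:1, y = 2)

# expr_text is a hypothetical string that only becomes known at run time
expr_text <- "a + b"

# parse() turns the string into an expression; eval() in j evaluates it
# with the columns of DT in scope
res <- DT[, eval(parse(text = expr_text))]
```

Here `res` is a plain vector holding `a + b` for each row; nothing is assigned back into `DT`.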

I saw an answer that illustrates how to build an expression to evaluate in

DT[,j=eval(expr)] 

I use this with assignment, `` `:=`(mycol = my_calculation) ``, and I wonder ...

  • How can I name "mycol" dynamically?
  • What is the correct way to let my_calculation take a dynamically defined set of columns?

By “dynamically” I mean “defined after I write the code for my expr”.

New example

EDIT: To better illustrate the problem, here is another example. See the edit history for the original.

    require(data.table)
    require(plyr)
    options(datatable.verbose=TRUE)
    DT <- CJ(a=0:1,b=0:1,y=2)

    # setup:
    expr <- as.quoted(paste(expression(get(col_in_one)+get(col_in_two))))[[1]]

    # usage:
    col_in_one <- 'a'
    col_in_two <- 'b'
    col_out    <- 'bah'
    DT[,(col_out):=eval(expr)]    # fails, should take the form j=eval(expr)

I want to keep setup and usage separate, so that my code is easier to maintain. My real expression is more convoluted than this example (where it just selects a single column).

Questions

First question: how can I make the assigned column, col_out, dynamic? That is, I want to specify col_in_* and col_out on the fly.

I tried building various expressions in expr, but as.quoted throws an error about not being able to put certain things on the left side of the = sign.

Second question: how can I avoid the warnings about using get?

The warnings suggest using .SDcols to let [.data.table know which columns I'm using. However, if I use the .SDcols argument, another warning says that it makes no sense to do so unless .SD is used.

Tentative solution

The solutions I have so far ...

    # Ricardo + eddi:
    expr2 <- as.quoted(paste(expression(`:=`(
      Vtmp=.SD[[col_in_one]]+.SD[[col_in_two]]))))[[1]]

    # usage
    col_in_one <- 'a'
    col_in_two <- 'b'
    col_out    <- 'bah'
    DT[,eval(expr2),.SDcols=c(col_in_one,col_in_two)]
    setnames(DT,'Vtmp',col_out)

This is still slightly annoying because it takes two steps and I have to keep track of "Vtmp", so the first question remains partially open.

2 answers

I may not fully understand the problem, but isn't this enough?

    DT[, (col_out) := .SD[[col_in_one]]+.SD[[col_in_two]], .SDcols = c(col_in_one,col_in_two)]

    DT
    #   a b y bah
    #1: 0 0 2   0
    #2: 0 1 2   1
    #3: 1 0 2   1
    #4: 1 1 2   2
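If this pattern is needed in several places, it could be wrapped in a small function. The sketch below is only one way to do that; `add_sum_col` is a hypothetical helper invented for this illustration and is not part of data.table.

```r
library(data.table)

# add_sum_col is a hypothetical helper, not a data.table function.
# It adds column `col_out` to `dt` by reference, as the sum of two
# columns whose names are passed in as strings.
add_sum_col <- function(dt, col_in_one, col_in_two, col_out) {
  dt[, (col_out) := .SD[[col_in_one]] + .SD[[col_in_two]],
     .SDcols = c(col_in_one, col_in_two)]
  invisible(dt)
}

DT <- CJ(a = 0:1, b = 0:1, y = 2)
add_sum_col(DT, "a", "b", "bah")        # bah = a + b
add_sum_col(DT, "bah", "y", "total")    # total = bah + y
```

Because `:=` assigns by reference, `DT` is modified in place and the return value does not need to be reassigned.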

To answer the edited question: to get eval to work, use .SD as the environment:

 DT[, (col_out) := eval(expr, .SD)] 
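For completeness, here is a self-contained sketch of the whole setup/usage flow with this fix. It assumes base R's `quote()` as a stand-in for the plyr `as.quoted` call in the question, which is a substitution made for this example and not what the question used.

```r
library(data.table)

DT <- CJ(a = 0:1, b = 0:1, y = 2)

# setup: the expression is written before the column names are known
# (quote() is used here instead of plyr::as.quoted, as an assumption)
expr <- quote(get(col_in_one) + get(col_in_two))

# usage: the names are supplied later
col_in_one <- "a"
col_in_two <- "b"
col_out    <- "bah"

# evaluating inside .SD lets get() find the columns by name
DT[, (col_out) := eval(expr, .SD)]
```

The setup and usage parts stay separate, and no temporary column name is needed.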

Also see this question and the update there: eval and quote in data.table


The simplest approach is to set the name AFTER you evaluate the expression. After all, renaming has a run time that is constant and nearly 0.

    someDummyVar <- "tempColName_XCWF5D"
    DT[, (someDummyVar) := eval(expr)]
    setnames(DT, someDummyVar, RealColumnName)

Regarding the second question: don't turn on verbose warnings, and you won't get verbose warnings ;)

 options(datatable.verbose=FALSE) 

As for Reduce: try posting that as a separate, simplified question so that it is easier to follow what you are doing (apart from the eval issues).
