How can I refer to a column that is not part of .SD?

I have a column in my data.table that contains the data I would like to use to update a bunch of other columns. This column is a list, and I need to subset each list element based on the value in each of the other columns, which I will include in my .SD expression.

My details ....

    library(data.table)

    dt <- data.table( A = list( c("X","Y") , c("J","K") ) , B = c(1,2) , C = c(2,1) )
    #     A B C
    #1: X,Y 1 2
    #2: J,K 2 1

My desired result ....

    #     A B C
    #1: X,Y X Y
    #2: J,K K J

What I tried ....

    # Column A is not included in SD so not found...
    dt[ , lapply( .SD , function(x) A[x] ) , .SDcols = 2:3 ]
    #Error in FUN(X[[1L]], ...) : object 'A' not found

    # This also does not work. See all of A as one long vector (look at results for C)
    for( i in 2:3 ) dt[ , names(dt)[i] := unlist(A)[ get(names(dt)[i]) ] ]
    #     A B C
    #1: X,Y X Y
    #2: J,K Y X

    # I saw this in another answer, but also won't work:
    # Basically we add an ID column and use 'by=' to try and solve the problem above
    # Now we get a type mismatch
    dt <- data.table( ID = 1:2 , A = list( c("X","Y") , c("J","K") ) , B = c(1,2) , C = c(2,1) , key = "ID" )
    for( i in 3:4 ) dt[ , names(dt)[i] := unlist(A)[ get(names(dt)[i]) ] , by = ID ]
    #Error in `[.data.table`(dt, , `:=`(names(dt)[i], unlist(A)[get(names(dt)[i])]),  :
    #  Type of RHS ('character') must match LHS ('double'). To check and coerce would impact performance too much for the fastest cases. Either change the type of the target column, or coerce the RHS of := yourself (eg by using 1L instead of 1)

If anyone is interested, my real data is a set of SNP and INDELS for different isolates, and I'm trying to do this:

    # My real data looks more like this:
    # In columns V10:V15;
    # if '.' in first character then use data from 'Ref' column
    # else use integer at first character to subset list in 'Alt' column
    #   Contig Pos V3 Ref Alt    Qual       V10       V11       V12       V13       V14       V15
    #1:      1 172  .   T   C 81.0000 1/1:.:.:. ./.:.:.:. ./.:.:.:. ./.:.:.:. ./.:.:.:. ./.:.:.:.
    #2:      1 399  .   G C,A 51.0000 ./.:.:.:. 1/1:.:.:. 2/2:.:.:. ./.:.:.:. 1/1:.:.:. ./.:.:.:.
    #3:      1 516  .   T   G 57.0000 ./.:.:.:. 1/1:.:.:. ./.:.:.:. 1/1:.:.:. ./.:.:.:. ./.:.:.:.
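For illustration, the rule in the comments above could be sketched on toy data roughly as follows. This is only a sketch under assumptions: the table `vcf`, the helper `pick_allele`, and the variables `ref`, `alt` and `cols` are hypothetical names, `Alt` is assumed to be a list column, and only three of the genotype columns are included.

```r
library(data.table)

# Toy data modelled on the excerpt above (illustrative values only).
# Assumption: 'Alt' is stored as a list column of character vectors.
vcf <- data.table(
  Ref = c("T", "G", "T"),
  Alt = list("C", c("C", "A"), "G"),
  V10 = c("1/1:.:.:.", "./.:.:.:.", "./.:.:.:."),
  V11 = c("./.:.:.:.", "1/1:.:.:.", "1/1:.:.:."),
  V12 = c("./.:.:.:.", "2/2:.:.:.", "./.:.:.:.")
)

# Hypothetical helper: for each row, look at the first character of the
# genotype string; '.' means "use Ref", a digit n means "use Alt[n]".
pick_allele <- function(gt, ref, alt) {
  first <- substr(gt, 1, 1)
  mapply(function(f, r, a) if (f == ".") r else a[as.integer(f)],
         first, ref, alt, USE.NAMES = FALSE)
}

cols <- c("V10", "V11", "V12")
ref  <- vcf[["Ref"]]   # pulled out explicitly, so the lookup does not
alt  <- vcf[["Alt"]]   # depend on these columns being part of .SD
vcf[, (cols) := lapply(.SD, pick_allele, ref = ref, alt = alt), .SDcols = cols]
```

On this toy table, `V10` should become `C, G, T`, `V11` should become `T, C, G`, and `V12` should become `T, A, T`.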
r data.table
3 answers

You can use mapply and set with a for loop. There may be more efficient ways.

    for(j in c('B','C')){
      set(dt, j = j, value = mapply(FUN = '[', dt[['A']], dt[[j]]))
    }
    dt
    #      A B C
    # 1: X,Y X Y
    # 2: J,K K J
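A variation on the same idea, as a minimal sketch rather than a definitive solution: pull column `A` out with `dt[["A"]]` before the call, so the function applied over `.SD` does not need `A` to be visible inside `.SD` at all. The names `a` and `cols` are introduced here purely for illustration.

```r
library(data.table)

dt <- data.table(A = list(c("X", "Y"), c("J", "K")), B = c(1, 2), C = c(2, 1))

cols <- c("B", "C")   # columns to overwrite (illustrative name)
a    <- dt[["A"]]     # the list column, extracted outside of .SD

# For each target column, pair every element of 'a' with the index held in
# that column and subset row by row. Assigning to whole columns with no
# 'i' or 'by' replaces them, so the change from double to character is allowed.
dt[, (cols) := lapply(.SD, function(idx) mapply(function(v, i) v[i], a, idx)),
   .SDcols = cols]

dt
#      A B C
# 1: X,Y X Y
# 2: J,K K J
```

Because the subsetting is done per row by `mapply`, `A` is never flattened into one long vector, which is what went wrong with the `unlist(A)` attempt in the question.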

Hi, does this work for you?

    dt$B <- apply(dt, 1, FUN = function(x) x$A[x$B])
    dt$C <- apply(dt, 1, FUN = function(x) x$A[x$C])
    dt
    #     A B C
    #1: X,Y X Y
    #2: J,K K J

There is probably a more elegant way to do this, and it doesn't scale well, but here goes ...

    dt[,A1:=lapply(A,'[[',1)]
    dt[,A2:=lapply(A,'[[',2)]
    dt[B==1,`:=`(Bnew=A1,Cnew=A2)]
    dt[B==2,`:=`(Bnew=A2,Cnew=A1)]
    dt[,`:=`(A1=NULL,A2=NULL,B=NULL,C=NULL)]
    setnames(dt,c("Bnew","Cnew"),c("B","C"))