I am using data.table inside a function and trying to understand why my code fails. I have a data.table as follows:
    DT <- data.table(my_name=c("A","B","C","D","E","F"),my_id=c(2,2,3,3,4,4))
    > DT
       my_name my_id
    1:       A     2
    2:       B     2
    3:       C     3
    4:       D     3
    5:       E     4
    6:       F     4
I am trying to create all pairs of "my_name" values that have different "my_id" values, which for DT would be:
    Var1 Var2
    A    C
    A    D
    A    E
    A    F
    B    C
    B    D
    B    E
    B    F
    C    E
    C    F
    D    E
    D    F
I have a function to return all pairs of "my_name" for a given pair of "my_id" values, which works as expected.
    get_pairs <- function(id1,id2,tdt) {
      return(expand.grid(tdt[my_id==id1,my_name],tdt[my_id==id2,my_name]))
    }
    > get_pairs(2,3,DT)
      Var1 Var2
    1    A    C
    2    B    C
    3    A    D
    4    B    D
Now I want to run this function for every pair of ids. I tried to do this by finding all pairs of ids and then calling mapply with the get_pairs function.
    > combn(unique(DT$my_id),2)
         [,1] [,2] [,3]
    [1,]    2    2    3
    [2,]    3    4    4
    tid1 <- combn(unique(DT$my_id),2)[1,]
    tid2 <- combn(unique(DT$my_id),2)[2,]
    mapply(get_pairs, tid1, tid2, DT)
    Error in expand.grid(tdt[my_id == id1, my_name], tdt[my_id == id2, my_name]) :
      object 'my_id' not found
However, if I make the same call without mapply, it works.
    get_pairs(tid1[1],tid2[1],DT)
      Var1 Var2
    1    A    C
    2    B    C
    3    A    D
    4    B    D
Why does this function fail only when it is called through mapply? I think this has something to do with how data.table looks up column names, but I'm not sure.
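One likely explanation, offered as a sketch rather than a definitive diagnosis: mapply iterates over every argument passed through `...`, and a data.table is a list of columns, so each call receives a single column vector as tdt instead of the whole table. Subsetting a plain vector with `my_id == id1` then fails because `my_id` is no longer a column in scope. Passing the table through `MoreArgs` keeps it intact. The `stringsAsFactors = FALSE` argument and the `Var1`/`Var2` names below are additions for illustration, not part of the original function.

```r
library(data.table)

DT <- data.table(my_name = c("A", "B", "C", "D", "E", "F"),
                 my_id   = c(2, 2, 3, 3, 4, 4))

# Same idea as the original get_pairs; stringsAsFactors = FALSE is an
# added assumption so the results combine as plain character columns.
get_pairs <- function(id1, id2, tdt) {
  expand.grid(Var1 = tdt[my_id == id1, my_name],
              Var2 = tdt[my_id == id2, my_name],
              stringsAsFactors = FALSE)
}

ids <- combn(unique(DT$my_id), 2)

# MoreArgs passes DT whole to every call instead of iterating over its columns.
# SIMPLIFY = FALSE keeps one data.frame per id pair in a list.
res <- mapply(get_pairs, ids[1, ], ids[2, ],
              MoreArgs = list(tdt = DT), SIMPLIFY = FALSE)

# Stack the per-pair results into a single data.table.
all_pairs <- rbindlist(res)
```

Here `all_pairs` holds one row per name pair across all id combinations.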
Alternatively, is there another, more efficient way to accomplish this task? I have a large data.table with a third identifier, "sample", and I need to get all these pairs for each sample (for example, working with DT[sample == "sample_id", ]). I am new to the data.table package and may not be using it in the most efficient way.
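A possible alternative, sketched under assumptions: a non-equi self-join can produce the pairs for every sample in one step, without mapply or per-sample subsetting. The example table, its "s1"/"s2" sample values, and the `Var1`/`Var2` output names are hypothetical; only the column names `sample`, `my_name`, and `my_id` come from the description above.

```r
library(data.table)

# Hypothetical data: the original six rows repeated for two samples.
DT <- data.table(sample  = rep(c("s1", "s2"), each = 6),
                 my_name = rep(c("A", "B", "C", "D", "E", "F"), times = 2),
                 my_id   = rep(c(2, 2, 3, 3, 4, 4), times = 2))

# Self-join: match rows within the same sample where the left-hand my_id
# is strictly smaller than the right-hand my_id. The x. and i. prefixes
# pick the name from each side of the join.
pairs <- DT[DT,
            on = .(sample, my_id < my_id),
            .(sample, Var1 = x.my_name, Var2 = i.my_name),
            nomatch = NULL,          # drop rows that have no smaller-id partner
            allow.cartesian = TRUE]  # the result is larger than either input
```

Because the join condition is `my_id < my_id`, each unordered pair appears once, with the lower id in `Var1`, and names sharing an id are never paired.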