Remove multiple columns from data.table

What is the correct way to delete multiple columns from a data.table? I am currently using the code below, but got unexpected behavior when I accidentally repeated one of the column names. I am not sure whether this is a bug, or whether I should not be deleting columns this way.

 library(data.table)
 DT <- data.table(x = letters, y = letters, z = letters)
 DT[, c("x", "y") := NULL]
 names(DT)
 # [1] "z"

The above works fine, but when "x" is repeated, y is removed as well:

 DT <- data.table(x = letters, y = letters, z = letters)
 DT[, c("x", "x") := NULL]
 names(DT)
 # [1] "z"
+57
r data.table
May 19 '13 at 19:16
2 answers

This looks like a solid, reproducible bug. It has been filed as bug #2791.

It appears that repeating a column name causes the column(s) after it to be removed as well.
If there are no columns left to remove, R crashes.
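
As a workaround that does not depend on the data.table version, one option is to deduplicate the column names before deleting. The following is only a minimal sketch: the `cols` vector is a hypothetical input that accidentally repeats "x", and the parenthesised `(drop) := NULL` form assumes a reasonably recent data.table.

```r
library(data.table)

DT <- data.table(x = letters, y = letters, z = letters)

# Hypothetical input: a vector of column names that accidentally repeats "x".
cols <- c("x", "x", "y")

# Remove repeated names so each column is deleted at most once.
drop <- unique(cols)

# Delete the columns by reference; the parentheses make data.table
# treat `drop` as a variable holding column names.
DT[, (drop) := NULL]

names(DT)
# [1] "z"
```

Because `drop` never contains a repeated name, the duplicated-assignment path is not reached at all.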




UPDATE: now fixed in v1.8.11. From NEWS:

Assigning to the same column twice in the same query is now an error rather than a crash in some circumstances; e.g., DT[, c("B", "B") := NULL] (delete the same column twice by reference). Thanks to Ricardo (#2751) and matt_k (#2791) for reporting. Tests added.
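
To check how an installed version behaves, the duplicated deletion can be wrapped in `tryCatch`. This is just a sketch that assumes data.table v1.8.11 or later; the exact wording of the error message is not guaranteed and may differ between versions.

```r
library(data.table)

DT <- data.table(x = letters, y = letters, z = letters)

# Attempt the duplicated deletion, capturing an error instead of stopping.
res <- tryCatch(
  DT[, c("x", "x") := NULL],
  error = function(e) e
)

# On a version with the fix, `res` should be an error condition.
if (inherits(res, "error")) {
  message("Duplicate assignment rejected: ", conditionMessage(res))
} else {
  message("No error raised; this version may predate the fix.")
}
```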

+32
May 19 '13 at 20:07

This question has already been answered, but consider this a side note.

I prefer the following syntax for deleting multiple columns:

 DT[ ,`:=`(x = NULL, y = NULL)] 

because it is the same as adding multiple columns (variables)

 DT[ ,`:=`(x = letters, y = "Male")] 

It also checks for duplicate column names, so attempting to set x to NULL twice results in an error.
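
A small sketch of both cases with the functional form, assuming a data.table version that includes the duplicate check. `DT2` is just a second illustrative table, and the error is captured with `tryCatch` so the script keeps running.

```r
library(data.table)

DT <- data.table(x = letters, y = letters, z = letters)

# Naming x twice in the functional form; capture the error if one is raised.
res <- tryCatch(
  DT[, `:=`(x = NULL, x = NULL)],
  error = function(e) e
)
inherits(res, "error")

# The same form with distinct names deletes both columns by reference.
DT2 <- data.table(x = letters, y = letters, z = letters)
DT2[, `:=`(x = NULL, y = NULL)]
names(DT2)
# [1] "z"
```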

+10
Oct 02 '15 at 19:42


