Data.table and error handling with try statement

Question

I am trying to include a bit of error handling in my R code.

Pseudocode below:

    foo = function(X, Y) { ... return(ret.df) }
    DT = DT[, ret.df := foo(X, Y), by = key(DT)]

The goal is to check whether foo throws an error for some combination of X and Y. If it does, I want to skip that combination of records in the final resulting data frame. I tried the following without much success:

    DT = DT[ , try(ret.df := foo(X,Y)); if (not (class(ret.df) %in% "try-error")) { return(ret.df); }, by = key(DT) ];

I can always write a wrapper around foo to do the error checking. However, I am looking for a way to write the syntax directly in the data.table call. Is it possible?

Thanks for your help in advance!

+6

r data.table

Manoj Jan 13 '14 at 5:45


2 answers

plyr has a function that is useful here: failwith(). It does what Matt did, but in a more compact and reusable form.

    library(data.table)
    library(plyr)
    foo = function(X,Y) {
      if (any(Y==2)) stop("Y contains 2!")
      X*Y
    }
    DT = data.table(a=1:3, b=1:6)
    DT
    # NA_integer_ (with the trailing underscore) matches the integer type of foo's result
    DT[, c := failwith(NA_integer_, foo)(a,b), by=a ]

failwith takes two arguments: the value to return on error, and the function to modify, f. It returns a new version of f which, instead of throwing an error, returns the default value.

The definition of failwith is pretty simple:

    failwith <- function(default = NULL, f, quiet = FALSE) {
      f <- match.fun(f)
      function(...) {
        try_default(f(...), default, quiet = quiet)
      }
    }
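try_default is a helper inside plyr. To see the idea without plyr, here is a minimal base-R sketch of the same pattern using tryCatch. The name fail_with is made up for this illustration, and this is not plyr's actual implementation (it has no quiet option, for instance).

```r
# Hypothetical stand-in for plyr::failwith, built only on base R.
fail_with <- function(default = NULL, f) {
  f <- match.fun(f)
  function(...) {
    # Return `default` if f(...) signals an error, otherwise f's result.
    tryCatch(f(...), error = function(e) default)
  }
}

foo <- function(X, Y) {
  if (any(Y == 2)) stop("Y contains 2!")
  X * Y
}

safe_foo <- fail_with(NA_integer_, foo)
safe_foo(3L, c(3L, 6L))  # no 2 in Y, so foo's own result comes back
safe_foo(2L, c(2L, 5L))  # foo fails, so the default comes back
```

Because tryCatch swallows the error entirely, nothing is printed when foo fails, which may or may not be what you want while debugging.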

+4

hadley Jan 14 '14 at 13:17


Matt Dowle · Accepted Answer · 2014-01-13T17:48:17+0000

Here's a dummy function and data:

    foo = function(X,Y) {
      if (any(Y==2)) stop("Y contains 2!")
      X*Y
    }
    DT = data.table(a=1:3, b=1:6)
    DT
       a b
    1: 1 1
    2: 2 2
    3: 3 3
    4: 1 4
    5: 2 5
    6: 3 6

Step by step:

    > DT[, c := foo(a,b), by=a ]
    Error in foo(a, b) : Y contains 2!

OK, that's by construction: foo is written to fail when Y contains 2. Good.

Note that, despite the error, column c was still added.

    > DT
       a b  c
    1: 1 1  1
    2: 2 2 NA
    3: 3 3 NA
    4: 1 4  4
    5: 2 5 NA
    6: 3 6 NA

Only the first, successful group has been filled in; it stopped at the second group. This is by design. At some point in the future we could add transactions to data.table internally, as in SQL, so that if an error occurred any changes could be rolled back. Anyway, it's just something to be aware of.
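If the half-filled column is a problem in the meantime, one possible workaround is to run the assignment on a deep copy and keep it only if every group succeeded. This is just a sketch under that assumption, not a data.table feature; the names tmp and ok are made up for the illustration, and copy() doubles the memory used.

```r
library(data.table)

foo <- function(X, Y) {
  if (any(Y == 2)) stop("Y contains 2!")
  X * Y
}
DT <- data.table(a = 1:3, b = 1:6)

# Work on a deep copy so a failure part-way through leaves DT untouched.
tmp <- copy(DT)
ok <- !inherits(try(tmp[, c := foo(a, b), by = a], silent = TRUE), "try-error")

# Keep the modified copy only if every group succeeded.
if (ok) DT <- tmp
names(DT)  # DT has no column c here, because group a == 2 failed
```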

To handle the error, you can wrap the j expression in {}.

First try:

    > DT[, c := { if (inherits(try(ans<-foo(a,b)),"try-error")) NA else ans }, by=a]
    Error in foo(a, b) : Y contains 2!
    Error in `[.data.table`(DT, , `:=`(c, { :
      Type of RHS ('logical') must match LHS ('integer'). To check and coerce would impact performance too much for the fastest cases. Either change the type of the target column, or coerce the RHS of := yourself (eg by using 1L instead of 1)

The error tells us what to do. Let's coerce the type of the RHS (NA) from logical to integer.

    > DT[, c:= { if (inherits(try(ans<-foo(a,b)),"try-error")) NA_integer_ else ans }, by=a]
    Error in foo(a, b) : Y contains 2!

Better, the long error has gone. But why is there still an error from foo? Let's look at DT just to check.

    > DT
       a b  c
    1: 1 1  1
    2: 2 2 NA
    3: 3 3  9
    4: 1 4  4
    5: 2 5 NA
    6: 3 6 18

Oh, so it has worked. The third group ran, and the values 9 and 18 appear in rows 3 and 6. The remaining message is printed by try itself, which reports the error unless told not to. Looking at ?try reveals the silent argument.

    > DT[, c:= { if (inherits(try(ans<-foo(a,b),silent=TRUE),"try-error")) NA_integer_ else ans }, by=a]
    > # no errors
    > DT
       a b  c
    1: 1 1  1
    2: 2 2 NA
    3: 3 3  9
    4: 1 4  4
    5: 2 5 NA
    6: 3 6 18

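The question asked for the failing combinations to be skipped rather than filled with NA. A minimal sketch of one way to do that, assuming a new result table is acceptable instead of adding a column by reference: leave out := and let j return NULL for a failing group, since data.table drops groups whose j result is NULL. The name res is made up for the illustration.

```r
library(data.table)

foo <- function(X, Y) {
  if (any(Y == 2)) stop("Y contains 2!")
  X * Y
}
DT <- data.table(a = 1:3, b = 1:6)

# No := here: j builds a new table, and a group whose j is NULL is dropped.
res <- DT[, {
  ans <- tryCatch(foo(a, b), error = function(e) NULL)
  if (!is.null(ans)) list(b = b, c = ans)
}, by = a]
res  # contains only the groups a == 1 and a == 3
```

tryCatch is used here instead of try only because it makes the NULL fallback a one-liner; the try(..., silent=TRUE) form with inherits() works just as well.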