Use Cook's Distance
You could use Cook's distance. It is computed from a linear regression model, which means you can include several X variables when identifying outliers (more precisely, observations with high influence). In effect, you can add or drop the variables against which you want deviations to be judged. One way to compute it for each observation in R looks something like this:
mod <- lm(Y ~ X1 + X2 + X3, data=inputData)
cooksd <- cooks.distance(mod)
As a rule of thumb, observations with a Cook's distance greater than 4 times the mean Cook's distance are considered outliers. For more information on the formula and interpretation of Cook's distance, see this example.
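To make the cutoff concrete, here is a minimal sketch that applies the 4 × mean rule and lists the flagged rows. It assumes the built-in `mtcars` dataset as a stand-in for `inputData`, with `mpg`, `wt`, `hp` and `disp` playing the roles of Y, X1, X2 and X3. Those choices are mine for illustration and are not part of the original answer.

```r
# Built-in mtcars stands in for inputData (an assumption for illustration).
mod <- lm(mpg ~ wt + hp + disp, data = mtcars)
cooksd <- cooks.distance(mod)

# Rule of thumb from the text: flag points above 4 times the mean Cook's distance.
threshold <- 4 * mean(cooksd, na.rm = TRUE)
influential <- which(cooksd > threshold)

# Collect the flagged rows alongside their Cook's distance for inspection.
flagged <- cbind(mtcars[influential, c("mpg", "wt", "hp", "disp")],
                 cooksd = cooksd[influential])
print(flagged)
```

The flagged rows are candidates to inspect, not rows to delete automatically. The 4 × mean cutoff is only a convention, so it is worth checking whether each flagged point is a data error or a genuine extreme value.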
Disclaimer: I am the author.
Selva