Prediction with lme4 on new levels

I am trying to fit a mixed effects model and then use it to generate predictions for a new dataset, which may contain different levels. I expected the predictions for the new dataset to use the mean of the estimated parameters, but this does not seem to be the case. Here is a minimal working example:

    library(lme4)

    d = data.frame(x = rep(1:10, times = 3), y = NA, grp = rep(1:3, each = 10))
    d$y[d$grp == 1] = 1:10 + rnorm(10)
    d$y[d$grp == 2] = 1:10 * 1.5 + rnorm(10)
    d$y[d$grp == 3] = 1:10 * 0.5 + rnorm(10)

    fit = lmer(y ~ (1+x)|grp, data = d)
    newdata = data.frame(x = 1:10, grp = 4)
    predict(fit, newdata = newdata, allow.new.levels = TRUE)

In this example, I essentially define three groups with different regression equations (slopes of 1, 1.5, and 0.5). However, when I try to predict on a new dataset with an unseen level, I get a constant estimate. I would have expected the expected values of the slope and intercept to be used to generate predictions for this new data. Am I expecting the wrong thing? Or what am I doing wrong in my code?

2 answers

I would generally not include a random slope without also including a fixed slope. It seems that predict.merMod agrees with me, because it simply uses only the fixed effects to predict for new levels. The documentation says "the prediction will use the unconditional (population-level) values for data with previously unobserved levels", but these values do not seem to be estimated with your model specification.
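To see why the question's model gives a constant prediction, it may help to look at its fixed effects. The sketch below rebuilds data like the question's (the seed and the names slopes, fit0, and p0 are my own assumptions, added only for illustration) and checks that the only fixed effect is the intercept, so every prediction for an unseen level collapses to that single number:

```r
library(lme4)

set.seed(1)  # assumed seed, only so the sketch is reproducible
d <- data.frame(x = rep(1:10, times = 3), grp = rep(1:3, each = 10))
slopes <- c(1, 1.5, 0.5)                  # per-group slopes, as in the question
d$y <- slopes[d$grp] * d$x + rnorm(30)

# Model from the question: random intercept and slope, but no fixed slope
fit0 <- lmer(y ~ (1 + x) | grp, data = d)

# The fixed-effects part contains only "(Intercept)"
fixef(fit0)

newdata <- data.frame(x = 1:10, grp = 4)
p0 <- predict(fit0, newdata = newdata, allow.new.levels = TRUE)

# Every prediction for the unseen level equals the fixed intercept
all.equal(unname(p0), rep(unname(fixef(fit0)), 10))
```

With so few groups, lmer may emit singular-fit or convergence messages here; they do not change the point being illustrated.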

So I propose this model:

    fit = lmer(y ~ x + (x|grp), data = d)
    newdata = data.frame(x = 1:10, grp = 4)
    predict(fit, newdata = newdata, allow.new.levels = TRUE)
    #        1         2         3         4         5         6         7         8         9        10
    # 1.210219  2.200685  3.191150  4.181616  5.172082  6.162547  7.153013  8.143479  9.133945 10.124410

This is the same as using only the fixed-effects part of the model:

    t(cbind(1, newdata$x) %*% fixef(fit))
    #         [,1]     [,2]    [,3]     [,4]     [,5]     [,6]     [,7]     [,8]     [,9]    [,10]
    #[1,] 1.210219 2.200685 3.19115 4.181616 5.172082 6.162547 7.153013 8.143479 9.133945 10.12441
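If the goal is just the fixed-effects prediction, predict.merMod also has an re.form argument, and re.form = NA drops all random effects. The following is a minimal sketch rather than part of the original answer; the seed, the data construction, and the names p_new, p_fixed, and p_manual are assumptions of mine. It checks that the three routes agree:

```r
library(lme4)

set.seed(1)  # assumed seed, only so the sketch is reproducible
d <- data.frame(x = rep(1:10, times = 3), grp = rep(1:3, each = 10))
slopes <- c(1, 1.5, 0.5)
d$y <- slopes[d$grp] * d$x + rnorm(30)

# Model with a fixed slope plus random intercept and slope
fit <- lmer(y ~ x + (x | grp), data = d)
newdata <- data.frame(x = 1:10, grp = 4)

# Route 1: unseen level, random effects treated as zero
p_new <- predict(fit, newdata = newdata, allow.new.levels = TRUE)

# Route 2: explicitly ignore all random effects
p_fixed <- predict(fit, newdata = newdata, re.form = NA)

# Route 3: fixed-effects design matrix times the fixed-effects estimates
p_manual <- drop(model.matrix(~ x, newdata) %*% fixef(fit))

all.equal(unname(p_new), unname(p_fixed))
all.equal(unname(p_new), unname(p_manual))
```

Using re.form = NA makes the intent explicit in the call, which may be easier to read than relying on the behaviour for new levels.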

This might not be clear enough, but I think the documentation for ?predict.merMod states (reasonably) clearly what happens when allow.new.levels=TRUE. I guess the ambiguity might be in what "unconditional (population-level) values" means ...

allow.new.levels: logical; whether new levels (or NA values) in newdata are allowed. If FALSE (default), such new values in newdata will trigger an error; if TRUE, then the prediction will use the unconditional (population-level) values for data with previously unobserved levels (or NAs).
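One way to read "unconditional (population-level) values" is that the random effects are set to zero for an unseen level, whereas an observed level also gets its own estimated deviations. The sketch below is only an illustration under that reading; the seed, the data construction, and the names cf, p_both, cond_5, and pop_5 are hypothetical. It compares a known group with a new one at the same x:

```r
library(lme4)

set.seed(1)  # assumed seed, only so the sketch is reproducible
d <- data.frame(x = rep(1:10, times = 3), grp = rep(1:3, each = 10))
slopes <- c(1, 1.5, 0.5)
d$y <- slopes[d$grp] * d$x + rnorm(30)

fit <- lmer(y ~ x + (x | grp), data = d)

# Per-group coefficients: fixed effects plus each group's conditional modes
cf <- coef(fit)$grp

# Same x, one observed level (1) and one unseen level (4)
both <- data.frame(x = c(5, 5), grp = c(1, 4))
p_both <- predict(fit, newdata = both, allow.new.levels = TRUE)

# Observed level: conditional prediction from that group's own coefficients
cond_5 <- cf["1", "(Intercept)"] + cf["1", "x"] * 5

# Unseen level: population-level prediction from the fixed effects alone
pop_5 <- unname(fixef(fit)["(Intercept)"] + fixef(fit)["x"] * 5)

all.equal(unname(p_both[1]), cond_5)
all.equal(unname(p_both[2]), pop_5)
```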


Source: https://habr.com/ru/post/1216106/

