Do I need to set refit=FALSE when testing random effects in lmer() models with anova()?

I am currently testing whether to include certain random effects in my lmer model. I use the anova() function for this. My procedure so far is to fit the model with lmer() using REML=TRUE (the default), and then call anova() on two models, one of which includes the random effect under test and one of which does not. However, it is well known that anova() refits the models using ML; in newer versions of lme4 you can prevent anova() from doing this by setting refit=FALSE. Should I set refit=FALSE in my call to anova() when testing random effects, or not? (If I set refit=FALSE, the p-values tend to be lower. Are the p-values anti-conservative when setting refit=FALSE?)

Method 1:

    mod0_reml <- lmer(x ~ y + z + (1 | w), data=dat)
    mod1_reml <- lmer(x ~ y + z + (y | w), data=dat)
    anova(mod0_reml, mod1_reml)

This will cause anova() to refit the models using ML instead of REML. (Newer versions of the anova() function also print a message to this effect.)

Method 2:

    mod0_reml <- lmer(x ~ y + z + (1 | w), data=dat)
    mod1_reml <- lmer(x ~ y + z + (y | w), data=dat)
    anova(mod0_reml, mod1_reml, refit=FALSE)

This will cause anova() to perform its calculations on the original models, i.e. fitted with REML=TRUE.
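A quick way to check which criterion is in play is via lme4's isREML() and refitML() helpers (a small sketch, assuming the models above have been fitted):

    ## sketch: the models were fitted with REML (the lmer default) ...
    isREML(mod0_reml)           ## TRUE
    ## ... and refitML() performs essentially the ML refit that anova()
    ## applies by default (i.e. when refit=TRUE)
    isREML(refitML(mod0_reml))  ## FALSE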

Which of the two methods is the correct one for checking whether a random effect should be included?

Thanks for any help

1 answer

In general, I would say that it is advisable to use refit=FALSE in this case, but let's go ahead and try a small simulation experiment.

First, fit the model without the random slope to the sleepstudy data set, then simulate data from this model:

    library(lme4)
    mod0 <- lmer(Reaction ~ Days + (1|Subject), data=sleepstudy)
    ## also fit the full model for later use
    mod1 <- lmer(Reaction ~ Days + (Days|Subject), data=sleepstudy)
    set.seed(101)
    simdat <- simulate(mod0, 1000)
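For orientation, simulate() returns a data frame with one simulated response vector per column (a quick check; sleepstudy has 180 observations):

    dim(simdat)         ## 180 x 1000: one column per simulated response
    names(simdat)[1:3]  ## "sim_1" "sim_2" "sim_3"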

Now refit both the full and the reduced model to the null data and record the distribution of p-values produced by anova(), with and without refit=FALSE. This is essentially a parametric bootstrap of the null-hypothesis test; we want to see whether it has the appropriate characteristics (i.e., a uniform distribution of p-values).

    sumfun <- function(x) {
        ## refit both models to a simulated response vector x
        m0 <- refit(mod0, x)
        m1 <- refit(mod1, x)
        ## p-value with the default ML refit ...
        a_refit <- suppressMessages(anova(m0, m1)["m1", "Pr(>Chisq)"])
        ## ... and with the comparison done on the REML fits
        a_no_refit <- anova(m0, m1, refit=FALSE)["m1", "Pr(>Chisq)"]
        c(refit=a_refit, no_refit=a_no_refit)
    }

I like plyr::laply for its convenience, although you could just as easily use a for loop or one of the other *apply approaches (a base-R alternative is sketched after the plotting code below).

    library(plyr)
    pdist <- laply(simdat, sumfun, .progress="text")
    library(ggplot2); theme_set(theme_bw())
    library(reshape2)
    ggplot(melt(pdist), aes(x=value, fill=Var2)) +
        geom_histogram(aes(y=..density..),
                       alpha=0.5, position="identity", binwidth=0.02) +
        geom_hline(yintercept=1, lty=2)
    ggsave("nullhist.png", height=4, width=5)
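If you prefer to avoid the plyr dependency, the same summary step can be done in base R (a rough equivalent, without the progress bar; sapply() iterates over the columns of the simdat data frame):

    ## base-R alternative to plyr::laply
    pdist <- t(sapply(simdat, sumfun))  ## 1000 x 2, columns "refit" and "no_refit"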

[figure: histogram of the null distributions of p-values, with and without refit]

Type I error rate for alpha = 0.05:

    colMeans(pdist < 0.05)
    ##    refit no_refit
    ##    0.021    0.026

You can see that in this case the two procedures give essentially the same answer, and both are very conservative, for well-known reasons: the null value of the hypothesis test (zero variance) lies on the boundary of the parameter's feasible space. For the particular case of testing a single simple random effect, halving the p-value gives an appropriate correction (see Pinheiro and Bates 2000 and others); in fact it gives reasonable answers here too, although it is not entirely justified, because here we are dropping two random-effects parameters (the random-slope variance and the correlation between the random slope and the random intercept):

    colMeans(pdist/2 < 0.05)
    ##    refit no_refit
    ##    0.051    0.055
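Note that with 1000 simulations there is non-negligible Monte Carlo error in these estimated rates; a quick way to gauge it (a sketch, using the halved refit p-values):

    ## exact binomial CI for the estimated type I error rate
    binom.test(sum(pdist[, "refit"]/2 < 0.05), nrow(pdist))$conf.int
    ## for 51 rejections out of 1000, roughly (0.038, 0.067)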

Other points:

  • You can perform a similar exercise using the PBmodcomp function from the pbkrtest package.
  • The RLRsim package is designed specifically for fast simulation-based (parametric bootstrap) tests of null hypotheses about random-effects terms, but it does not seem to work in this somewhat more complicated situation.
  • see the relevant GLMM FAQ section for similar information, including arguments for why you might not want to test the significance of random effects at all ...
  • for extra credit, you could re-run the parametric bootstrap using the deviance differences (-2 log-likelihood) rather than the p-values as output, and check whether the results match a mixture between a chi^2_0 (a point mass at 0) and a chi^2_n (where n is probably 2, although I wouldn't be sure about that geometry); a sketch follows this list.
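As a starting point for that last exercise, here is a minimal sketch of the mixture comparison, assuming a 50:50 mixture of chi^2_0 and chi^2_2 (both the weights and the degrees of freedom are assumptions, not verified results for this geometry):

    ## sketch: observed deviance difference between the ML fits
    dev_diff <- as.numeric(2 * (logLik(refitML(mod1)) - logLik(refitML(mod0))))
    ## assumed reference distribution: 0.5*chi^2_0 + 0.5*chi^2_2;
    ## the chi^2_0 point mass contributes nothing for dev_diff > 0
    p_mix <- 0.5 * pchisq(dev_diff, df=2, lower.tail=FALSE)
    p_mix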