Post-hoc test for glmer

I am analyzing a binomial dataset in R using a generalized linear mixed model (glmer, lme4 package). I want to make pairwise comparisons of a specific fixed effect ("Sound") using a Tukey post-hoc test (glht, multcomp package).

Most of it is working fine, but one level of my fixed-effect variable ("SoundC") has no variance at all (96 times "1" and not a single "0"), and it seems the Tukey test can't handle this. All pairwise comparisons involving "SoundC" get a p-value of 1.000, even though some of them are clearly significant.

As a check, I changed one of the 96 "1"s to a "0", after which I got normal p-values again and the significant differences where I expected them, even though the data had barely changed.

Does anyone have a solution? If not, can I use the results from my modified dataset and report the manual change?

Reproducible example:

Response <- c(1,0,1,1,0,1,1,0,1,1,0,1,1,0,1,1,1,1,0,
              0,1,1,0,1,1,0,1,1,0,1,1,0,1,1,0,1,1,0,1,1,0,
              1,1,0,1,1,0,1,1,1,1,0,0,1,1,0,1,1,0,1,1,0,1,1,0,1)
Data <- data.frame(Sound=rep(paste0('Sound',c('A','B','C')),22),
                   Response,
                   Individual=rep(rep(c('A','B'),2),rep(c(18,15),2)))
# Visual check
boxplot(Response ~ Sound, Data)
# Mixed model
library(lme4)
model10 <- glmer(Response ~ Sound + (1|Individual), Data, family=binomial)
# Post-hoc analysis
library(multcomp)
summary(glht(model10, mcp(Sound="Tukey")))
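
For reference, a quick cross-tabulation of this example data shows the same pattern as my full dataset, with SoundC containing only "1" responses:

# Tabulate Response within each Sound level
with(Data, table(Sound, Response))
# In this example data: SoundA has 2 zeros / 20 ones, SoundB has 20 zeros / 2 ones,
# and SoundC has 0 zeros / 22 ones
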
1 answer

This is verging on a CrossValidated question; you are definitely seeing complete separation, where your response is perfectly split into 0 vs 1 outcomes by a predictor. This leads to (1) infinite parameter estimates (they are only reported as finite values because of computational imperfections) and (2) crazy/useless Wald standard errors and corresponding $p$-values (which is what you are seeing here). Discussion and solutions are given here, here, and here, but I will illustrate a little below.
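
One quick way to see the symptom in your original fit is to look at the Wald coefficient table (the exact numbers will depend on your lme4 version and optimizer), where the SoundC estimate and especially its standard error blow up:

# Wald coefficient table of the separated fit; expect an enormous estimate
# and an even larger standard error in the SoundC row
coef(summary(model10))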

To put on my statistician hat for a moment: you really shouldn't be trying to fit a random effect with only 3 levels anyway (see, e.g., http://glmm.wikidot.com/faq) ...

Penalized logistic regression with Firth correction:

library(logistf)
L1 <- logistf(Response~Sound*Individual, data=Data,
              contrasts.arg=list(Sound="contr.treatment",
                                 Individual="contr.sum"))
##                                  coef se(coef)            p
## (Intercept)              3.218876e+00 1.501111 2.051613e-04
## SoundSoundB             -4.653960e+00 1.670282 1.736123e-05
## SoundSoundC             -1.753527e-15 2.122891 1.000000e+00
## IndividualB             -1.995100e+00 1.680103 1.516838e-01
## SoundSoundB:IndividualB  3.856625e-01 2.379919 8.657348e-01
## SoundSoundC:IndividualB  1.820747e+00 2.716770 4.824847e-01

The standard errors and p-values are now reasonable (the p-value for the A vs C comparison is 1 because there is literally zero difference between them ...).
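
If you also want the B vs C comparison from the Firth fit, one simple (if clunky) option is to refit with a different reference level; this sketch assumes Sound has been converted to a factor, and note that, unlike the Tukey output, these p-values are not adjusted for multiple comparisons:

# Refit with SoundB as the reference level to read off the B vs C contrast
# directly (assumes Sound is a factor; p-values not adjusted for multiplicity)
Data$Sound <- factor(Data$Sound)
L2 <- logistf(Response ~ relevel(Sound, ref = "SoundB") * Individual, data = Data)
summary(L2)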

Bayesian mixed model with weak priors:

library(blme)
model20 <- bglmer(Response~Sound + (1|Individual), Data, family=binomial,
                  fixef.prior = normal(cov = diag(9,3)))
##              Estimate Std. Error    z value     Pr(>|z|)
## (Intercept)  1.711485   2.233667  0.7662221 4.435441e-01
## SoundSoundB -5.088002   1.248969 -4.0737620 4.625976e-05
## SoundSoundC  2.453988   1.701674  1.4421024 1.492735e-01

The diag(9,3) specification of the fixed-effect variance-covariance matrix gives

$$\left(\begin{array}{ccc} 9 & 0 & 0 \\ 0 & 9 & 0 \\ 0 & 0 & 9 \end{array}\right)$$

In other words, the 3 sets the dimension of the matrix (equal to the number of fixed-effect parameters), and the 9 sets the variance; this corresponds to a standard deviation of 3, or a 95% range of about $\pm 6$, which is quite large/weak/uninformative on the logit scale.
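
As a quick sanity check of that scale:

# Variance 9 corresponds to standard deviation 3, i.e. a central 95% prior
# range of roughly +/- 6 on the logit scale
sqrt(9)                       # 3
qnorm(c(0.025, 0.975), 0, 3)  # approximately -5.88 and 5.88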

The results agree reasonably well with the penalized fit above (even though the model is quite different):

library(multcomp)
summary(glht(model20, mcp(Sound="Tukey")))
##                      Estimate Std. Error z value Pr(>|z|)
## SoundB - SoundA == 0   -5.088      1.249  -4.074 0.000124 ***
## SoundC - SoundA == 0    2.454      1.702   1.442 0.309216
## SoundC - SoundB == 0    7.542      1.997   3.776 0.000397 ***

As I said above, I wouldn't recommend a mixed model in this case anyway ...
