Shuffle data before fitting a glm, then repeat x times

I have some data containing a group variable (0/1) and an individual rating for approximately 2000 people. The data set looks something like this:

ID group score  
A1 1 3.5  
A2 1 3.2  
A3 0 2.8  
A4 0 2.5  

I want to check whether the group variable can be predicted by score, and used the following in R:

glm(group~score,family=binomial)

Now I would like to test my p-value by shuffling the group variable and then fitting the glm again. I would like to do this at least 10,000 times, possibly more, and each time write the p-value to a file so that there is one line per permutation. I looked at sample(), but I am struggling to combine it with glm() and to output only the p-value. In the script, I would like to be able to easily change the number of permutations, and to change the glm formula if I want to add covariates.

Thanks for the help!

1 answer

You are on the right track.

Example (I added another row to suppress warnings about "fitted probabilities numerically 0 or 1 occurred"):

ex <- read.table(textConnection(
"ID group score  
A1 1 3.5  
A2 1 3.2  
A3 0 2.8  
A4 0 2.5
A5 1 2.4"),header=TRUE)

g0 <- glm(group~score,data=ex,family=binomial)

Define a function that computes the p-value for one permuted data set, then run it repeatedly with replicate():

pvalfun <- function() {
   g <- update(g0,data=transform(ex,group=sample(group)))
   coef(summary(g))["score","Pr(>|z|)"]
}
res <- replicate(1000,pvalfun())

Or, to get a progress bar, use raply() from plyr:

library(plyr)
res <- raply(1000,pvalfun(),.progress="text")

Alternatively, the glmperm package provides a ready-made permutation test:

library(glmperm)
ptest2 <- prr.test(group~score,"score",data=ex,family=binomial)
summary(ptest2)
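
To get one p-value per line in a file, and to make the number of permutations and the formula easy to change, the replicate() approach above could be wrapped roughly as follows. This is only a sketch: perm_pvalues(), its argument names, and the temporary output file are illustrative names that do not appear in the original code. It assumes the response is the first variable in the formula, and it suppresses glm warnings only because the demo data set is tiny.

```r
# Hypothetical helper (name and arguments are illustrative, not from any package).
# Shuffles the response, refits the glm, and returns one p-value per permutation.
perm_pvalues <- function(formula, data, term, n_perm = 1000,
                         family = binomial, outfile = NULL) {
  response <- all.vars(formula)[1]   # assumes the response is the first variable
  one_perm <- function() {
    shuffled <- data
    shuffled[[response]] <- sample(shuffled[[response]])
    # Warnings are suppressed here only because the demo data are tiny
    fit <- suppressWarnings(glm(formula, data = shuffled, family = family))
    coef(summary(fit))[term, "Pr(>|z|)"]
  }
  pvals <- replicate(n_perm, one_perm())
  # One p-value per line, as asked for in the question
  if (!is.null(outfile)) write(pvals, file = outfile, ncolumns = 1)
  pvals
}

# Same toy data as in the example above
ex <- data.frame(ID    = c("A1", "A2", "A3", "A4", "A5"),
                 group = c(1, 1, 0, 0, 1),
                 score = c(3.5, 3.2, 2.8, 2.5, 2.4))

set.seed(1)                          # for reproducible shuffles
out   <- tempfile(fileext = ".txt")  # placeholder output path
pvals <- perm_pvalues(group ~ score, ex, term = "score",
                      n_perm = 200, outfile = out)

# Observed p-value from the unshuffled data
g0    <- suppressWarnings(glm(group ~ score, data = ex, family = binomial))
p_obs <- coef(summary(g0))["score", "Pr(>|z|)"]

# One common way to summarise: share of permuted p-values at least as small
# as the observed one (with the usual +1 correction)
p_perm <- (1 + sum(pvals <= p_obs)) / (1 + length(pvals))
```

To add covariates, change the formula (for example group ~ score + age, where age is a hypothetical column) and keep term set to the coefficient of interest; to run more permutations, change n_perm.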