I'm not sure this will get you all the way there, but I think it will go a long way. Basically, you need to determine whether the raters agree for each of your agreement criteria. That is really not such a big deal: for Cohen's kappa, the raters either agree or they don't.
Start by creating your data:
testdata <- structure(list(A = c(2,2,0,0,0,0,0,0,0),
                           B = c(0,0,0,0,1,0,1,0,2),
                           C = c(0,0,0,0,1,0,0,2,0),
                           D = c(0,0,2,0,0,2,1,0,0),
                           E = c(0,0,0,2,0,0,0,0,0)),
                      row.names = c(NA, 9L), class = "data.frame")
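For reference, that gives a 9 x 5 table of counts, one row per subject and one column per rating category, with each row summing to 2 (one count per rater):
testdata
  A B C D E
1 2 0 0 0 0
2 2 0 0 0 0
3 0 0 0 2 0
4 0 0 0 0 2
5 0 1 1 0 0
6 0 0 0 2 0
7 0 1 0 1 0
8 0 0 2 0 0
9 0 2 0 0 0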
To calculate kappa, we will use the irr package:
library(irr)
The kappa2 function in irr takes an n x 2 data frame or matrix (one row per subject, one column per rater) and returns the kappa calculation. Your data are in a different format, so we need to convert them into something kappa2 can handle. If your data are already in that format, this will be much easier for you.
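Just to illustrate the target layout, here is a minimal sketch with made-up ratings (the values are hypothetical; only the shape matters):
# hypothetical example of the n x 2 layout kappa2 expects:
# one row per subject, one column per rater
example_ratings <- data.frame(R1 = c("A","B","B"),
                              R2 = c("A","B","C"),
                              stringsAsFactors = FALSE)
# kappa2(ratings = example_ratings) would run on data shaped like this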
First, I create a new data frame to hold the restructured data.
new_testdata <- data.frame(R1="",R2="",stringsAsFactors=FALSE)
Now a simple loop goes over each row and builds a vector with each rater's rating. Obviously these are not the actual assignments; the code simply gives the alphabetically first category to the first rater. That does not matter here, since we only care about agreement, but hopefully you have the complete data anyway.
for(x in 1:nrow(testdata)) {
  # turn each row of counts into the two category labels it represents
  new_testdata <- rbind(new_testdata, rep(names(testdata), unlist(testdata[x,])))
}
rm(x)
new_testdata <- new_testdata[-1,]   # drop the empty placeholder row
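As an aside, if you prefer to avoid the explicit loop, the same reshaping can be done with apply(); this is just a sketch of an equivalent approach that relies on the same assumption about rater order:
# loop-free version of the reshaping above: each row of counts becomes
# a length-2 vector of category names, then the results are transposed
new_testdata <- as.data.frame(t(apply(testdata, 1, function(r) rep(names(r), r))),
                              stringsAsFactors = FALSE)
names(new_testdata) <- c("R1", "R2")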
Now we can get the regular kappa.
kappa2(ratings=new_testdata)
 Cohen's Kappa for 2 Raters (Weights: unweighted)
 Subjects = 9
   Raters = 2
    Kappa = 0.723
        z = 4.56
  p-value = 5.23e-06
Now you want a second kappa, where a disagreement of one level is not counted as a disagreement. That's no problem; basically, you need to convert what is in new_testdata into a binary representation of agreement or disagreement. In this case it should not affect the kappa. (It would, however, affect the kappa if your raters only had two rating levels to begin with, since the value would then be artificially skewed.)
To get started, create a lookup table that converts the letters to numbers. It will make our lives easier.
convtable <- data.frame(old=c("A","B","C","D","E"), new=c(1,2,3,4,5), stringsAsFactors=FALSE)
Now we can use it to convert the values in new_testdata to numeric representations.
new_testdata$R1 <- convtable$new[match(new_testdata$R1, convtable$old)]
new_testdata$R2 <- convtable$new[match(new_testdata$R2, convtable$old)]
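Since the categories here happen to be the letters A through E, the same conversion could also be done without a lookup table; this is just an equivalent alternative to the two lines above:
# equivalent to the convtable approach, assuming the categories are exactly A-E
new_testdata$R1 <- match(new_testdata$R1, LETTERS[1:5])
new_testdata$R2 <- match(new_testdata$R2, LETTERS[1:5])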
We can easily check for agreement by simply taking the absolute difference between the two columns.
new_testdata$diff <- abs(new_testdata$R1-new_testdata$R2)
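A quick look at the distribution of those differences shows what we are dealing with; for this sample data it should give something like:
table(new_testdata$diff)
0 1 2 
7 1 1 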
Then simply recode R1 and R2 to 1 and 1 wherever the ratings meet your agreement criterion (a difference of at most one level between the two ratings), and to 1 and 0 (or 0 and 1) otherwise.
new_testdata[new_testdata$diff<=1, c("R1","R2")] <- c(1,1)
new_testdata[new_testdata$diff>1, c("R1","R2")] <- c(1,0)
new_testdata <- new_testdata[1:2]
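The same recoding can also be written without the two subset assignments; this is just a sketch of an equivalent way to express it:
# equivalent recoding: rater 1 is always 1, and rater 2 is 1 only where the
# agreement criterion (a difference of at most one level) is met
agree <- new_testdata$diff <= 1
new_testdata$R1 <- 1
new_testdata$R2 <- ifelse(agree, 1, 0)
new_testdata$diff <- NULL   # drop the helper column again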
Now just run the kappa again.
kappa2(ratings=new_testdata)
 Cohen's Kappa for 2 Raters (Weights: unweighted)
 Subjects = 9
   Raters = 2
    Kappa = 0
        z = NaN
  p-value = NaN
What happened? Well, with agreement defined as within +/- 1 level, the data you gave me are almost completely in agreement. There are also some methodological problems that can arise when kappa is run on a binary response variable, as discussed in the CrossValidated post I linked. If your data are less "homogeneous" than the sample data, you should get an actual kappa value rather than a degenerate zero like this. Sorting that out is a question of methods rather than code, though, so you may want to take it to CrossValidated.