Use R to calculate Cohen's kappa for a categorical rating, but within a tolerance range?

I have a number of ratings (categorical, with no fewer than 12 levels) from 2 independent raters. I would like to calculate inter-rater reliability, but treating ratings within one level of each other as agreement. That is, Level 1 vs Level 2 would count as agreement, but Level 1 vs Level 3 would not. I do not want to use a measure such as a correlation coefficient, because what matters is whether the ratings are within one level of each other or not. Can this be done?

Edit to include sample data: each cell gives the number of raters (max = 2) who assigned that rating (columns A to E). For example, in the first row both raters chose A, while in row 5 one rater chose B and the other chose C.

 structure(list(A = c(2, 2, 0, 0, 0, 0, 0, 0, 0),
                B = c(0, 0, 0, 0, 1, 0, 1, 0, 2),
                C = c(0, 0, 0, 0, 1, 0, 0, 2, 0),
                D = c(0, 0, 2, 0, 0, 2, 1, 0, 0),
                E = c(0, 0, 0, 2, 0, 0, 0, 0, 0)),
           row.names = c(NA, 9L), class = "data.frame")

1 answer

Well, I'm not sure this will do exactly what you want, but I think it will get you most of the way there. Basically, you need to measure agreement between the raters under different criteria for what counts as agreement. That really isn't a big deal: for Cohen's kappa purposes, the raters either agree or they don't.

Start by creating your data:

 testdata <- structure(list(A = c(2, 2, 0, 0, 0, 0, 0, 0, 0),
                            B = c(0, 0, 0, 0, 1, 0, 1, 0, 2),
                            C = c(0, 0, 0, 0, 1, 0, 0, 2, 0),
                            D = c(0, 0, 2, 0, 0, 2, 1, 0, 0),
                            E = c(0, 0, 0, 2, 0, 0, 0, 0, 0)),
                       row.names = c(NA, 9L), class = "data.frame")

To calculate kappa, we will use the irr package:

 library(irr) 

The kappa2 function in irr takes an n x 2 data frame or matrix (one row per subject, one column per rater) and returns the kappa. Your data are in a different format, so we need to convert them into something kappa2 can handle. If your data are already in that format, this will be much easier.
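For illustration only (these are made-up ratings, not taken from the question's data), the shape kappa2 expects looks like this:

 # Hypothetical example of the format kappa2() expects:
 # one row per subject, one column per rater.
 example_ratings <- data.frame(R1 = c("A", "B", "C", "D"),
                               R2 = c("A", "C", "C", "D"))
 # kappa2(example_ratings) would then work directly.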

First, I start by creating a new data frame to hold the restructured ratings.

 new_testdata <- data.frame(R1 = "", R2 = "", stringsAsFactors = FALSE)  # placeholder row, dropped after the loop

Now a simple loop goes through each row and builds a vector with each rater's rating. Obviously, these are not necessarily the ratings each individual rater actually assigned; the code simply assigns them to the two raters in a fixed order. That does not matter in this particular case, since we only care about agreement, but hopefully you have the complete data.

 for (x in 1:dim(testdata)[1]) {
   new_testdata <- rbind(new_testdata, rep(names(testdata), unlist(testdata[x, ])))
 }
 rm(x)
 new_testdata <- new_testdata[-1, ]  # Drop the first, empty row
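As a quick sanity check, with the sample data new_testdata should now look roughly like this (one row per subject, one column per rater):

 new_testdata
 # Expected, roughly:
 #   R1: A A D E B D B C B
 #   R2: A A D E C D D C B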

Now we can get the regular kappa.

 kappa2(ratings=new_testdata)
  Cohen's Kappa for 2 Raters (Weights: unweighted)
  Subjects = 9
    Raters = 2
     Kappa = 0.723
         z = 4.56
   p-value = 5.23e-06

Now you want another kappa in which a one-level disagreement is not counted as a problem. That's not an issue; basically, what you need to do is convert what is in new_testdata into a binary representation of agreement versus disagreement. In this case, that should not bias the kappa. (It would, however, affect the kappa if your raters only had two levels to begin with, since the value would then be artificially distorted.)

To get started, create a lookup table that converts the letters to numbers; it will make our lives easier.

 convtable <- data.frame(old=c("A","B","C","D","E"), new=c(1,2,3,4,5), stringsAsFactors=FALSE) 

Now we can use it to convert the values in new_testdata to numeric representations.

 new_testdata$R1 <- convtable$new[match(new_testdata$R1, convtable$old)]
 new_testdata$R2 <- convtable$new[match(new_testdata$R2, convtable$old)]

We can then check agreement by simply taking the absolute difference between the two columns.

 new_testdata$diff <- abs(new_testdata$R1-new_testdata$R2) 
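With the sample data, the diff column should come out roughly as below; only one subject differs by more than one level:

 new_testdata$diff
 # Expected, roughly: 0 0 0 0 1 0 2 0 0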

Then simply recode R1 and R2 to 1 and 1 wherever the pair meets your agreement criterion (a difference of no more than one level between the two ratings), and to 1 and 0 (or 0 and 1) otherwise.

 new_testdata[new_testdata$diff <= 1, c("R1", "R2")] <- list(1, 1)
 new_testdata[new_testdata$diff > 1, c("R1", "R2")] <- list(1, 0)  # list() fills each column separately
 new_testdata <- new_testdata[1:2]  # Drop the difference variable

Now just run your kappa again.

 kappa2(ratings=new_testdata)
  Cohen's Kappa for 2 Raters (Weights: unweighted)
  Subjects = 9
    Raters = 2
     Kappa = 0
         z = NaN
   p-value = NaN

What happened? Well, the data you gave me are basically in complete agreement once agreement is defined as +/- 1 level. There are also some methodological problems that can occur when kappa is run on a binary response variable, as discussed in the CrossValidated post I linked. If your data are less "homogeneous" than the sample data, you should get an actual kappa value rather than a degenerate zero like this. That, however, is a question of more involved methods, and you may need to ask on CrossValidated.
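As an aside, and separate from the recoding approach above: if an off-the-shelf alternative is acceptable, a weighted kappa penalizes a one-level disagreement less than a larger one. A minimal sketch, assuming new_testdata still holds the numeric ratings from before the binary recode:

 # Sketch only: linearly weighted kappa on the numeric ratings
 # (run this before the binary recode; "squared" weights are also available).
 kappa2(ratings = new_testdata[, c("R1", "R2")], weight = "equal")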
