Why is the calculation of Cohen's kappa failing in different packages on this contingency table?

Question


I have a contingency table for which I would like to calculate Cohen's kappa, a measure of agreement. I tried three different packages, and none of them gave a usable result. The e1071 package has a function specifically for contingency tables, but that also fails. Reproducible code is below. You will need to install the concord, e1071 and irr packages.

    # Recreate my contingency table, output with dput
    conf.mat <- structure(c(810531L, 289024L, 164757L, 114316L),
                          .Dim = c(2L, 2L),
                          .Dimnames = structure(list(landsat_2000_bin = c("0", "1"),
                                                     MOD12_2000_binForest = c("0", "1")),
                                                .Names = c("landsat_2000_bin", "MOD12_2000_binForest")),
                          class = "table")

    library(concord)
    cohen.kappa(conf.mat)

    library(e1071)
    classAgreement(conf.mat, match.names=TRUE)

    library(irr)
    kappa2(conf.mat)

Output of this run:

    > cohen.kappa(conf.mat)
    Kappa test for nominally classified data
    4 categories - 2 methods
    kappa (Cohen) = 0 , Z = NaN , p = NaN
    kappa (Siegel) = -0.333333 , Z = -0.816497 , p = 0.792892
    kappa (2*PA-1) = -1

    > classAgreement(conf.mat, match.names=TRUE)
    $diag
    [1] 0.6708459

    $kappa
    [1] NA

    $rand
    [1] 0.5583764

    $crand
    [1] 0.0594124

    Warning message:
    In ni[lev] * nj[lev] : NAs produced by integer overflow

    > kappa2(conf.mat)
    Cohen Kappa for 2 Raters (Weights: unweighted)
    Subjects = 2
    Raters = 2
    Kappa = 0
    z = NaN
    p-value = NaN

Can anyone advise why this might fail? I have a large dataset, but since this table is simple, I did not think it could cause such problems.

+4

r package

gisol Aug 9 '12 at 10:54


2 answers

Do you need to know specifically why they fail? Here is a function that calculates the statistic for a 2x2 table. I wrote it in a hurry, so I may clean it up later (see the Wikipedia article on Cohen's kappa):

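    kap <- function(x) {
      # observed proportion of agreement (diagonal cells)
      a <- (x[1,1] + x[2,2]) / sum(x)
      # proportion of agreement expected by chance, from the marginals
      e <- (sum(x[1,]) / sum(x)) * (sum(x[,1]) / sum(x)) +
           (1 - (sum(x[1,]) / sum(x))) * (1 - (sum(x[,1]) / sum(x)))
      (a - e) / (1 - e)
    }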

Test / output:

    > (x = matrix(c(20,5,10,15), nrow=2, byrow=T))
         [,1] [,2]
    [1,]   20    5
    [2,]   10   15
    > kap(x)
    [1] 0.4

    > (x = matrix(c(45,15,25,15), nrow=2, byrow=T))
         [,1] [,2]
    [1,]   45   15
    [2,]   25   15
    > kap(x)
    [1] 0.1304348

    > (x = matrix(c(25,35,5,35), nrow=2, byrow=T))
         [,1] [,2]
    [1,]   25   35
    [2,]    5   35
    > kap(x)
    [1] 0.2592593

    > kap(conf.mat)
    [1] 0.1258621
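The same calculation extends to any square table of counts. Below is a minimal sketch in base R, assuming the rows and columns list the same categories in the same order; the name `kappa_from_table` is made up for illustration and does not come from any package.

```r
# Hypothetical helper: Cohen's kappa for any square table of counts.
kappa_from_table <- function(tab) {
  # Work in doubles so large marginal totals cannot overflow.
  tab <- matrix(as.numeric(tab), nrow = nrow(tab))
  stopifnot(nrow(tab) == ncol(tab))
  n  <- sum(tab)
  po <- sum(diag(tab)) / n                       # observed agreement
  pe <- sum(rowSums(tab) * colSums(tab)) / n^2   # agreement expected by chance
  (po - pe) / (1 - pe)
}

x <- matrix(c(20, 5, 10, 15), nrow = 2, byrow = TRUE)
kappa_from_table(x)   # 0.4, matching kap(x) for the first test matrix
```

Because the counts are converted to doubles first, this should also be safe to call on an integer table like the one in the question.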

+1

lockedoff Aug 9 '12 at 17:23


nograpes · Accepted Answer · 2012-08-09T17:53:58+0000

For the first function, cohen.kappa, you need to indicate that you are passing count data, not an n*m matrix of n subjects and m raters.

    # Pass 'count' so the input is treated as a table of counts
    cohen.kappa(conf.mat, 'count')

The second function is trickier. For some reason, your matrix holds integer values, not numeric ones. An integer cannot store really big numbers (the largest is .Machine$integer.max, which is 2147483647), so when two of your large marginal totals are multiplied together, the result overflows to NA. For instance:

    i = 975288
    j = 1099555
    class(i)
    # [1] "numeric"
    i * j
    # 1.072383e+12
    as.integer(i) * as.integer(j)
    # [1] NA
    # Warning message:
    # In as.integer(i) * as.integer(j) : NAs produced by integer overflow

So, you need to convert the matrix to numeric.

    # classAgreement(conf.mat)
    classAgreement(matrix(as.numeric(conf.mat), nrow=2))
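As a rough check of the overflow, and as one alternative way to do the conversion, here is a base R sketch. It rebuilds the table from the question under the made-up name `tab` and switches its storage mode in place, which should keep the table class and the dimnames that `matrix(as.numeric(...))` drops. Whether classAgreement then behaves identically on it is an assumption I have not verified.

```r
row_total <- 975288L     # sum of the first row of conf.mat
col_total <- 1099555L    # sum of the first column of conf.mat

.Machine$integer.max                             # 2147483647
as.numeric(row_total) * as.numeric(col_total)    # about 1.07e12, far above the limit

# Copy of the question's table, converted to double without losing its shape.
tab <- structure(c(810531L, 289024L, 164757L, 114316L),
                 .Dim = c(2L, 2L),
                 .Dimnames = list(landsat_2000_bin = c("0", "1"),
                                  MOD12_2000_binForest = c("0", "1")),
                 class = "table")
storage.mode(tab) <- "double"

typeof(tab)      # "double"
dimnames(tab)    # row and column labels are still there
```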

Finally, take a look at the documentation for ?kappa2. It requires an n*m matrix of subjects by raters, as described above, with one row per rated subject. It simply won't work with your (efficient) table of counts.
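If you do want to feed kappa2, one option is to expand the counts back into one row per subject. The sketch below uses only base R and a small made-up table (not the question's data); the names `tab`, `cells`, `ratings`, `rater1` and `rater2` are illustrative. The final kappa2 call is left commented out because it needs the irr package.

```r
# Small illustrative table of counts (not the question's data).
tab <- as.table(matrix(c(20L, 10L, 5L, 15L), nrow = 2,
                       dimnames = list(rater1 = c("0", "1"),
                                       rater2 = c("0", "1"))))

# One row per cell with its count, then repeat each row Freq times.
cells   <- as.data.frame(tab, stringsAsFactors = FALSE)
ratings <- cells[rep(seq_len(nrow(cells)), cells$Freq), c("rater1", "rater2")]

nrow(ratings)                            # 50, one row per subject
table(ratings$rater1, ratings$rater2)    # recovers the original counts

# With irr installed, this subjects-by-raters layout is the shape
# that ?kappa2 describes:
# irr::kappa2(ratings)
```

For the table in the question this would produce about 1.4 million rows, which is wasteful next to the count-based functions but should fit in memory.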
