Classification functions in linear discriminant analysis in R

Question

After running a linear discriminant analysis in R using lda(), is there a convenient way to extract the classification functions for each group?

From the link

These should not be confused with the discriminant functions. The classification functions can be used to determine to which group each case most likely belongs. There are as many classification functions as there are groups. Each function allows us to compute classification scores for each case for each group, using the formula:

 Si = ci + wi1*x1 + wi2*x2 + ... + wim*xm

In this formula, the subscript i denotes the respective group; the subscripts 1, 2, ..., m denote the m variables; ci is the constant for the i-th group; wij is the weight for the j-th variable in the computation of the classification score for the i-th group; xj is the observed value of the j-th variable for the respective case. Si is the resulting classification score.
We can use the classification functions to directly compute classification scores for new observations.
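
To make the formula concrete, here is a minimal sketch of applying a set of classification functions to one new observation. The constants, weights and observation are made-up numbers chosen for illustration, not the output of any fitted model.

```r
# Hypothetical classification functions for 3 groups and 2 variables.
# All numbers below are invented for illustration only.
const   <- c(g1 = -10.0, g2 = -14.5, g3 = -20.0)   # c_i, one per group
weights <- rbind(g1 = c(2.0, 1.0),                  # w_i1, w_i2 for group 1
                 g2 = c(1.5, 2.5),
                 g3 = c(1.0, 4.0))
colnames(weights) <- c("x1", "x2")

# One new observation (also invented).
new_obs <- c(x1 = 4, x2 = 3)

# S_i = c_i + w_i1 * x1 + ... + w_im * xm, computed for every group at once.
scores <- const + drop(weights %*% new_obs)

# The case is assigned to the group with the highest classification score.
assigned <- names(which.max(scores))
```

Here `scores` holds one value per group, and `assigned` is the name of the group whose function gives the largest score.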

I can build them from scratch using textbook formulas, but that requires redoing a number of intermediate steps from the lda analysis. Is there a way to get them after the fact from the lda object?

Added:

Unless I am still misunderstanding something in Brandon's answer (sorry for the confusion!), it seems that the answer is no. Presumably, most users can get the information they need from predict(), which provides classifications based on lda().
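
For reference, a minimal sketch of what predict() returns for an lda() fit. It uses the built-in iris data rather than my own data, and the three rows passed as new data are an arbitrary choice.

```r
library(MASS)  # provides lda() and its predict() method

fit  <- lda(Species ~ ., data = iris)

# Classify three arbitrarily chosen rows of the data.
pred <- predict(fit, newdata = iris[c(1, 51, 101), ])

pred$class      # predicted group for each case
pred$posterior  # posterior probability of each group, one row per case
pred$x          # scores on the linear discriminants (LD1, LD2)
```

Note that `pred$x` holds discriminant scores, not the per-group classification scores asked about above.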

For pedagogical reasons more than any serious research need, I really wanted to see the actual classification functions, so for posterity here is a function that adds them to the lda() result:

 library(MASS)  ## provides lda()

 ty.lda <- function(x, groups){
   x.lda <- lda(groups ~ ., as.data.frame(x))

   gr <- length(unique(groups))   ## groups might be factors or numeric
   v <- ncol(x)                   ## variables
   m <- x.lda$means               ## group means

   ## within-group sums of squares and cross-products, one slice per group
   w <- array(NA, dim = c(v, v, gr))

   for(i in 1:gr){
     tmp <- scale(subset(x, groups == unique(groups)[i]), scale = FALSE)
     w[,,i] <- t(tmp) %*% tmp
   }

   ## pool the slices over groups
   W <- w[,,1]
   for(i in 2:gr)
     W <- W + w[,,i]

   V <- W/(nrow(x) - gr)          ## pooled within-group covariance matrix
   iV <- solve(V)

   class.funs <- matrix(NA, nrow = v + 1, ncol = gr)
   colnames(class.funs) <- paste("group", 1:gr, sep=".")
   rownames(class.funs) <- c("constant", paste("var", 1:v, sep = "."))

   ## first row: constant; remaining rows: one weight per variable
   for(i in 1:gr) {
     class.funs[1, i] <- -0.5 * t(m[i,]) %*% iV %*% (m[i,])
     class.funs[2:(v+1) ,i] <- iV %*% (m[i,])
   }

   x.lda$class.funs <- class.funs

   return(x.lda)
 }

This code follows the formulas in Legendre and Legendre's Numerical Ecology (1998), page 625, and matches the results of the worked example starting on page 626.
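
One way to sanity-check classification functions of this form is to compare the group with the highest score against what predict() reports. The sketch below recomputes the constants and weights in a compact, self-contained way on the built-in iris data (my choice of example, not the Legendre and Legendre data). It assumes equal priors, which holds for iris because the three groups are the same size.

```r
library(MASS)  # provides lda()

X <- as.matrix(iris[, 1:4])
g <- iris$Species

fit <- lda(Species ~ ., data = iris)  # equal group sizes, so equal priors
m   <- fit$means                      # group means, one row per group

# Pooled within-group covariance matrix and its inverse.
centered <- X - m[as.character(g), ]
V  <- crossprod(centered) / (nrow(X) - nlevels(g))
iV <- solve(V)

# Classification functions: one row of weights and one constant per group.
w     <- m %*% iV
const <- -0.5 * rowSums(w * m)        # -0.5 * m_i' V^-1 m_i

# Classification scores for every case (rows) and group (columns).
scores <- sweep(X %*% t(w), 2, const, "+")

# Group with the highest score versus the class reported by predict().
by_scores  <- colnames(scores)[max.col(scores, ties.method = "first")]
by_predict <- as.character(predict(fit, iris)$class)
agree <- all(by_scores == by_predict)
```

If the priors were unequal, the constants would need an extra log-prior term before the two classifications could be expected to agree.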

r classification

Tyler Apr 12 '11 at 1:50

2 answers

Brandon Bertelsen · Answer 1 · 2011-04-12T02:43:06+0000

Suppose x is your LDA object:

 x$terms

You can take a peek at the object by looking at its structure:

 str(x)

Update:

 Iris <- data.frame(rbind(iris3[,,1], iris3[,,2], iris3[,,3]),
                    Sp = rep(c("s","c","v"), rep(50,3)))
 train <- sample(1:150, 75)
 table(Iris$Sp[train])
 z <- lda(Sp ~ ., Iris, prior = c(1,1,1)/3, subset = train)
 predict(z, Iris[-train, ])$class
 str(z)
 List of 10
  $ prior  : Named num [1:3] 0.333 0.333 0.333
   ..- attr(*, "names")= chr [1:3] "c" "s" "v"
  $ counts : Named int [1:3] 30 25 20
   ..- attr(*, "names")= chr [1:3] "c" "s" "v"
  $ means  : num [1:3, 1:4] 6.03 5.02 6.72 2.81 3.43 ...
   ..- attr(*, "dimnames")=List of 2
   .. ..$ : chr [1:3] "c" "s" "v"
   .. ..$ : chr [1:4] "Sepal.L." "Sepal.W." "Petal.L." "Petal.W."
  $ scaling: num [1:4, 1:2] 0.545 1.655 -1.609 -3.682 -0.443 ...
   ..- attr(*, "dimnames")=List of 2
   .. ..$ : chr [1:4] "Sepal.L." "Sepal.W." "Petal.L." "Petal.W."
   .. ..$ : chr [1:2] "LD1" "LD2"
  $ lev    : chr [1:3] "c" "s" "v"
  $ svd    : num [1:2] 33.66 2.93
  $ N      : int 75
  $ call   : language lda(formula = Sp ~ ., data = Iris, prior = c(1, 1, 1)/3, subset = train)
  $ terms  :Classes 'terms', 'formula' length 3 Sp ~ Sepal.L. + Sepal.W. + Petal.L. + Petal.W.
   .. ..- attr(*, "variables")= language list(Sp, Sepal.L., Sepal.W., Petal.L., Petal.W.)
   .. ..- attr(*, "factors")= int [1:5, 1:4] 0 1 0 0 0 0 0 1 0 0 ...
   .. .. ..- attr(*, "dimnames")=List of 2
   .. .. .. ..$ : chr [1:5] "Sp" "Sepal.L." "Sepal.W." "Petal.L." ...
   .. .. .. ..$ : chr [1:4] "Sepal.L." "Sepal.W." "Petal.L." "Petal.W."
   .. ..- attr(*, "term.labels")= chr [1:4] "Sepal.L." "Sepal.W." "Petal.L." "Petal.W."
   .. ..- attr(*, "order")= int [1:4] 1 1 1 1
   .. ..- attr(*, "intercept")= int 1
   .. ..- attr(*, "response")= int 1
   .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
   .. ..- attr(*, "predvars")= language list(Sp, Sepal.L., Sepal.W., Petal.L., Petal.W.)
   .. ..- attr(*, "dataClasses")= Named chr [1:5] "factor" "numeric" "numeric" "numeric" ...
   .. .. ..- attr(*, "names")= chr [1:5] "Sp" "Sepal.L." "Sepal.W." "Petal.L." ...
  $ xlevels: Named list()
  - attr(*, "class")= chr "lda"

42- · Answer 2 · 2012-12-29T18:14:56+0000

I think your question was flawed ... Well, maybe not flawed, but at least somewhat misleading. The discriminant function refers to the distances between groups, so there is no function associated with a single group, but rather a function that describes the distance between any two group centroids. I just answered a later question and posted an example of calculating a scoring function using the iris dataset and using it to label cases in a 2d plot of the predictors. In the case of a two-group analysis, the function will be greater than zero for one group and less than zero for the other group.
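
As a rough illustration of the two-group case (a sketch only, not the code from that other answer), the function below is built as the difference between the two groups' classification functions. The choice of the versicolor and virginica rows of iris is arbitrary, and equal priors are assumed, which holds here because both groups have 50 cases.

```r
library(MASS)  # provides lda()

# Keep two of the three iris species (an arbitrary choice for illustration).
two <- droplevels(subset(iris, Species != "setosa"))
fit <- lda(Species ~ ., data = two)

X <- as.matrix(two[, 1:4])
m <- fit$means                                   # rows: versicolor, virginica

# Inverse of the pooled within-group covariance matrix.
centered <- X - m[as.character(two$Species), ]
iV <- solve(crossprod(centered) / (nrow(X) - 2))

# Difference between the two classification functions:
# d(x) = (m1 - m2)' V^-1 x - 0.5 * (m1 + m2)' V^-1 (m1 - m2)
w     <- iV %*% (m[1, ] - m[2, ])
const <- -0.5 * sum((m[1, ] + m[2, ]) * w)
d     <- drop(X %*% w) + const

# d > 0 points to the first group, d < 0 to the second.
by_sign <- ifelse(d > 0, rownames(m)[1], rownames(m)[2])
table(by_sign, lda = predict(fit, two)$class)
```

The table cross-tabulates the sign-based assignment against the class that predict() reports for the same cases.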
