Clustered standard errors differ between plm and lfe

Question

When I run a standard panel specification with clustered standard errors using plm and lfe, I get results that differ in the second significant digit. Does anyone know why the two packages compute the SE differently?

    set.seed(572015)
    library(lfe)
    library(plm)
    library(lmtest)

    # clustering example
    x <- c(sapply(sample(1:20), rep, times = 1000)) + rnorm(20*1000, sd = 1)
    y <- 5 + 10*x + rnorm(20*1000, sd = 10) + c(sapply(rnorm(20, sd = 10), rep, times = 1000))
    facX <- factor(sapply(1:20, rep, times = 1000))
    mydata <- data.frame(y=y, x=x, facX=facX, state=rep(1:1000, 20))

    model <- plm(y ~ x, data = mydata, index = c("facX", "state"), effect = "individual", model = "within")
    plmTest <- coeftest(model, vcov = vcovHC(model, type = "HC1", cluster = "group"))
    lfeTest <- summary(felm(y ~ x | facX | 0 | facX))

    data.frame(lfeClusterSE = lfeTest$coefficients[2], plmClusterSE = plmTest[2])
    ##   lfeClusterSE plmClusterSE
    ## 1   0.06746538   0.06572588

r lfe plm

kennyB May 08 '15 at 5:00

1 answer

Achim Zeileis · Accepted Answer · 2015-05-08T07:27:43+0000

The difference is in the degrees-of-freedom adjustment. This is the usual first guess when looking for differences in supposedly similar standard errors (see, e.g., Different Robust Standard Errors of Logit Regression in Stata and R). Here the problem can be illustrated by comparing the results from (1) plm + vcovHC, (2) felm, and (3) lm + cluster.vcov (from the multiwayvcov package).

First, I refit all the models:

    m1 <- plm(y ~ x, data = mydata, index = c("facX", "state"), effect = "individual", model = "within")
    m2 <- felm(y ~ x | facX | 0 | facX, data = mydata)
    m3 <- lm(y ~ facX + x, data = mydata)

All three lead to the same coefficient estimates. For m3 the fixed effects are reported explicitly, while for m1 and m2 they are not. Hence, for m3 only the last coefficient is extracted with tail(..., 1).

    all.equal(coef(m1), coef(m2))
    ## [1] TRUE
    all.equal(coef(m1), tail(coef(m3), 1))
    ## [1] TRUE

The non-robust standard errors also agree.

    se <- function(object) tail(sqrt(diag(object)), 1)
    se(vcov(m1))
    ##          x
    ## 0.07002696
    se(vcov(m2))
    ##          x
    ## 0.07002696
    se(vcov(m3))
    ##          x
    ## 0.07002696

And when comparing the clustered standard errors, we can now show that felm applies the degrees-of-freedom correction while plm does not:

    library(multiwayvcov)  # provides cluster.vcov()

    se(vcovHC(m1))
    ##          x
    ## 0.06572423
    m2$cse
    ##          x
    ## 0.06746538
    se(cluster.vcov(m3, mydata$facX))
    ##          x
    ## 0.06746538
    se(cluster.vcov(m3, mydata$facX, df_correction = FALSE))
    ##          x
    ## 0.06572423
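To see how much of the gap the correction accounts for, here is a rough base-R check using only the numbers printed above. It assumes the correction factor is G/(G-1) * (N-1)/(N-K), which is how I understand the default for cluster.vcov to be documented; the variable names (G, N, K, df_factor, se_rescaled) are just illustrative and not part of any package.

```r
# Sketch: rescale the uncorrected clustered SE by the assumed
# degrees-of-freedom factor G/(G-1) * (N-1)/(N-K).
G <- 20          # number of clusters (levels of facX)
N <- 20 * 1000   # number of observations
K <- G + 1       # coefficients in m3: intercept + 19 facX dummies + x

se_uncorrected <- 0.06572423  # se(vcovHC(m1)), copied from the output above
se_corrected   <- 0.06746538  # m2$cse, copied from the output above

df_factor <- (G / (G - 1)) * ((N - 1) / (N - K))

# The factor multiplies the covariance matrix, so the SE scales by its square root.
se_rescaled <- se_uncorrected * sqrt(df_factor)

print(round(c(rescaled = se_rescaled, felm = se_corrected), 6))
```

If that assumption holds, multiplying vcovHC(m1) by df_factor should bring the plm standard error in line with the one from felm. With only 20 clusters the G/(G-1) term dominates, which is why the difference already shows up in the second significant digit.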
