This is a very simple question, but I have not been able to find a definitive answer, so I thought I would ask. I use the plm package to work with panel data. I am trying to use the lag function to shift a variable FORWARD in time: by default, lag returns the value from the previous period, and I want the value from the NEXT period. I found some old articles/questions (circa 2009) suggesting that this is possible by passing k=-1 as an argument. However, when I try this, I get an error.
Code example:
    library(plm)
    df<-as.data.frame(matrix(c(1,1,1,2,2,3,
                               20101231,20111231,20121231,20111231,20121231,20121231,
                               50,60,70,120,130,210),nrow=6,ncol=3))
    names(df)<-c("individual","date","data")
    df$date<-as.Date(as.character(df$date),format="%Y%m%d")
    df.plm<-pdata.frame(df,index=c("individual","date"))
Lagging:
    lag(df.plm$data,0)
    ##returns
    ## 1-2010-12-31 1-2011-12-31 1-2012-12-31 2-2011-12-31 2-2012-12-31 3-2012-12-31
    ##           50           60           70          120          130          210

    lag(df.plm$data,1)
    ##returns
    ## 1-2010-12-31 1-2011-12-31 1-2012-12-31 2-2011-12-31 2-2012-12-31 3-2012-12-31
    ##           NA           50           60           NA          120           NA

    lag(df.plm$data,-1)
    ##returns
    ## Error in rep(1, ak) : invalid 'times' argument
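To make the desired result concrete, here is a minimal base R sketch of the output I am hoping to get from a forward lag on the same data. It does not use plm at all, and `lead1` is a hypothetical helper name I made up for illustration. It shifts by position within each individual, so it assumes the rows are sorted by date and that there are no gaps in the time series (unlike plm, it would not detect a missing period).

```r
# Same data as in the example above, built directly as a data frame.
df <- data.frame(
  individual = c(1, 1, 1, 2, 2, 3),
  date = as.Date(c("2010-12-31", "2011-12-31", "2012-12-31",
                   "2011-12-31", "2012-12-31", "2012-12-31")),
  data = c(50, 60, 70, 120, 130, 210)
)

# Hypothetical helper: move each value one position earlier within a group,
# so every row sees the value of the NEXT row; the last row gets NA.
lead1 <- function(x) c(x[-1], NA)

# Sort by individual and date so that "next row" means "next period".
df <- df[order(df$individual, df$date), ]

# Apply the shift separately for each individual.
df$data_lead <- ave(df$data, df$individual, FUN = lead1)

print(df$data_lead)
# [1]  60  70  NA 130  NA  NA
```

This is the vector I would expect `lag(df.plm$data,-1)` to return, with NA for the last observation of each individual.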
I also read that plm.data replaced pdata.frame for some applications in plm. However, plm.data does not work with the lag function at all:
    df.plm<-plm.data(df,indexes=c("individual","date"))
    lag(df.plm$data,1)
    ##returns
    ## [1]  50  60  70 120 130 210
    ## attr(,"tsp")
    ## [1] 0 5 1
I would be grateful for any help. If anyone can suggest another package to use, I am all ears. However, I really like plm, because it automatically handles lagging across multiple individuals and skips gaps in the time series.