Using R, I'm trying to trim NA values from the beginning and end of each time series in a data frame that contains multiple time series. I achieved my goal using a for loop and the zoo package, but as expected, it is extremely inefficient on large data frames.
My data frame looks like this. It has 3 columns, and each time series is identified by a unique id, in this case AAA, B and CCC.
id   date        value
AAA  2010/01/01  NA
AAA  2010/02/01  34
AAA  2010/03/01  35
AAA  2010/04/01  30
AAA  2010/05/01  NA
AAA  2010/06/01  28
B    2010/01/01  NA
B    2010/02/01  0
B    2010/03/01  1
B    2010/04/01  2
B    2010/05/01  3
B    2010/06/01  NA
B    2010/07/01  NA
B    2010/07/01  NA
CCC  2010/01/01  0
CCC  2010/02/01  400
CCC  2010/03/01  300
CCC  2010/04/01  200
CCC  2010/05/01  NA
I would like to know how I can efficiently remove the NA values from the beginning and end of each time series (here AAA, B and CCC) while keeping any NA values in the middle. The result should look like this:
id   date        value
AAA  2010/02/01  34
AAA  2010/03/01  35
AAA  2010/04/01  30
AAA  2010/05/01  NA
AAA  2010/06/01  28
B    2010/02/01  0
B    2010/03/01  1
B    2010/04/01  2
B    2010/05/01  3
CCC  2010/01/01  0
CCC  2010/02/01  400
CCC  2010/03/01  300
CCC  2010/04/01  200
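For reference, the trimming described above can be sketched in base R without an explicit loop. This is only an illustration under assumptions: the object name `df` and the helper `inside_span` are hypothetical, the rows are assumed to be already sorted by date within each id, and this is not the loop-and-zoo code mentioned earlier.

```r
# Example data mirroring the input table (the name `df` is an assumption)
df <- data.frame(
  id    = c(rep("AAA", 6), rep("B", 8), rep("CCC", 5)),
  date  = sprintf("2010/%02d/01", c(1:6, 1:7, 7L, 1:5)),
  value = c(NA, 34, 35, 30, NA, 28,
            NA, 0, 1, 2, 3, NA, NA, NA,
            0, 400, 300, 200, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: given a logical "is not NA" vector for one series,
# return TRUE for positions between the first and last non-NA value.
inside_span <- function(ok) {
  cumsum(ok) > 0 & rev(cumsum(rev(ok))) > 0
}

# Apply the helper within each id, then keep only the flagged rows.
keep    <- ave(!is.na(df$value), df$id, FUN = inside_span)
trimmed <- df[keep, ]
```

The forward cumulative sum is zero until the first non-NA value, and the reversed one is zero after the last non-NA value, so interior NA values (such as AAA on 2010/05/01) are kept. A series consisting only of NA values would be dropped entirely.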