Data table removing leading missing values by group

Question

The following is an example of a data table from which I would like to delete every row where the value is NA and no earlier row in the same group has a value, that is, the leading NAs within each group. Since the groups do not all have the same number of leading missing values, I'm stuck and have had no luck so far.

Example Data Table

    group    date     value
      a 2015-01-01    NA
      a 2015-01-02     2
      a 2015-01-03     3
      a 2015-01-04    NA
      a 2015-01-05     2
      b 2015-01-01    NA
      b 2015-01-02    NA
      b 2015-01-03     2
      b 2015-01-04    NA
      b 2015-01-05     2

Desired data table

    group    date     value
      a 2015-01-02     2
      a 2015-01-03     3
      a 2015-01-04    NA
      a 2015-01-05     2
      b 2015-01-03     2
      b 2015-01-04    NA
      b 2015-01-05     2

Later I plan to impute the remaining missing values from the values that come before and after them.

EDIT: A similar, previously asked question can be found here.

r data.table

Zachary Jun 11 '15 at 3:36

1 answer

A5C1D2H2I1M1N2O1R2T1 · Accepted Answer · 2015-06-11T03:48:23+0000

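The main approach would be to use which and .N: within each group, find the position of the first non-NA value and keep every row from there to the last row of the group. For example: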

    DT[, .SD[(which(!is.na(value))[1]):.N], by = group]
    ##    group       date value
    ## 1:     a 2015-01-02     2
    ## 2:     a 2015-01-03     3
    ## 3:     a 2015-01-04    NA
    ## 4:     a 2015-01-05     2
    ## 5:     b 2015-01-03     2
    ## 6:     b 2015-01-04    NA
    ## 7:     b 2015-01-05     2
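
One caveat with which(...)[1]: if a group contains only NA values, it evaluates to NA, and the NA:.N range then fails with an error. As a minimal sketch of a variant that tolerates such groups (this is not part of the original answer), the code below rebuilds the example table from the question and filters with a running count of non-NA values instead of which. The construction of DT and the name res are assumptions made for illustration.

```r
library(data.table)

# Rebuild the example table from the question (illustrative reconstruction).
DT <- data.table(
  group = rep(c("a", "b"), each = 5),
  date  = rep(as.Date("2015-01-01") + 0:4, times = 2),
  value = c(NA, 2, 3, NA, 2, NA, NA, 2, NA, 2)
)

# cumsum(!is.na(value)) stays at 0 only across a group's leading NAs,
# so keeping rows where it is > 0 drops exactly those leading rows.
# .I holds the row numbers in DT; a group that is entirely NA
# contributes no row numbers instead of raising an error.
res <- DT[DT[, .I[cumsum(!is.na(value)) > 0], by = group]$V1]
print(res)
```

Collecting row numbers with .I and subsetting DT once also avoids building a separate .SD for each group, which tends to matter when there are many groups.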
