dplyr: suppress the next n occurrences of a value within a group

I recently asked for tips on how to suppress all but the first occurrence of a value within a group using dplyr (see the earlier question on overriding all but the first occurrence of a value within a group).

The solution was really clever, and now I am struggling to find something equally efficient for the case where only the next n values need to be suppressed.

For example, in the code below, I create a new column called "tag":

    library('dplyr')
    data(iris)
    set.seed(1)
    iris$tag <- sample(c(0,1), 150, replace=TRUE, prob = c(0.7, 0.3))
    giris <- iris %>% group_by(Species)

    # Source: local data frame [150 x 6]
    # Groups: Species [3]
    #
    #    Sepal.Length Sepal.Width Petal.Length Petal.Width Species   tag
    #           (dbl)       (dbl)        (dbl)       (dbl)  (fctr) (dbl)
    # 1           5.1         3.5          1.4         0.2  setosa     0
    # 2           4.9         3.0          1.4         0.2  setosa     0
    # 3           4.7         3.2          1.3         0.2  setosa     0
    # 4           4.6         3.1          1.5         0.2  setosa     1
    # 5           5.0         3.6          1.4         0.2  setosa     0
    # 6           5.4         3.9          1.7         0.4  setosa     1
    # 7           4.6         3.4          1.4         0.3  setosa     1
    # 8           5.0         3.4          1.5         0.2  setosa     0
    # 9           4.4         2.9          1.4         0.2  setosa     0
    # 10          4.9         3.1          1.5         0.1  setosa     0
    # ..          ...         ...          ...         ...     ...   ...

In the setosa group, rows 4, 6, 7, ... are tagged "1". I am trying to suppress the "1"s (i.e., convert them to "0") in the two rows following any occurrence of "1". In other words, rows 5 and 6 should be set to "0", but row 7 should remain unaffected. Since row 7 then keeps its "1", rows 8 and 9 should be set to "0", and so on.

Any hint on how to do this in dplyr? The package is really powerful, but for some reason I find it a mental challenge to master all its subtleties.


A few more examples:

for 0 0 1 1, the output should be 0 0 1 0
for 0 0 1 1 1 1 1, the output should be 0 0 1 0 0 1 0

3 answers

For me, this is semantically clearer if you use a cumulative reduction to track the refractory period.

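    suppress <- function(x, w) {
      # r counts down the refractory period: w right after a kept 1, then w-1, ..., 0
      r <- Reduce(function(d, i) if (i & !d) w else max(0, d - 1),
                  x, init = 0, accumulate = TRUE)[-1]
      # a 1 is kept only where the countdown was just (re)started
      x * (r == w)
    }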

Example

    suppress(c(0,0,1,1,1,1,1), 2)
    #> [1] 0 0 1 0 0 1 0
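To see why this works, it may help to look at the intermediate countdown that the reduction produces. The following is a minimal sketch using the same reducer on the example vector; the names step, r and kept are only illustrative and not part of the answer.

```r
x <- c(0, 0, 1, 1, 1, 1, 1)
w <- 2  # number of following values to suppress

# d is the remaining refractory period, i is the current tag value.
# A 1 seen while d == 0 restarts the countdown at w; otherwise d decreases.
step <- function(d, i) if (i & !d) w else max(0, d - 1)

# Drop the first element, which is just the initial state.
r <- Reduce(step, x, init = 0, accumulate = TRUE)[-1]

# Only positions where the countdown was restarted keep their 1.
kept <- x * (r == w)

rbind(x, r, kept)
#      [,1] [,2] [,3] [,4] [,5] [,6] [,7]
# x       0    0    1    1    1    1    1
# r       0    0    2    1    0    2    1
# kept    0    0    1    0    0    1    0
```

The row r shows the countdown restarting at positions 3 and 6, which are exactly the 1s that survive.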

I can't think of a better way to do this than a loop:

    flip_followers = function(tag, nf = 2L){
        w    = which(tag==1L)            # positions of all 1s
        keep = rep(TRUE, length(w))
        # a kept 1 knocks out any 1 found in the next nf positions
        for (i in seq_along(w)) if (keep[i]) keep[match(w[i]+seq_len(nf), w)] = FALSE
        tag[w[!keep]] = 0L
        tag
    }

    giris %>% mutate(tag = flip_followers(tag))

    Source: local data frame [150 x 6]
    Groups: Species [3]

       Sepal.Length Sepal.Width Petal.Length Petal.Width Species   tag
              (dbl)       (dbl)        (dbl)       (dbl)  (fctr) (dbl)
    1           5.1         3.5          1.4         0.2  setosa     0
    2           4.9         3.0          1.4         0.2  setosa     0
    3           4.7         3.2          1.3         0.2  setosa     0
    4           4.6         3.1          1.5         0.2  setosa     1
    5           5.0         3.6          1.4         0.2  setosa     0
    6           5.4         3.9          1.7         0.4  setosa     0
    7           4.6         3.4          1.4         0.3  setosa     1
    8           5.0         3.4          1.5         0.2  setosa     0
    9           4.4         2.9          1.4         0.2  setosa     0
    10          4.9         3.1          1.5         0.1  setosa     0
    ..          ...         ...          ...         ...     ...   ...

For a possible speed-up, you could change the loop body to

    if (keep[i]) keep[i+seq_len(nf)][match(w[i]+seq_len(nf), w[i+seq_len(nf)])] = FALSE

so that match only searches the next nf elements of w. I am sure Rcpp would be faster if this is a serious concern.
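The idea of looking only at the next nf entries of w can be sketched as below. This is a variant under stated assumptions rather than the exact code suggested above: the name flip_followers_fast is hypothetical, and instead of match it relies on w being sorted, so any 1 within nf positions of w[i] must be among the next nf entries of w.

```r
flip_followers_fast <- function(tag, nf = 2L) {
  w    <- which(tag == 1L)          # sorted positions of all 1s
  keep <- rep(TRUE, length(w))
  for (i in seq_along(w)) {
    if (keep[i]) {
      nxt <- i + seq_len(nf)        # indices of the next nf entries of w
      nxt <- nxt[nxt <= length(w)]  # stay inside w
      # drop those entries that lie within nf positions of the kept 1
      keep[nxt[w[nxt] <= w[i] + nf]] <- FALSE
    }
  }
  tag[w[!keep]] <- 0L
  tag
}

flip_followers_fast(c(0, 0, 1, 1, 1, 1, 1))
# [1] 0 0 1 0 0 1 0
flip_followers_fast(c(0, 0, 1, 0, 1, 0, 1, 1))
# [1] 0 0 1 0 0 0 1 0
```

Clipping nxt to the length of w avoids indexing past the end, at the cost of one extra comparison per kept 1.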


Kind of clumsy, but it seems you need to walk along the vector regardless:

    f <- function(x, repl = c(1,0,0)) {
      sx <- seq_along(x)   # original positions; x may grow past them below
      for (ii in seq_along(x))
        if (x[ii] == repl[1L]) ## thanks to @Frank for catching
          x[ii:(ii + length(repl) - 1)] <- repl
      x[sx]                # trim back to the original length
    }

    (x <- c(0,0,1,1,1,1,1)); f(x)
    # [1] 0 0 1 1 1 1 1
    # [1] 0 0 1 0 0 1 0

    (x <- c(0,0,1,0,1,0,1,1)); f(x)
    # [1] 0 0 1 0 1 0 1 1
    # [1] 0 0 1 0 0 0 1 0

And your example

    set.seed(1)
    head(n = 10, cbind(tag <- sample(c(0,1), 150, replace=TRUE, prob = c(0.7, 0.3)),
                       tag2 = f(tag)))
    #  [1,] 0 0
    #  [2,] 0 0
    #  [3,] 0 0
    #  [4,] 1 1
    #  [5,] 0 0
    #  [6,] 1 0
    #  [7,] 1 1
    #  [8,] 0 0
    #  [9,] 0 0
    # [10,] 0 0

And you can replace with whatever pattern you want

    (x <- c(0,0,1,1,1,1,1)); f(x, c(1,0,0,0))
    # [1] 0 0 1 1 1 1 1
    # [1] 0 0 1 0 0 0 1

    (x <- c(0,0,1,1,1,1,1)); f(x, 1:3)
    # [1] 0 0 1 1 1 1 1
    # [1] 0 0 1 2 3 1 2

    ## courtesy of @Frank this would also work
    (x <- c(0,0,1,1,0,0,1)); f(x, 0:2)
    # [1] 0 0 1 1 0 0 1
    # [1] 0 1 2 1 0 1 2
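Since the question asks for per-group behaviour, here is a small sketch of why the grouping matters. It uses base R's ave on a hypothetical toy data frame (df, g and the two result columns are made up for illustration) so that it runs without any packages; f is repeated from above to keep the block self-contained.

```r
f <- function(x, repl = c(1, 0, 0)) {
  sx <- seq_along(x)
  for (ii in seq_along(x))
    if (x[ii] == repl[1L])
      x[ii:(ii + length(repl) - 1)] <- repl
  x[sx]
}

# Hypothetical toy data: two groups of four rows each.
df <- data.frame(g   = rep(c("a", "b"), each = 4),
                 tag = c(1, 1, 1, 1, 1, 1, 0, 1))

df$tag_grouped   <- ave(df$tag, df$g, FUN = f)  # suppression restarts in each group
df$tag_ungrouped <- f(df$tag)                   # suppression leaks across the boundary
df
#   g tag tag_grouped tag_ungrouped
# 1 a   1           1             1
# 2 a   1           0             0
# 3 a   1           0             0
# 4 a   1           1             1
# 5 b   1           1             0
# 6 b   1           0             0
# 7 b   0           0             0
# 8 b   1           1             1
```

Row 5 differs: without grouping, the 1 kept in row 4 suppresses the first row of group b. With dplyr, the grouped version should correspond to group_by(g) followed by mutate(tag = f(tag)), in the same way flip_followers is applied to giris in the earlier answer.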