Count how many times an element is repeated or not repeated in sequence (R)

Question

I have a sequence of events encoded as A, B and C. For each element, I need to count how many times it has been repeated consecutively so far; if it is not an immediate repeat, the counter should instead decrease by one for each row since its last occurrence. At the first occurrence of each element, the counter is set to zero. For instance:

x<-c('A','A','A','B','C','C','A','B','A','C')
y<-c(0,1,2,0,0,1,-2,-4,-4,-3)
cbind(x,y)

      x   y   
 [1,] "A" "0" 
 [2,] "A" "1" 
 [3,] "A" "2" 
 [4,] "B" "0" 
 [5,] "C" "0" 
 [6,] "C" "1" 
 [7,] "A" "-2"
 [8,] "B" "-4"
 [9,] "A" "-4"
[10,] "C" "-3"

I need to create the column y from x. I know that I can use rle for the length of a run, but I do not know how to get the time since the last occurrence of a particular event in order to decrease the counter.

r sequence

Andrey Chetverikov Jul 6 '16 at 14:28

2 answers

A base R approach:

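# positions of each distinct element; lapply always returns a list,
# whereas sapply would collapse to a matrix when all counts are equal
positions <- lapply(unique(x), function(el) which(x == el))
# gap to the previous occurrence: +1 for an immediate repeat, -gap otherwise
values <- lapply(positions, function(p) { s <- diff(p); c(0, cumsum(ifelse(s > 1, -s, s))) })
df <- data.frame(positions = unlist(positions), values = unlist(values))
# restore the original row order
df[order(df$positions), "values"]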

user2100721 Jul 6 '16 at 15:59

Psidom · Accepted Answer · 2016-07-06T15:26:14+0000

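Here is one way to do it in R, using rleid from data.table. Loop over the unique elements of x and compute a separate index vector for each element: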

library(data.table)
sepIndex <- lapply(unique(x), function(i) { 
    s = cumsum(ifelse(duplicated(rleid(x == i)) & x == i, 1, -1)) + min(which(x == i)); 
    # use `rleid` with `duplicated` to find out the duplicated elements in each block.
    # and assign `1` to each duplicated element and `-1` otherwise and use cumsum for cumulative index
    # offset the index by the initial position of the element `min(which(x == i))`
    replace(s, x != i, NA) 
})

This gives:

sepIndex
# [[1]]
#  [1]  0  1  2 NA NA NA -2 NA -4 NA

# [[2]]
#  [1] NA NA NA  0 NA NA NA -4 NA NA

# [[3]]
#  [1] NA NA NA NA  0  1 NA NA NA -3
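
To see where these numbers come from, it can help to look at the intermediate vectors for a single element. The sketch below does this for 'A' only. It assumes a base R stand-in for rleid (built from rle) so that it runs without data.table, and the variable names are illustrative rather than taken from the answer.

```r
x <- c('A','A','A','B','C','C','A','B','A','C')

is_a <- x == 'A'                                  # TRUE where the element is 'A'
r <- rle(is_a)
run_id <- rep(seq_along(r$lengths), r$lengths)    # base R stand-in for rleid(is_a)
repeated <- duplicated(run_id) & is_a             # TRUE for the 2nd and later items of an 'A' run
step <- ifelse(repeated, 1, -1)                   # +1 for a repeat, -1 for every other row
counter <- cumsum(step) + min(which(is_a))        # shift so the first 'A' gets 0

# one row per intermediate vector, one column per position in x
rbind(run_id, repeated, step, counter)
```

Only the entries of counter at the positions where x is 'A' are kept; the answer's replace() call sets the rest to NA.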

Then use Reduce to merge the list into a single vector, taking the non-NA value at each position:

Reduce(function(x, y) ifelse(is.na(x), y, x), sepIndex)
#  [1]  0  1  2  0  0  1 -2 -4 -4 -3
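
For comparison, the same rule can also be written as a single pass over x in base R. This is only a sketch of an alternative, not the method from either answer, and run_counter is a hypothetical helper name. It assumes the rule as stated in the question: zero at the first occurrence, plus one for an immediate repeat, and otherwise minus one for every row since the element was last seen.

```r
# Hypothetical helper (not from the original thread): walks x once and keeps,
# for every distinct element, its last position and its last counter value.
run_counter <- function(x) {
  last_pos <- list()              # last row at which each element was seen
  last_val <- list()              # counter value assigned at that row
  y <- numeric(length(x))
  for (i in seq_along(x)) {
    key <- as.character(x[i])
    if (is.null(last_pos[[key]])) {
      y[i] <- 0                   # first occurrence starts at zero
    } else {
      gap <- i - last_pos[[key]]
      # immediate repeat: +1; otherwise lose one per row since last seen
      y[i] <- if (gap == 1) last_val[[key]] + 1 else last_val[[key]] - gap
    }
    last_pos[[key]] <- i
    last_val[[key]] <- y[i]
  }
  y
}

x <- c('A','A','A','B','C','C','A','B','A','C')
run_counter(x)
```

The loop trades the vectorised style of the answers for explicit state, which makes the rule easier to check by hand against the example in the question.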
