rle-like function that catches a "run" of adjacent integers

I'm sure you all agree that rle is one of those gotcha functions in R. Is there a similar function that can catch a run of adjacent integer values?

So, if I have a vector like this:

 x <- c(3:5, 10:15, 17, 22, 23, 35:40) 

and I call this esoteric function, I get an answer like this:

 lengths: 3, 6, 1, 2, 6
 values: (3,4,5), (10,11,12... # you get the point

It's not so difficult to write such a function, but still ... any ideas?

4 answers

1) Calculate the values and then the lengths based on the values

 s <- split(x, cumsum(c(0, diff(x) != 1)))
 run.info <- list(lengths = unname(sapply(s, length)), values = unname(s))

Running with x from the question gives the following:

 > str(run.info)
 List of 2
  $ lengths: int [1:5] 3 6 1 2 6
  $ values :List of 5
   ..$ : num [1:3] 3 4 5
   ..$ : num [1:6] 10 11 12 13 14 15
   ..$ : num 17
   ..$ : num [1:2] 22 23
   ..$ : num [1:6] 35 36 37 38 39 40
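To see why the split() call in solution 1 groups the runs correctly, it may help to look at the intermediate vectors. The following is only an illustrative sketch; the names breaks and run_id are made up here and do not appear in the answer.

```r
# Example vector from the question
x <- c(3:5, 10:15, 17, 22, 23, 35:40)

# TRUE wherever the step from the previous element is not exactly 1,
# i.e. wherever a new run begins (hypothetical helper name)
breaks <- diff(x) != 1

# Prepend 0 for the first element and accumulate, so that every
# element is labelled with the id of the run it belongs to
run_id <- cumsum(c(0, breaks))
print(run_id)
# 0 0 0 1 1 1 1 1 1 2 3 3 4 4 4 4 4 4

# split() then collects the elements of x that share a run id
runs <- split(x, run_id)
print(unname(lengths(runs)))
# 3 6 1 2 6
```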

2) Calculate lengths and then values based on lengths

Here is a second solution based on Gregor's length calculation:

 lens <- rle(x - seq_along(x))$lengths
 list(lengths = lens,
      values = unname(split(x, rep(seq_along(lens), lens))))
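The trick in x - seq_along(x) is that, inside a run of consecutive integers, both the value and the position increase by 1, so their difference stays constant; it changes only where a run breaks. A small sketch of that intermediate step (the name offset is chosen here for illustration):

```r
# Example vector from the question
x <- c(3:5, 10:15, 17, 22, 23, 35:40)

# Value minus position: constant within each run of consecutive integers
offset <- x - seq_along(x)
print(offset)
# 2 2 2 6 6 6 6 6 6 7 11 11 22 22 22 22 22 22

# Ordinary rle() on the offsets therefore yields the run lengths
lens <- rle(offset)$lengths
print(lens)
# 3 6 1 2 6
```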

3) Calculate lengths and values without using others

This seems inefficient, since it computes the lengths and the values each from scratch, and it is somewhat overly complicated, but it reduces everything to a single statement, so I thought I would add it. It is basically just a mixture of solutions 1) and 2) above and adds nothing really new.

 list(lengths = rle(x - seq_along(x))$lengths,
      values = unname(split(x, cumsum(c(0, diff(x) != 1)))))

EDIT: Added second solution.

EDIT: Added third solution.


What about

 rle(x - 1:length(x))$lengths # 3 6 1 2 6 

The lengths are what you want. I don't have an equally clever one-liner for the values, but with cumsum() and the original x they are easy to recover.
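As a rough sketch of how cumsum() and the original x could give the values: the cumulative sum of the lengths is the position where each run ends, and subtracting the length gives where it starts. The names ends, starts and values are illustrative, and seq_along(x) is used in place of 1:length(x) so that an empty x does not misbehave.

```r
# Example vector from the question
x <- c(3:5, 10:15, 17, 22, 23, 35:40)
lens <- rle(x - seq_along(x))$lengths

# Position of the last and first element of each run
ends   <- cumsum(lens)       # 3 9 10 12 18
starts <- ends - lens + 1    # 1 4 10 11 13

print(x[starts])
# 3 10 17 22 35
print(x[ends])
# 5 15 17 23 40

# Full runs as a list, one vector per run
values <- Map(function(s, e) x[s:e], starts, ends)
```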


As you say, just write something similar to rle. Indeed, tweaking the code for rle by adding + 1 might give something like

 rle_consec <- function(x) {
   if (!is.vector(x) && !is.list(x))
     stop("'x' must be an atomic vector")
   n <- length(x)
   if (n == 0L)
     return(structure(list(lengths = integer(), values = x),
                      class = "rle_consec"))
   y <- x[-1L] != x[-n] + 1
   i <- c(which(y | is.na(y)), n)
   structure(list(lengths = diff(c(0L, i)), values = x[i]),
             class = "rle_consec")
 }

and using your example

 > x <- c(3:5, 10:15, 17, 22, 23, 35:40)
 > rle_consec(x)
 $lengths
 [1] 3 6 1 2 6

 $values
 [1]  5 15 17 23 40

 attr(,"class")
 [1] "rle_consec"

as John expected.

You can customize the code further to give the first of each consecutive subsequence, rather than the last.
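One possible way to do that, as a sketch only: since i holds the index of the last element of each run, subtracting the run length and adding 1 gives the index of the first element. The function name rle_consec_first is invented here for illustration.

```r
# Variant of rle_consec() that reports the first value of each run.
# The name rle_consec_first is hypothetical, not from the answer.
rle_consec_first <- function(x) {
  if (!is.vector(x) && !is.list(x))
    stop("'x' must be an atomic vector")
  n <- length(x)
  if (n == 0L)
    return(structure(list(lengths = integer(), values = x),
                     class = "rle_consec"))
  y <- x[-1L] != x[-n] + 1            # TRUE where a run ends
  i <- c(which(y | is.na(y)), n)      # index of the last element of each run
  lens <- diff(c(0L, i))              # run lengths
  first <- i - lens + 1L              # index of the first element of each run
  structure(list(lengths = lens, values = x[first]),
            class = "rle_consec")
}

x <- c(3:5, 10:15, 17, 22, 23, 35:40)
res <- rle_consec_first(x)
print(res$lengths)
# 3 6 1 2 6
print(res$values)
# 3 10 17 22 35
```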


I recently posted my seqle code, based on the code posted here before :-).

You can find it under "detect intervals of consecutive integer sequences".
