Creating group names for sequential values

This looks like an easy task, but I can't find a simpler way to do it. I have a vector x (below) and need to create group names for runs of consecutive equal values. My attempt uses rle; are there any better ideas?

    # data
    x <- c(1,1,1,2,2,2,3,2,2,1,1)

    # make groups
    rep(paste0("Group_", 1:length(rle(x)$lengths)), rle(x)$lengths)
    # [1] "Group_1" "Group_1" "Group_1" "Group_2" "Group_2" "Group_2" "Group_3" "Group_4"
    # [9] "Group_4" "Group_5" "Group_5"
4 answers

Using diff and cumsum:

    paste0("Group_", cumsum(c(1, diff(x) != 0)))
    #[1] "Group_1" "Group_1" "Group_1" "Group_2" "Group_2" "Group_2" "Group_3" "Group_4" "Group_4" "Group_5" "Group_5"

(If your values are floating point, you may need to avoid != and compare against a tolerance instead.)
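To illustrate the tolerance remark, here is a minimal sketch. The vector x_num and the cutoff tol are made up for this example and are not from the question; the right tolerance depends on the scale of your data.

```r
# Floating-point values that are "equal" only up to rounding error
# (0.1 + 0.2 is not exactly 0.3 in double precision)
x_num <- c(0.1 + 0.2, 0.3, 0.3, 1.5, 1.5, 0.3)

# tol is an assumed tolerance; pick one that suits your data
tol <- 1e-8

# A new group starts wherever the absolute jump exceeds the tolerance
group_id <- cumsum(c(1, abs(diff(x_num)) > tol))
groups <- paste0("Group_", group_id)
groups
# [1] "Group_1" "Group_1" "Group_1" "Group_2" "Group_2" "Group_3"
```

With the exact test diff(x_num) != 0, the first two elements would land in different groups, because their difference is tiny but not zero.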


Using rleid from data.table:

    library(data.table)
    paste0('Group_', rleid(x))
    #[1] "Group_1" "Group_1" "Group_1" "Group_2" "Group_2" "Group_2" "Group_3" "Group_4" "Group_4" "Group_5" "Group_5"

Using cumsum but not relying on numeric data:

    paste0("Group_", 1 + c(0, cumsum(x[-length(x)] != x[-1])))
    #[1] "Group_1" "Group_1" "Group_1" "Group_2" "Group_2" "Group_2" "Group_3" "Group_4" "Group_4" "Group_5" "Group_5"
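As a quick check that this really works on non-numeric input, here is the same idea applied to a character vector, where diff() would fail with a "non-numeric argument" error. The vector x_chr and the helper name is_new_run are just for illustration.

```r
# A character vector; diff() cannot be used here
x_chr <- c("a", "a", "b", "b", "b", "a", "c", "c")

# Compare each element with its predecessor; TRUE marks the start of a new run
is_new_run <- x_chr[-1] != x_chr[-length(x_chr)]

# The first element always opens group 1; every TRUE bumps the counter
groups <- paste0("Group_", 1 + c(0, cumsum(is_new_run)))
groups
# [1] "Group_1" "Group_1" "Group_2" "Group_2" "Group_2" "Group_3" "Group_4" "Group_4"
```

Note that NA values would propagate through the comparison and cumsum, so they probably need to be handled separately.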

group() from groupdata2 can create groups from a list of group starting points using the l_starts method. Setting n to "auto" makes it find the group starts automatically:

    x <- c(1,1,1,2,2,2,3,2,2,1,1)
    groupdata2::group(x, n = "auto", method = "l_starts")
    ## # A tibble: 11 x 2
    ## # Groups:   .groups [5]
    ##     data .groups
    ##    <dbl> <fct>
    ##  1     1 1
    ##  2     1 1
    ##  3     1 1
    ##  4     2 2
    ##  5     2 2
    ##  6     2 2
    ##  7     3 3
    ##  8     2 4
    ##  9     2 4
    ## 10     1 5
    ## 11     1 5

There is also a differs_from_previous() function that finds the values, or the indices of the values, that differ from the previous value by some threshold.

    library(groupdata2)

    # The values to start groups at
    differs_from_previous(x, threshold = 1, direction = "both")
    ## [1] 2 3 2 1

    # The indices to start groups at
    differs_from_previous(x, threshold = 1, direction = "both", return_index = TRUE)
    ## [1] 4 7 8 10
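For readers without groupdata2 installed, the same start points can be approximated in base R. This is only a sketch of the idea, not how groupdata2 implements it, and it assumes "differs" means an absolute change of at least threshold; start_idx and start_val are names chosen for this example.

```r
x <- c(1,1,1,2,2,2,3,2,2,1,1)
threshold <- 1

# diff(x)[i] compares x[i + 1] with x[i], so shift by one to index the later element
start_idx <- which(abs(diff(x)) >= threshold) + 1
start_idx
# [1]  4  7  8 10

# The values found at those positions
start_val <- x[start_idx]
start_val
# [1] 2 3 2 1
```

For this x, both results agree with the differs_from_previous() output shown above.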