How to find consecutive numbers among several arrays?

I'll start with an example. Suppose I have 3 arrays a, b, c, such as

    a = c(3,5)
    b = c(6,1,8,7)
    c = c(4,2,9)

From these I should be able to extract the consecutive triplets, i.e.,

 c(1,2,3),c(4,5,6) 

But this is just an example. My real dataset is larger, with more than 10 arrays, so the code would have to find a run of ten consecutive numbers.

So can anyone provide a general algorithm to find a consecutive series of length n among n arrays, taking one number from each array?

I actually work in R, so code in R is preferred. However, an algorithm in any language is more than welcome.

+7
arrays algorithm r permutation number-theory
4 answers

First, reorganize the data into a single list of pairs, each holding a value and the number of the array it came from. Sort the list by value; you would get something like:

    1-2
    2-3
    3-1   (i.e. "there's a three in array 1")
    4-3
    5-1
    6-2
    7-2
    8-2
    9-3

Then walk through the sorted list: check whether there are n consecutive numbers, then check whether they all come from different arrays.
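The steps above could be sketched in R roughly as follows. This is only an illustration under two assumptions: the values are integers, and no value occurs in more than one array. The name find_runs is hypothetical and not part of the answer.

```r
# Hypothetical helper: find runs of n consecutive integers that take
# exactly one value from each of the n input vectors.
# Assumes integer values with no value shared between arrays.
find_runs <- function(arrays) {
  n <- length(arrays)

  # Pair every value with the index of its array, then sort by value
  value    <- unlist(arrays, use.names = FALSE)
  array_id <- rep(seq_along(arrays), lengths(arrays))
  ord      <- order(value)
  value    <- value[ord]
  array_id <- array_id[ord]

  runs <- list()
  if (length(value) < n) return(runs)

  # Slide a window of length n over the sorted values
  for (start in seq_len(length(value) - n + 1)) {
    win <- start:(start + n - 1)
    consecutive <- all(diff(value[win]) == 1)      # values step by exactly 1
    distinct    <- !anyDuplicated(array_id[win])   # every array used once
    if (consecutive && distinct) {
      runs[[length(runs) + 1]] <- value[win]
    }
  }
  runs
}

runs <- find_runs(list(c(3, 5), c(6, 1, 8, 7), c(4, 2, 9)))
print(runs)
```

Because the window has length n and must contain n different array numbers, each array contributes exactly one value. After the sort, the scan itself is a single pass over the combined values.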

+7

Here is one approach. It assumes that the combined observations form a sequence with no gaps. Here is the data.

    N <- 3
    a <- c(3,5)
    b <- c(6,1,8,7)
    c <- c(4,2,9)

Then I combine them and sort them by observation.

    dd <- lattice::make.groups(a,b,c)
    dd <- dd[order(dd$data),]

Now I look for runs of N successive rows in this table in which all three groups are represented.

 idx <- apply(embed(as.numeric(dd$which),N), 1, function(x) { length(unique(x))==N }) 

Then we can see triplets with

    lapply(which(idx), function(i) { dd[i:(i+N-1),] })
    # [[1]]
    #    data which
    # b2    1     b
    # c2    2     c
    # a1    3     a
    #
    # [[2]]
    #    data which
    # c1    4     c
    # a2    5     a
    # b1    6     b
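The window test above looks only at the group labels, which is why the approach relies on the no-gaps assumption. A minimal sketch of the same sliding-window idea with an extra check on the values is shown below. It uses a base-R data frame as a stand-in for lattice::make.groups() and assumes distinct integer values; the names groups, grp_win, val_win and starts are made up for the illustration.

```r
N <- 3
groups <- list(a = c(3, 5), b = c(6, 1, 8, 7), c = c(4, 2, 9))

# Base-R stand-in for lattice::make.groups(): values plus a group label
dd <- data.frame(
  data  = unlist(groups, use.names = FALSE),
  which = factor(rep(names(groups), lengths(groups)))
)
dd <- dd[order(dd$data), ]

# embed() yields one row per window of N successive elements;
# each row lists its window in reverse order (latest element first)
grp_win <- embed(as.numeric(dd$which), N)
val_win <- embed(dd$data, N)

# Every group appears in the window
all_groups <- apply(grp_win, 1, function(x) length(unique(x)) == N)

# Sorted, distinct integers are consecutive when last - first == N - 1
no_gaps <- val_win[, 1] - val_win[, N] == N - 1

starts <- which(all_groups & no_gaps)
lapply(starts, function(i) dd[i:(i + N - 1), ])
```

With the example data, starts holds the first row of each qualifying window, so the final lapply() extracts the same two triplets as before.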
+5

Here is a brute-force method using expand.grid on three vectors, as in the example.

    # get all combinations
    df <- expand.grid(a,b,c)

Use combn to compute the absolute difference for each pair of columns.

    # get all pairwise differences (columns of df are looked up by name)
    myDiffs <- combn(names(df), 2, FUN=function(x) abs(df[[x[1]]] - df[[x[2]]]))
    # subset data using `rowSums` and `which`
    df[which(rowSums(myDiffs == 1) == ncol(myDiffs)-1), ]
    #    Var1 Var2 Var3
    # 2     5    6    4
    # 11    3    1    2
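For more than three vectors, the pairwise-difference test no longer applies directly. One possible generalisation, offered only as a sketch, is to keep the rows whose values are all distinct and whose range is exactly n - 1. It assumes integer values; the names vecs, combos, is_run and hits are hypothetical.

```r
vecs <- list(a = c(3, 5), b = c(6, 1, 8, 7), c = c(4, 2, 9))
n <- length(vecs)

# One row per way of picking a single value from each vector
combos <- do.call(expand.grid, vecs)

# A row is a run of n consecutive integers when its values are all
# distinct and max - min is exactly n - 1
is_run <- apply(as.matrix(combos), 1, function(x) {
  !anyDuplicated(x) && max(x) - min(x) == n - 1
})

hits <- combos[is_run, ]
print(hits)
```

The number of rows in combos is the product of the vector lengths, so this gets expensive quickly with ten or more arrays.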
+2

I hacked together a small recursive function that finds all consecutive triplets among however many vectors you pass to it (you need to pass at least three). It is probably a bit crude, but it seems to work.

The function uses the ellipsis ... to accept its arguments. It takes however many arguments (i.e. numeric vectors) you supply and puts them in the list items. Then the smallest value in each passed vector is located, along with its index.

Next, the indices of the vectors holding the smallest triplet are determined and iterated over with a for() loop, in which the selected values are appended to the output vector out. The input vectors in items are trimmed and passed back to the function recursively. Only when all vectors are NA, i.e. there are no more values in the vectors, does the function return the final result.

    library(magrittr)

    # define function to find the triplets
    tripl <- function(...){
      items <- list(...)

      # find the smallest number in each passed vector, along with its index
      # output is a matrix of n-by-2, where n is the number of passed arguments
      triplet.id <- lapply(items, function(x){
        if(is.na(x) %>% prod) id <- c(NA, NA)
        else id <- c(which(x == min(x)), x[which(x == min(x))])
      }) %>% unlist %>% matrix(., ncol=2, byrow=T)

      # find the smallest triplet from the passed vectors
      index <- order(triplet.id[,2])[1:3]

      # create empty vector for output
      out <- vector()

      # go through the smallest triplet indices
      for(i in index){
        # .. append the corresponding item from the input vector to the out vector
        # .. and remove the value from the input vector
        if(length(items[[i]]) == 1) {
          out <- append(out, items[[i]])
          # .. if the input vector has no value left fill with NA
          items[[i]] <- NA
        } else {
          out <- append(out, items[[i]][triplet.id[i,1]])
          items[[i]] <- items[[i]][-triplet.id[i,1]]
        }
      }

      # recurse until all vectors are empty (NA)
      if(!prod(unlist(is.na(items)))) out <- append(list(out), do.call("tripl", c(items), quote = F))
      else(out <- list(out))

      # return result
      return(out)
    }

The function is called by passing the input vectors as arguments.

    # input vectors
    a = c(3,5)
    b = c(6,1,8,7)
    c = c(4,2,9)

    # find all the triplets using our function
    y <- tripl(a,b,c)

The result is a list containing all the necessary information, albeit unordered.

    print(y)
    # [[1]]
    # [1] 1 2 3
    #
    # [[2]]
    # [1] 4 5 6
    #
    # [[3]]
    # [1]  7  9 NA
    #
    # [[4]]
    # [1]  8 NA NA

You can sort each triplet with sapply():

    # put everything in order
    sapply(y, function(x){x[order(x)]}) %>% t
    #      [,1] [,2] [,3]
    # [1,]    1    2    3
    # [2,]    4    5    6
    # [3,]    7    9   NA
    # [4,]    8   NA   NA

The catch is that it uses only one value from each vector per triplet. Therefore, it will not find the consecutive triplet c(6,7,8) among, for example, c(6,7,11), c(8,9,13) and c(10,12,14). In this case, it returns c(6,8,10) (see below).

    a <- c(6,7,11)
    b <- c(8,9,13)
    c <- c(10,12,14)

    y <- tripl(a,b,c)
    sapply(y, function(x){x[order(x)]}) %>% t
    #      [,1] [,2] [,3]
    # [1,]    6    8   10
    # [2,]    7    9   12
    # [3,]   11   13   14
+1
