Simplify this grid so that each row and column have 1 value

Sample code here:

> temp2 abcdefgh i 1 1 0 0 0 1 0 1 j 0 1 0 0 0 1 0 1 k 0 1 1 0 0 1 1 1 l 0 0 0 0 1 0 0 1 m 0 0 1 1 0 0 1 1 n 0 0 1 1 0 0 1 1 o 0 0 0 1 0 0 1 1 p 0 0 0 0 1 0 0 1 > dput(temp2) structure(list(a = c(1, 0, 0, 0, 0, 0, 0, 0), b = c(1, 1, 1, 0, 0, 0, 0, 0), c = c(0, 0, 1, 0, 1, 1, 0, 0), d = c(0, 0, 0, 0, 1, 1, 1, 0), e = c(0, 0, 0, 1, 0, 0, 0, 1), f = c(1, 1, 1, 0, 0, 0, 0, 0), g = c(0, 0, 1, 0, 1, 1, 1, 0), h = c(1, 1, 1, 1, 1, 1, 1, 1)), .Names = c("a", "b", "c", "d", "e", "f", "g", "h"), class = "data.frame", row.names = c("i", "j", "k", "l", "m", "n", "o", "p")) 

I have this 8x8 grid of 1s and 0s. I need to solve for some grid, where each row and each column have exactly one 1 and the rest 0, but 1 should be in the place where the original grid has 1. This is almost like a sudoku question, but not really. Any thoughts on how to get started?

I need some function that can do this for a common grid, and not just for that specific one. We can assume that there is always a lattice of solutions, given some starting grid.

Thanks!

Edit: current solution

 > temp3 abcdefgh i 1 0 0 0 0 0 0 0 j 0 1 0 0 0 0 0 0 k 0 0 0 0 0 1 0 0 l 0 0 0 0 1 0 0 0 m 0 0 0 1 0 0 0 0 n 0 0 1 0 0 0 0 0 o 0 0 0 0 0 0 1 0 p 0 0 0 0 0 0 0 1 

EDIT2: considering that only 8! unique solutions for any grid, I can try to apply brute force / match.

+7
r
source share
5 answers

This can be solved as a transfer problem or an integer programming problem. We also show a single-line solution that uses only the R base, which generates random matrices for which each row and each column of columns are summed to 1, filtering and returning those that satisfy the additional restrictions that each element of the solution matrix is ​​less than or equal to the corresponding temp2 element.

1) transportation problem Using lp.transport in lpSolve, we can solve it in one statement:

 library(lpSolve) res <- lp.transport(as.matrix(temp2), "max", rep("=", 8), rep(1, 8), rep("=", 8), rep(1, 8), integers = 0:1) res ## Success: the objective function is 8 soln <- array(res$solution, dim(temp2)) # verify all(colSums(soln)==1) && all(rowSums(soln)==1) && all(temp2>=soln) && all(soln %in% 0:1) ## [1] TRUE 

2) integer programming

If X is the solution, we specified the row and column restrictions, but did not specify the restrictions X <= temp2, since they will be satisfied automatically, since no solution puts 1, where temp2 0 can have a maximum goal of 8.

 library(lpSolve) n <- nrow(temp2) obj <- unlist(temp2) const_row <- t(sapply(1:n, function(i) c(row(temp2)) == i)) # each row sums to 1 const_col <- t(sapply(1:n, function(i) c(col(temp2)) == i)) # each col sums to 1 const.mat <- rbind(const_row, const_col) res <- lp("max", obj, const.mat, "=", 1, all.bin = TRUE) res ## Success: the objective function is 8 soln <- array(res$solution, dim(temp2)) # verify all(colSums(soln)==1) && all(rowSums(soln)==1) && all(temp2>=soln) && all(soln %in% 0:1) ## [1] TRUE 

(Note that with the same argument we could ease the problem to the linear programming problem if we add 0 <= soln [i, j] <= 1 constraints, since the same argument allowed us to omit soln [i, j] <= temp2 [i, j] limits the maximization, as a result of which the soln elements will be 0 or 1.)

2a) This approach is longer, but explicitly indicates the limitations of X <= temp2:

 n <- nrow(temp2) obj <- numeric(n*n) const1 <- diag(n*n) # soln[i,j] <= temp2[i,j] const2 <- t(sapply(1:n, function(i) c(row(temp2)) == i)) # each row sums to 1 const3 <- t(sapply(1:n, function(i) c(col(temp2)) == i)) # each col sums to 1 const.mat <- rbind(const1, const2, const3) const.dir <- rep(c("<=", "="), c(n*n, 2*n)) const.rhs <- c(unlist(temp2), rep(1, 2*n)) res <- lp("max", obj, const.mat, const.dir, const.rhs, all.bin = TRUE) res ## Success: the objective function is 0 soln <- array(res$solution, dim(temp2)) # verify all(colSums(soln)==1) && all(rowSums(soln)==1) && all(temp2>=soln) && all(soln %in% 0:1) ## [1] TRUE 

2b) Note that if X is a solution matrix, then in X <= temp2 only the positions of X corresponding to zeros in temp2 actually limit, so we could eliminate any restriction corresponding to 1 in temp2 in (2a) solution. With this change, all restrictions become limitations of equality.

 n <- nrow(temp2) obj <- numeric(n*n) const1 <- diag(n*n)[unlist(temp2) == 0, ] const2 <- t(sapply(1:n, function(i) c(row(temp2)) == i)) # each row sums to 1 const3 <- t(sapply(1:n, function(i) c(col(temp2)) == i)) # each col sums to 1 const.mat <- rbind(const1, const2, const3) const.dir <- "=" const.rhs <- c(numeric(nrow(const1)), rep(1, 2*n)) res <- lp("max", obj, const.mat, const.dir, const.rhs, all.bin = TRUE) res ## Success: the objective function is 0 soln <- array(res$solution, dim(temp2)) # verify all(colSums(soln)==1) && all(rowSums(soln)==1) && all(temp2>=soln) && all(soln %in% 0:1) ## [1] TRUE 

In fact, we could go further and remove the variables corresponding to the null elements of temp2 .

3) r2dtable Here we use rd2table to generate 10,000 8x8 tables whose rows and columns sum to 1 and then filter them to pick out only those satisfying the X < temp2 constrainsts. With rd2table to generate 10,000 8x8 tables whose rows and columns sum to 1 and then filter them to pick out only those satisfying the X < temp2 constrainsts. With temp2` of the question, and showing a random seed, found 3 solutions. If with different inputs he does not find solutions, try to create a larger number of random sentences. This approach does not use any packages.

 set.seed(123) # for reproducibility Filter(function(x) all(x <= temp2), r2dtable(10000, rep(1, 8), rep(1, 8))) 

giving:

 [[1]] [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [1,] 1 0 0 0 0 0 0 0 [2,] 0 0 0 0 0 1 0 0 [3,] 0 1 0 0 0 0 0 0 [4,] 0 0 0 0 0 0 0 1 [5,] 0 0 0 0 0 0 1 0 [6,] 0 0 1 0 0 0 0 0 [7,] 0 0 0 1 0 0 0 0 [8,] 0 0 0 0 1 0 0 0 [[2]] [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [1,] 1 0 0 0 0 0 0 0 [2,] 0 0 0 0 0 1 0 0 [3,] 0 1 0 0 0 0 0 0 [4,] 0 0 0 0 1 0 0 0 [5,] 0 0 0 1 0 0 0 0 [6,] 0 0 1 0 0 0 0 0 [7,] 0 0 0 0 0 0 1 0 [8,] 0 0 0 0 0 0 0 1 [[3]] [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [1,] 1 0 0 0 0 0 0 0 [2,] 0 1 0 0 0 0 0 0 [3,] 0 0 0 0 0 1 0 0 [4,] 0 0 0 0 1 0 0 0 [5,] 0 0 1 0 0 0 0 0 [6,] 0 0 0 0 0 0 1 0 [7,] 0 0 0 1 0 0 0 0 [8,] 0 0 0 0 0 0 0 1 
+9
source share

Brute force method:

 m = as.matrix(temp2) w = data.frame(which(m == 1, arr.ind = TRUE)) combos = as.matrix(do.call(expand.grid, with(w, split(col, row)))) combos[ apply(combos, 1, function(x) !anyDuplicated(x)), ] 1 2 3 4 5 6 7 8 [1,] 1 6 2 8 7 3 4 5 [2,] 1 2 6 8 7 3 4 5 [3,] 1 6 2 8 3 7 4 5 [4,] 1 2 6 8 3 7 4 5 [5,] 1 6 2 8 4 3 7 5 [6,] 1 2 6 8 4 3 7 5 [7,] 1 6 2 8 3 4 7 5 [8,] 1 2 6 8 3 4 7 5 [9,] 1 6 2 5 7 3 4 8 [10,] 1 2 6 5 7 3 4 8 [11,] 1 6 2 5 3 7 4 8 [12,] 1 2 6 5 3 7 4 8 [13,] 1 6 2 5 4 3 7 8 [14,] 1 2 6 5 4 3 7 8 [15,] 1 6 2 5 3 4 7 8 [16,] 1 2 6 5 3 4 7 8 

The OP claims that you ever need to handle an 8x8 grid, so I think this works quite well. Each line of the result is a decision. The first line says that (1,1), (2,6), (3,2) ... is a solution.


Option using data.table:

 library(data.table) m = as.matrix(temp2) comboDT = setDT(melt(m))[ value == 1, do.call(CJ, split(Var2, Var1)) ][, rid := .I ][, melt(.SD, id="rid", variable.name="row", value.name="col")] setkey(comboDT, rid) comboDT[ .( comboDT[, !anyDuplicated(col), by=rid][(V1), rid]) ] 
+6
source share

it works. Let the grid be my grid (temp2 on top). then it will return a grid that works

 # create random sufficient grid counter = 0 while(2 > 1) { counter = counter + 1 if(counter == 10000) { break } rand_grid = matrix(0, nrow = 8, ncol = 8) indices_avail = seq(1,8,by=1) for(i in 1:8) { k = sample(indices_avail, 1) rand_grid[i, k] = 1 indices_avail = indices_avail[indices_avail != k] } if(sum(grid[which(rand_grid == 1)]) == 8) { break } print(counter) } 
+1
source share

This approach will return all valid combinations. First find all combinations of string combinations. Then do an exhaustive search. This method should be improved if the size of your matrix increases. One simple improvement would be to run a parallel diag test.

 st<-as.matrix(temp2) # make sure we are working with matrices ## This method will return all possible matrices of combinations ## in essence if you have diag(matr) = width matrix than you have ## a valid choice ## Helper function to build all combinations, there may be better way to ## do this but it gets the job done allCombinationsAux<-function(z,nreg,x){ if(sum(nreg)>1){ innerLoop<-do.call(rbind,lapply(x[nreg&(z!=x)], test1,nreg&(z!=x),x)) ret<-cbind(z,innerLoop ) } else{ ret<-x[nreg] } ret } ## Build all of the combinations of possible matrices combs<-do.call(rbind,lapply(x,function(y) allCombinationsAux(y,y!=x,x))) ## iterate through all the possible combinations of matrices, to find out ## which ones have 1s throughout the diag inds<-which(apply(combs,1,function(x) sum(diag(st[x,]))==8)) lapply(inds,function(x) st[combs[x,],]) 
+1
source share

Despite the fact that there are already great answers to the approach with brute force and actually using mathematics, just for strikes, here is a version that guesses and checks for lags of inconsistent columns. For the example in question, this is really quite fast, and as a bonus, you can find a new answer to any particular run! So funny! To code:

 set.seed(47) # remove this to have more fun mat.in <- as.matrix(temp2) # we'll work in matrices mat.out <- diag(8) # a starting guess dimnames(mat.out) <- dimnames(mat.in) # make our answer pretty iteration <- 1 # for kicks, a loop counter while (any((mat.out != mat.in)[as.logical(mat.out)])) { mat.ref <- mat.out mat.out <- mat.out[, sample(8)] # make this deterministic if you like inner <- 1 # don't repeat yourself (too much) while (any(mat.out != mat.ref) & inner <= 8) { mat.ref <- mat.out # find non-matching indices and lag those columns to.lag <- which((mat.out != mat.in)[as.logical(mat.out)]) i <- 1:8 i[to.lag] <- c(to.lag[length(to.lag)], to.lag[-length(to.lag)]) mat.out <- mat.out[, i] cat(inner, " ") # let see what it does inner <- inner + 1 } print(iteration) # whoo, scrolling numbers iteration <- iteration + 1 } ## 1 2 3 [1] 1 ## 1 2 3 4 5 6 7 8 [1] 2 ## 1 2 [1] 3 ## 1 2 3 [1] 4 

which for this particular seed returns

 mat.out ## acegdbfh ## i 1 0 0 0 0 0 0 0 ## j 0 0 0 0 0 1 0 0 ## k 0 1 0 0 0 0 0 0 ## l 0 0 0 0 1 0 0 0 ## m 0 0 1 0 0 0 0 0 ## n 0 0 0 0 0 0 1 0 ## o 0 0 0 1 0 0 0 0 ## p 0 0 0 0 0 0 0 1 

This, of course, could be further optimized, but it is already quite fast (without printing, which slows it down):

 Unit: microseconds expr min lq mean median uq max neval let guess 137.796 383.6445 838.2327 693.819 1163.08 2510.436 100 

works all 100 times in a split second. This is pretty quick than actually guessing (chopping the inner loop):

 Unit: microseconds expr min lq mean median uq max neval cld guess smart 148.997 349.916 848.6314 588.162 1085.841 3117.78 100 a actually guess 322.458 7341.961 31197.1237 20012.969 47677.501 160250.02 100 b 

Please note that this luck plays a role here, and if there are fewer solutions, it will take more time. If there are no solutions, it will work forever. Of course, this could be optimized in order to avoid such a fate by making sure that it does not use the repeated initial permutation provided by sample(8) (a good idea, regardless of what I considered redundant here, since it only goes through a few permutations each works anyway). Take it away.

0
source share

All Articles