Search for all inconsistent value combinations from multiple value lists

I have the following array containing arrays of values:

$array = array( array('1', '2'), array('a', 'b', 'c'), array('x', 'y'), ); 

There can be any number of arrays, and an array can contain any number of values. Currently, I have a piece of code that will generate all combinations where one value is taken from each array. eg:

 1ax, 1ay, 1bx, 1by, 1cx, 1cy, 2ax, 2ay, 2bx, 2by, 2cx, 2cy 

However, what I really want is only combinations in which only one value is in each column, i.e. 1ax is not good because all three values ​​1, a and x are in the first column, 1by is not good because b and y are in the second column. Therefore, from the above example, there will only be the following combinations:

 1cy, 2cx 

I initially planned to just generate all the combinations, and then filter out those with conflicts, but this does not scale, since this is a simplified example, in a real application there will be situations where there are potentially millions of combinations (including conflicting ones).

Can someone help with a better way to solve this problem? I work in PHP, but any code sample that demonstrates the logic will be useful.

Thanks in advance.


Update:

I tested solutions that work against a larger dataset to get some benchmarks, these are the results so far:

 $array = array( array('1', '2', '3', '1', '2', '3', '1', '2', '3', '1', '2', '3', '1', '2', '3'), array('a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd'), array('x', 'y', 'z', 'x', 'y', 'z', 'x', 'y', 'z'), array('1', '2', '3', '1', '2', '3', '1', '2', '3'), array('a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd'), array('x', 'y', 'z'), ); 

Josh Davis 2nd Solution:

 Combinations: 249480 Time: 0.3180251121521 secs Memory Usage: 22.012168884277 mb Peak Memory Usage: 22.03059387207 mb 

Josh Davis:

 Combinations: 249480 Time: 1.1172790527344 secs Memory Usage: 22.004837036133 mb Peak Memory Usage: 22.017387390137 mb 

Tom Hay:

 Combinations: 249480 Time: 5.7098741531372 secs Memory Usage: 39.145843505859 mb Peak Memory Usage: 39.145843505859 mb 
+4
source share
9 answers

This is one of those cases where self-generated code and brute force will beat most algorithms in terms of simplicity and performance. In previous answers, I saw a lot of recursion, array manipulation, and computation, when in fact what you want to do is:

 foreach ($array[0] as $k0 => $v0) { foreach ($array[1] as $k1 => $v1) { if ($k1 == $k0) { continue; } foreach ($array[2] as $k2 => $v2) { if ($k2 == $k1 || $k2 == $k0) { continue; } $result[] = $v0.$v1.$v2; } } } 

Of course, you cannot write this unless you know how many arrays are in $array . Where the generated code fits:

 $array = array( array('1', '2'), array('a', 'b', 'c'), array('x', 'y') ); $result = array(); $php = ''; foreach ($array as $i => $arr) { $php .= 'foreach ($array[' . $i . '] as $k' . $i . ' => $v' . $i . '){'; if ($i > 0) { $sep = 'if ('; $j = $i - 1; do { $php .= $sep . '$k' . $i . ' == $k' . $j; $sep = ' || '; } while (--$j >= 0); $php .= ') { continue; } '; } } $php .= '$result[] = $v' . implode('.$v', array_keys($array)) . ';' . str_repeat('}', count($array)); eval($php); print_r($result); 

Note that this routine assumes that $array is a null-based numerical index array, as in your example. It will generate the code above, adjusted for an arbitrary number of arrays.


Update

Here is an alternative algorithm. This is still spontaneous, but less rude. We still have nested loops, except that each loop runs on a copy of the array, where the keys that are currently used by the outer loops have been removed from this loop array. For example, if the values ​​should be (a, b, c), but the outer loops use indices 0 and 2, we remove "a" (index 0) and "c" (index 2), and all that remains is "b". This means that loops only work on possible combinations, and we no longer need an if condition.

In addition, and this part can be applied to the previous algorithm, we process arrays of values ​​in order from smallest to largest, to ensure that the indices used exist in the current array. The disadvantage is that it does not generate combinations in the same order. It generates the same combinations, but not in the same order. The code is as follows:

 $a0 = $array[0]; foreach ($a0 as $k0 => $v0) { $a2 = $array[2]; unset($a2[$k0]); foreach ($a2 as $k2 => $v2) { $a1 = $array[1]; unset($a1[$k0], $a1[$k2]); foreach ($a1 as $k1 => $v1) { $result[] = "$v0$v1$v2"; } } } 

The above procedure sets up a copy of the values ​​at the beginning of each cycle, and then deletes the values ​​used by the external cycles. You can improve this process by installing a copy of the values ​​only once at the beginning, delete the keys as they are used (at the beginning of each cycle) and put them back as they are released (at the end of each cycle). After that, the procedure looks like this:

 list($a0,$a1,$a2) = $array; foreach ($a0 as $k0 => $v0) { unset($a1[$k0], $a2[$k0]); foreach ($a2 as $k2 => $v2) { unset($a1[$k2]); foreach ($a1 as $k1 => $v1) { $result[] = "$v0$v1$v2"; } $a1[$k2] = $array[1][$k2]; } $a1[$k0] = $array[1][$k0]; $a2[$k0] = $array[2][$k0]; } 

The actual code that generates the source code above:

 $keys = array_map('count', $array); arsort($keys); $inner_keys = array(); foreach ($keys as $k => $cnt) { $keys[$k] = $inner_keys; $inner_keys[] = $k; } $result = array(); $php = 'list($a' . implode(',$a', array_keys($array)) . ')=$array;'; foreach (array_reverse($keys, true) as $i => $inner_keys) { $php .= 'foreach ($a' . $i . ' as $k' . $i . ' => $v' . $i . '){'; if ($inner_keys) { $php .= 'unset($a' . implode('[$k' . $i . '],$a', $inner_keys) . '[$k' . $i . ']);'; } } $php .= '$result[] = "$v' . implode('$v', array_keys($array)) . '";'; foreach ($keys as $i => $inner_keys) { foreach ($inner_keys as $j) { $php .= '$a' . $j . '[$k' . $i . ']=$array[' . $j . '][$k' . $i . "];\n"; } $php .= '}'; } eval($php); 
+3
source

An interesting problem! It turned out to be more complicated than I thought, but it seems to work.

The main strategy is to first order the arrays with the smallest in size (by tracking in which order they were, so I can output the answers in the correct order).

I save the responses as an array of indexes into this sorted array of input lists.

Now that the lists are sorted, I can save the first correct answer as an array (0,1,2, ..., n);

Then I return to the function to check all the values ​​in the first slot there (above), by replacing it with other values ​​in this answer array (all those that are not too large for this slot). Since I sorted it by size, I can move any value to the right when I exchange, without fear that it will be large for this correct slot.

the output of each valid slot has some crazy focus to cancel all sortings.

Sorry if this looks confusing. I did not spend much time cleaning it.

 <?php # $lists is an array of arrays function noconfcombos($lists) { $lengths = array(); foreach($lists as $list) { $lengths[] = count($list); } # find one solution (and make sure there is one) $answer = array(); $sorted_lengths = $lengths; asort($sorted_lengths); $answer_order_lists = array(); $answer_order_lengths = array(); $output_order = array(); $min = 1; $max_list_length = 0; foreach($sorted_lengths as $lists_key => $list_max) { if($list_max < $min) { # no possible combos return array(); } $answer[] = $min - 1; # min-1 is lowest possible value (handing out colums starting with smallest rows) $output_order[$lists_key] = $min - 1; # min-1 is which slot in $answers corresponds to this list $answer_order_lists[] = $lists[$lists_key]; $answer_order_lengths[] = $lengths[$lists_key]; ++$min; } ksort($output_order); $number_of_lists = count($lists); $max_list_length = end($sorted_lengths); if($max_list_length > $number_of_lists) { for($i = $number_of_lists; $i < $max_list_length; ++$i) { $answer[] = $i; } $stop_at = $number_of_lists; } else { $stop_at = $number_of_lists - 1; } # now $answer is valid (it has the keys into the arrays in $list for the # answer), and we can find the others by swapping around the values in # $answer. $ret = array(); $ret[] = noconfcombos_convert($answer, $answer_order_lists, $output_order); noconfcombos_recurse($ret, $max_list_length, $stop_at, $answer_order_lengths, $answer_order_lists, $output_order, $answer, 0); return $ret; } # try swapping in different indexes into position $index, from the positions # higher, then recurse function noconfcombos_recurse(&$ret, $max_list_length, $stop_at, &$lengths, &$lists, &$output_order, $answer, $index) { if($index < $stop_at) { noconfcombos_recurse($ret, $max_list_length, $stop_at, $lengths, $lists, $output_order, $answer, $index + 1); } for($other = $index + 1; $other < $max_list_length; ++$other) { if($answer[$other] < $lengths[$index]) { # && $answer[$index] < $lengths[$other]) { $tmp = $answer[$index]; $answer[$index] = $answer[$other]; $answer[$other] = $tmp; $ret[] = noconfcombos_convert($answer, $lists, $output_order); if($index < $stop_at) { noconfcombos_recurse($ret, $max_list_length, $stop_at, $lengths, $lists, $output_order, $answer, $index + 1); } } } } function noconfcombos_convert(&$indexes, &$lists, &$order) { $ret = ''; foreach($order as $i) { $ret .= $lists[$i][$indexes[$i]]; } return $ret; } function noconfcombos_test() { $a = array('1', '2', '3', '4'); $b = array('a', 'b', 'c', 'd', 'e'); $c = array('x', 'y', 'z'); $all = array($a, $b, $c); print_r(noconfcombos($all)); } noconfcombos_test(); 
+4
source

I think it works. It uses recursion to traverse a structure like a tree. For each branch, it keeps track of which columns are already made. It is probably slow because it is a brute force approach.

 <?php $array = array( array('1', '2'), array('a', 'b', 'c'), array('x', 'y'), ); function f($array, & $result, $colsToIgnore = array(), $currentPath = array()) { $row = array_shift($array); $length = count($row); $isMoreRows = !! count($array); for ($col = 0; $col < $length; $col++) { //check whether column has already been used if (in_array($col, $colsToIgnore)) { continue; } $item = $row[$col]; $tmpPath = $currentPath; $tmpPath[] = $item; if ($isMoreRows) { $tmpIgnoreCols = $colsToIgnore; $tmpIgnoreCols[] = $col; f($array, $result, $tmpIgnoreCols, $tmpPath); } else { $result[] = implode('', $tmpPath); } } } $result = array(); f($array, $result); print_r($result); 
+2
source

probably not the most elegant way, but does the trick (javascript)

 var result = []; for(i=0;i<arr1.length;i++) { for(j=0;j<arr2.length;j++) { if(j==i) continue; else { for(k=0;k<arr3.length;k++) { if(k==i||k==j) continue; else { result.push(arr1[i]+arr2[j]+arr3[k]); } } } } } 
+1
source

This can be reorganized using recursion, which allows you to work with any number of arrays. If I find time, I will try it myself.

ps. I do not know php, an example is written in Delphi.

Edit: recursive solution with arbitrary # arrays

 type TSingleArray = array of string; TMasterArray = array of TSingleArray; var solutions: array of integer; // Q&D container to hold currently used indexes of SingleArrays procedure WriteSolution(const masterArray: TMasterArray); var I: Integer; indexes: string; solution: string; begin for I := 0 to High(solutions) do begin indexes := indexes + IntToStr(solutions[I]) + ' '; solution := solution + masterArray[I][solutions[I]]; end; Writeln('Solution: ' + solution + ' Using indexes: ' + indexes); end; procedure FindSolution(const masterArray: TMasterArray; const singleArrayIndex: Integer; var bits: Integer); var I: Integer; begin for I := 0 to High(masterArray[singleArrayIndex]) do begin //***** Use bit manipulation to check if current index is already in use if bits and (1 shl I) = (1 shl I ) then continue; solutions[singleArrayIndex] := I; Inc(bits, 1 shl I); //***** If it is not the last array in our masterArray, continue by calling RecursArrays recursivly. if singleArrayIndex <> High(masterArray) then FindSolution(masterArray, Succ(singleArrayIndex), bits) else WriteSolution(masterArray); Dec(bits, 1 shl I); end; end; //*************** // Initialization //*************** var I, J: Integer; bits: Integer; singleArrayString: string; masterArray: TMasterArray; begin I := 10; SetLength(masterArray, I); for I := 0 to High(masterArray) do begin SetLength(masterArray[I], High(masterArray) + Succ(I)); singleArrayString := EmptyStr; for J := 0 to High(masterArray[I]) do begin masterArray[I][J] := IntToStr(J); singleArrayString := singleArrayString + masterArray[I][J]; end; WriteLn(singleArrayString) end; ReadLn; //****** Start solving the problem using recursion bits := 0; SetLength(solutions, Succ(High(masterArray))); FindSolution(masterArray, 0, bits); end. 
+1
source

Look at this from a different angle: to compose a row of results, you need to select a value for each column. Each value must be selected from a different row in the source. The problem is known as β€œchoose N from M” or more mathematically, Combination .

This means that the result string matches the array of indexes of the source string.

You can create all possible collections, starting to build such an index (pseudocode)

 function combinations( $source ) { if( count( $source ) == 0 ) return $source; $result=array(); // build one row foreach( $source as $index=>$value ) { $newsource = array_splice( $source, $index, 1 ); $reduced_combinations=combinations( $newsource ); foreach( $reduced_combinations as $reduced_combi ) { $newrow=array_unshift( $reduced_combi, $value ); $result[]=$newrow; } } return $result; } function retrieve_indices( $indices, $arrays ) { $result=array(); foreach( $indices as $column=>$index ) { $result[]=$arrays[$index][$column]; } return $result; } $source_arrays = array( array( "1", "2", "3" ), array( "a", "b", "c" ), array( "x", "y", "z" ) ); $index_combinations = combinations( range(0,2) ); $result=array(); foreach( $index_combinations as $combination ) { $result[]=retrieve_indices( $combination, $source_arrays ); } 
+1
source

Another option:

  $arr = array( array('1', '2'), array('a', 'b', 'c'), array('x', 'y'), ); //----- //assuming $arr consists of non empty sub-arrays function array_combinations($arr){ $max = 1; for ($i = 0; $i < count($arr); $i ++){ $max *= count($arr[$i]); } $matrix = array(); for ($i = 0; $i < $max; $i ++){ $matrix = array(); } $c_rep = 1; for ($i = count($arr) - 1; $i >= 0; $i --){ $c_rep *= ($i < count($arr) - 1)//last sub-array ? count($arr[$i + 1]) : 1; $k = 0; while ($k < $max){ for ($t = 0; $t < count($arr[$i]); $t ++){ for ($j = 0; $j < $c_rep; $j ++){ $matrix[$i][$k ++] = $arr[$i][$t]; } } } } return $matrix; } //----- $matrix = array_combinations($arr); 
+1
source

Your problem is similar to the problem of finding the determinant of a matrix . The best way imho would be to populate smaller arrays with some character, like "0", so they all have an equal number of values ​​in your example

 $array = array( array('1', '2', '0'), array('a', 'b', 'c'), array('x', 'y', '0'), ); 

then iterate over each of the first values ​​of the array and for each increment the index of the array by 1 and check the next array and the next column (in the first loop it will be β€œ1” and the index will be increased by 0 - 1, then get $ array 1 - 'b' etc.) if you reach β€œ0”, break if you reach the right border, reset the first index to 0. Then do the same with decrement and you will have all the combinations. Probably unclear, check the image associated with

0
source

try the following:

 function algorithmToCalculateCombinations($n, $elems) { if ($n > 0) { $tmp_set = array(); $res = algorithmToCalculateCombinations($n - 1, $elems); foreach ($res as $ce) { foreach ($elems as $e) { array_push($tmp_set, $ce . $e); } } return $tmp_set; } else { return array(''); } } $Elemen = array(range(0,9),range('a','z')); $Length = 3; $combinations = algorithmToCalculateCombinations($Length, $Elemen); 
0
source

All Articles