An effective way to find explanations of the test matrix method (mathematical problem)

Question

An effective way to find explanations of the test matrix method (mathematical problem)

Setup:

I have a boolean matrix, for example.

  10
 eleven
 0 1

where the rows are called m1 , m2 , m3 (as methods) and the columns t1 and t2 (as tests).

Definition:

An explanation is a set of rows (methods) that in a union have at least one "1" in any column (each test must match at least one method).

in our example, the set of explanations will be as follows:

{ {m2}, {m1,m2},{m1,m3},{m2,m3}, {m1,m2,m3} }

Problem:

Now I want to calculate all the explanations.

Now I already have two implementations that can solve the problem, one looking for explanations from top to bottom, the other from bottom to top, but both of them suffer from an exponential increase in computational time (doubles, increasing the number of lines by one).

Is this a known (possibly solved effectively) mathematical problem?

What can make it easier is that in the end I only need the number of occurrences in the explanations for each method. In our example, this would be for m1 three occurrences, for m2 four occurrences and for m3 three occurrences.

My current algorithms work fine until they say 26 lines. But then it becomes very slow.

Thank you for your help!

+4

performance math algorithm matrix

Mario bertschler Jul 30 '10 at 14:17

source share

3 answers

user382751 · Answer 1 · 2010-07-30T17:16:45+0000

If you can count on approximate probabilities and want something scalable, Gibbs fetching can work. The basic idea is pretty simple: start by explaining all the lines and repeat the following to try a bunch of explanations.

Select a random line.
Flip a coin.
If the coin comes to the head, add a line to the explanation (do nothing if it already exists).
If the coin comes out of the tail, try to remove the line from the explanation. If the result is not an explanation, return the string.

In the limit, the proportion of samples containing a given row converges to its true value. There are some practical implementations for the keywords “Bayesian inference using the Gibbs sample” (you have a formal form and observe that for each column the disjunction of rows, incidents with it, is true). Since I am not an expert in this matter, I cannot advise you how dangerous your own is.

dmuir · Answer 2 · 2010-08-02T12:11:07+0000

I think this is likely to be an exponential problem. For example, if one of the methods has one in each column, then any subset of the methods containing this method is an explanation, and therefore, if there are M methods, there are at least 2 ^ (M-1) explanations; similarly, if any pair of methods together has one in any column, that is, at least 2 ^ (M-2) explanations.

Here is a method that, although still exponential, I think is faster than listing all the explanations, especially when there are methods with many 1s.

Let T (A, B) be the number of subsets of A (a set of methods) that have at least one 1 in each column in B (a set of columns).

If B is empty, T (A, B) is the number of subsets of A, i.e. 2 ^ # A, where A has elements #A. Otherwise, if A is empty, T (A, B) is 0. Otherwise, if I am an element from A (for example, the first),

T (A, B) = T (A \ {i}, B \ m [i]) + T (A \ {i}, B)

(here A \ {i} is A without i, B \ m [i] - B without any of the columns in method i)

T can be encoded quite briefly as a recursive function.

Finally, c [j] is the number of times the method j occurs in the explanation,

c [j] = T (A \ {j}, C \ m [j])

where C is the set of all columns.

Svante · Answer 3 · 2010-07-30T16:31:02+0000

I do not know if there is an exponential number of possible explanations (which would mean that you cannot list them faster than the exponential).

However, you can come close to this in the style of dynamic programming to eliminate duplicate effort:

The first level is a list of one element: a set of all methods. Loop
for each level:
- for each set of this level:
  - if this set is an explanation ( or their tests together give all 1 s):
    - put it in the list of results
    - create all possible subsets of this set that have exactly one less method and put them in the next "level"
- remove all duplicates from the next level
until the level is empty

An effective way to find explanations of the test matrix method (mathematical problem)

Setup:

Definition:

Problem:

More articles: