Problem statement: . Let it Sbe a set of integers, for example. S = {1,2,...,12}. M- an integer matrix, the rows of which can be considered as subsets S, the length of which divides the number of elements S, in particular each row of Mcontains only individual elements / integers. I am trying to create a matlab code that can identify all groups of disjoint rows Mwhose set-theoretic union yields S(i.e. a Section Sin fixed-size subsets) and returns each such group as a matrix.
Example: A = [1 2 5 6; 3 4 11 12; 9 10 7 8] - this is a section S, while it is B = [1 2 3 4; 5 6 7 8; 9 10 11 12; 1 5 9 12]not a section S(since the last line is not disjoint with the previous 3).
Order of magnitude: Usually it Mwill have ~ 500,000 lines, and Swill contain up to 100 elements.
The approach so far is: let m = size(M,1), n = size(M,2)(the size of the subsets of the partition), s = |S|(the size of the set to be broken) and k = s/n(the number of non-overlapping lines needed to form the partition - you can assume s = 0 mod n, therefore, the problem is correct). Note that in order to establish such a section, it is enough to check the disjointness of the lines and that they have exactly k.
In j = 1:(m-k+1)I observe ind = (sum(ismember(M((j+1):m,:),M(j,:)),2)==0), which gives me a column of row indexes after M(j,:), which also do not intersect with it. Then I create matrices 2 x nby combining M(j,:)with each of these rows. After that, I would like to repeat the procedure ismember()with all these new matrices 2 x nand continue repeating until I get the matrix k x n. Which is probably all nice and dandy, but also where I stumble, because for one the number of forouts depends on k.
Questions:
(Q1) How can I populate the code for my approach?
(Q2) Are there more efficient approaches to this problem (i.e., faster, vectorized / less forloops, with GPU support), and if so, what are they?