Topological sorting with grouping

Well, therefore, in topological sorting, depending on the input data, there are usually several correct solutions for which the order can be “processed” so that all dependencies come to them that are “dependent” on them. However, I am looking for a slightly different answer:

Suppose the following data: a -> b and c -> d ( a must be before b , and c must be before d ).
Only with these two limitations we have several possible solutions: ( abcd , acdb , cabd , etc.). However, I am looking to create a method for grouping these nodes so that after processing the group, all records in the next group depend on dependencies. For the above alleged data, I would look for a grouping such as (a, c) (b, d) . Within each group, it does not matter in which order the nodes are processed ( a to c or b to d , etc. And vice versa) only until group 1 (a, c) completed before any of groups 2 (b, d) .

The only additional catch will be that each node should be in the earliest possible group. Consider the following:
a -> b -> c
d -> e -> f
x -> y

The grouping scheme (a, d) (b, e, x) (c, f, y) would be technically correct, since x to y , a more optimal solution would be (a, d, x) (b, e, y) (c, f) , since the presence of x in group 2 means that x depended on some nodes in group 1.

Any ideas on how to do this?


EDITOR: I think I managed to put together the solution code. Thanks to everyone who helped!

 // Topological sort // Accepts: 2d graph where a [0 = no edge; non-0 = edge] // Returns: 1d array where each index is that node group_id vector<int> top_sort(vector< vector<int> > graph) { int size = graph.size(); vector<int> group_ids = vector<int>(size, 0); vector<int> node_queue; // Find the root nodes, add them to the queue. for (int i = 0; i < size; i++) { bool is_root = true; for (int j = 0; j < size; j++) { if (graph[j][i] != 0) { is_root = false; break; } } if (is_root) { node_queue.push_back(i); } } // Detect error case and handle if needed. if (node_queue.size() == 0) { cerr << "ERROR: No root nodes found in graph." << endl; return vector<int>(size, -1); } // Depth first search, updating each node with it new depth. while (node_queue.size() > 0) { int cur_node = node_queue.back(); node_queue.pop_back(); // For each node connected to the current node... for (int i = 0; i < size; i++) { if (graph[cur_node][i] == 0) { continue; } // See if dependent node needs to be updated with a later group_id if (group_ids[cur_node] + 1 > group_ids[i]) { group_ids[i] = group_ids[cur_node] + 1; node_queue.push_back(i); } } } return group_ids; } 
+7
java c ++ php graph graph-theory
source share
2 answers

Designate all root nodes with a level value of 0. Designate all children with a level value of parent + 1. If, a node is being reviewed, that is, a level value has already been assigned, check if the previously assigned value was lower than the new one. If so, update it with a higher value and pass them to descendants.

now you have as many groups as unique level 0 ... K labels

+5
source share

I recently implemented this algorithm. I started with the approach you showed, but it does not scale to graphs with 20 + million nodes. The solution I ended up with is based on the approach described here .

You can think of it as calculating the height of each node, and then the result is a group of each node at a given height.

Consider the graph:

A → X

B → X

X → Y

X → Z

Thus, the desired result is (A, B), (X), (Y, Z)

The basic approach is to find everything that uses nothing (A, B in this example). All of them are at a height of 0.

Now remove A and B from the graph, find something that now uses nothing (now X in this example). So X is at a height of 1.

Remove X from the graph, find everything that now uses nothing (now Y, Z in this example). therefore, Y, Z are at a height of 2.

You can do the optimization by realizing the fact that you do not need to store bidirectional edges for everything or actually remove anything from your graph, you only need to know the number of things pointing to the node and the nodes you know are at the next height.

So, for this example at the beginning:

  • 0 things use 1
  • 0 things use 2
  • 2 things use X (1 and 2)
  • 1 thing uses Y, Z (X)

When you visit a node, reduce the number of each of the nodes it points to, if that number goes to zero, you know that the node is at the next height.

-one
source share

All Articles