Add edge for direct acyclic graph with other restrictions

I have a DAG. I have this operation to add an edge between two nodes.

If A is available from B, then B is the parent. If A can be reached from B without going through another node, then B is the direct parent.

Requirements for this schedule:

  • No cycles.
  • For any node, there is a list of direct parents P [1], P [2], P [3] ... P [i] is not a parent of P [j] for any i and k.

When adding an edge, requirement 1 is not satisfied, the edge is not built. When adding an edge, requirement 2 is not satisfied, the edge is built, but the direct parents will be modified so that requirement 2 is satisfied.

For example, there are 3 nodes

  • A, direct parents: none
  • B, direct parents: A
  • C, direct parents: A

now if i add an edge between B and C we have

  • C, direct parents: A, B

but A is the parent of B, does not meet requirement 2, therefore A is removed from the direct parent of C, and we have

  • C, direct parents: B

Currently, I have done the following: Add an edge from A to B (this A will become the parent of B)

  • Check if B is the parent of the BFS. If so, don't add an edge (make sure there are no loops)
  • Check if AB is already B the parent of the BFS. If so, do not add an edge.
  • Find the intersection of the parent with the direct parent of B. This is done by looking for each parent of A, although BFS. Remove the intersection from direct parent B and add A as direct parent of B. (2 and 3 make sure it meets requirement 2)

It is slow. It breaks down to the 5k node level (I'm looking for this to process any graph with nodes less than 100k), the speed becomes unacceptable, 0.02 seconds to add a node edge.

I have a feeling that steps 1 and 2 can be done in 1 step using some other algorithm.

I was thinking about using topological ordering, but it should cross the entire graph, which is the worst case for my steps 1 and 2. When adding a new node, the order will be out of order. therefore, I have to run the topological ordering for the insert each time, so it does not bring any benefit. For step 3, you need to find the entire set of parents. The process is rather slow, since on average it is across a decent part of the schedule.

How can I make it more efficient?

+4
source share
2 answers

Your question boils down to "Is it possible to insert edges in a DAG faster than O (v + e)?" according to requirement (1). Requirement (2) is a more local constraint that does not require verification of the entire schedule.

I think the answer is no: you cannot do better than O(v+e) in the worst case (where v is the number of nodes / vertices and e is the number of edges).

Sure, there are tricks to improving your expected performance, depending on the properties of your DAG and how it changes over time. This is apparently an active research topic. For example, I believe that for some graphs this may be useful for cluster nodes. Inserting edges within a cluster requires only checks within the sub-DAG cluster. But then you need the right clustering strategy with support for cheap clustering updates when adding nodes, etc.

+4
source

I don’t know about your implantation, but I recommend adding indexing. Each row index must store metaparent-metachild pairs

metaparent - parent element in the link: parent, parent-of-parent, ...

metachid - is a child of the link: child, child-of-child, ...

therefore, for the graph A-> B-> C, the following indices exist: AB, BC, AC Adding an explicit edge between B-> C causes a statement, because such a record already exists. Thus, the complexity of the algorithm reduces from n ^ 2 to ln (n)

0
source

All Articles