Graph implementations: why not use hashing?

I am preparing for interviews and reviewing graph implementations. The two big ones I see are the adjacency list and the adjacency matrix. When we look at the runtimes of basic operations, why do I never see data structures that use hashing?

In Java, for example, an adjacency list is typically an ArrayList<LinkedList<Node>>, but why don't people use a HashMap<Node, HashSet<Node>>?

Let n = the number of nodes and m = the number of edges.

In both implementations, deleting a node v involves searching through all the collections and removing v. With the adjacency list this is O(n + m) (as bad as O(n^2) in a dense graph), but with the "adjacency set" it is O(n). Similarly, deleting an edge (u, v) involves removing u from v's collection and v from u's collection. With the adjacency list this is O(n); with the adjacency set it is O(1). Other operations, such as finding a node's successors or checking whether a path exists between two nodes, are the same in both implementations. The space complexity is also O(n + m) either way.
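As a hedged sketch of the "adjacency set" idea (all class and method names here are my own invention, not a standard API), the operations above look roughly like this:

```java
import java.util.*;

class AdjacencySetGraph {
    // Each vertex maps to the hash set of its neighbors (undirected graph).
    private final Map<Integer, Set<Integer>> adj = new HashMap<>();

    void addNode(int v) {
        adj.putIfAbsent(v, new HashSet<>());
    }

    void addEdge(int u, int v) {          // expected O(1)
        addNode(u); addNode(v);
        adj.get(u).add(v);
        adj.get(v).add(u);
    }

    void removeEdge(int u, int v) {       // expected O(1), vs O(deg) scan in a list
        adj.get(u).remove(v);
        adj.get(v).remove(u);
    }

    void removeNode(int v) {              // O(n): purge v from every neighbor set
        adj.remove(v);
        for (Set<Integer> nbrs : adj.values()) nbrs.remove(v);
    }

    Set<Integer> neighbors(int v) {
        return adj.getOrDefault(v, Set.of());
    }
}
```

Note that the O(1) bounds here are expected-time hash-table bounds, not worst case, which matters for the discussion in the answers below.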

The only drawback of the adjacency set that I can think of is that adding nodes/edges is amortized O(1), while in the adjacency list it is truly O(1).

Perhaps I am missing something, or forgot to account for something when working out these runtimes, so please let me know.

+7
algorithm graph runtime adjacency-list
4 answers

In the same vein as DavidEisenstat's answer: graph implementations vary widely. This is one of the things that doesn't come across very well in lectures. There are two conceptual designs:

1) Adjacency list
2) Adjacency matrix

But you can easily augment either design to get features such as fast insert/delete/lookup. The price is often just storing additional data! Consider implementing a relatively simple graph algorithm (such as finding an Euler tour) and see how your choice of graph representation has huge consequences for the running-time complexity.
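To illustrate, here is a hedged sketch (names invented) of Hierholzer's algorithm for a directed Euler circuit. The point is that the algorithm consumes every edge exactly once, and it only runs in O(m) overall because the adjacency-list-plus-cursor representation hands out the next unused edge in O(1); a representation with O(deg) edge removal would inflate the total cost:

```java
import java.util.*;

class EulerDemo {
    // Hierholzer's algorithm for a directed Euler circuit (assumes one exists
    // and the graph is connected). ptr[v] is a per-vertex cursor into adj,
    // so "removing" a used edge is just an increment.
    static List<Integer> eulerCircuit(List<List<Integer>> adj, int start) {
        int[] ptr = new int[adj.size()];
        Deque<Integer> stack = new ArrayDeque<>();
        List<Integer> circuit = new ArrayList<>();
        stack.push(start);
        while (!stack.isEmpty()) {
            int v = stack.peek();
            if (ptr[v] < adj.get(v).size()) {
                stack.push(adj.get(v).get(ptr[v]++)); // take the next unused edge
            } else {
                circuit.add(stack.pop());             // dead end: emit vertex
            }
        }
        Collections.reverse(circuit);
        return circuit;
    }
}
```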

To make my point a little clearer: an adjacency list does not actually require a LinkedList. For example, the wiki links to this page:

The implementation suggested by Guido van Rossum uses a hash table to associate each vertex in a graph with an array of adjacent vertices. In this representation, a vertex may be represented by any hashable object. There is no explicit representation of edges as objects.
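Transposed to Java (purely illustrative; the representation the quote describes is originally a Python dict of lists), that looks roughly like:

```java
import java.util.*;

class HashedAdjacency {
    // Hash table from each vertex to an array of adjacent vertices. Vertices can
    // be any hashable object (here Strings); edges get no explicit objects.
    static Map<String, List<String>> example() {
        Map<String, List<String>> graph = new HashMap<>();
        graph.put("A", List.of("B", "C"));
        graph.put("B", List.of("A"));
        graph.put("C", List.of("A"));
        return graph;
    }
}
```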

+5

why don't people use HashMap<Node, HashSet<Node>> ?

Unless multiple graphs exist over a single set of nodes, you can replace the HashMap with a member variable on Node.

HashSet versus LinkedList is the more interesting question. I would guess that for sparse graphs, LinkedList would be more efficient both in time (for operations of equivalent asymptotic complexity) and in space. I don't have much experience with either representation, because depending on the algorithm's requirements I usually prefer either (i) to store the adjacency lists as consecutive subarrays, or (ii) to have an explicit object (or a pair of objects) for each edge that stores information about the edge (e.g., its weight) and participates in two circular doubly linked lists (my own implementation, because the standard Java and C++ libraries do not support intrusive data structures), making removal of a node proportional to its degree and removal of an edge O(1).
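A hedged sketch of that second representation (all names are my own invention, not the answerer's actual code): each edge becomes a pair of "half" objects, one threaded through each endpoint's circular doubly linked ring, so unlinking is pointer surgery only:

```java
import java.util.*;

class IntrusiveGraph {
    // One Half per (edge, endpoint); twin links the two halves of an edge.
    static class Half {
        int target;       // vertex at the other end
        double weight;    // per-edge data, duplicated on both halves for simplicity
        Half twin, prev, next;
    }

    private final Map<Integer, Half> rings = new HashMap<>(); // sentinel per vertex

    private Half ring(int v) {
        return rings.computeIfAbsent(v, k -> {
            Half s = new Half();
            s.prev = s.next = s;      // empty circular ring
            return s;
        });
    }

    private Half link(int from, int to, double w) {
        Half h = new Half();
        h.target = to;
        h.weight = w;
        Half s = ring(from);          // splice h in right after the sentinel
        h.next = s.next; h.prev = s;
        s.next.prev = h; s.next = h;
        return h;
    }

    Half addEdge(int u, int v, double w) {
        Half hu = link(u, v, w), hv = link(v, u, w);
        hu.twin = hv; hv.twin = hu;
        return hu;
    }

    void removeEdge(Half h) {         // O(1): unsplice both halves
        for (Half x : new Half[]{h, h.twin}) {
            x.prev.next = x.next;
            x.next.prev = x.prev;
        }
    }

    void removeVertex(int v) {        // O(degree(v)): drop the ring, fix the twins
        Half s = rings.remove(v);
        if (s == null) return;
        for (Half h = s.next; h != s; h = h.next) {
            Half t = h.twin;
            t.prev.next = t.next;
            t.next.prev = t.prev;
        }
    }

    int degree(int v) {
        int d = 0;
        for (Half h = ring(v).next; h != ring(v); h = h.next) d++;
        return d;
    }
}
```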

The running times you quote for the hashed structures are not worst case, only high probability against an oblivious adversary, though they can be de-amortized at a further cost in constant factors.

+1

Many problems in graph theory deal with a fixed set of vertices and edges; there is no deletion at all.

Many, if not most, graph algorithms involve either simply iterating over all the edges in the adjacency list, or something more complex (which requires an additional data structure anyway).

Given the above, you get all the advantages of an array (e.g., O(1) random access, space efficiency) for representing the vertices without suffering its disadvantages (fixed size, O(n) search and mid-array insert/delete), and all the advantages of a linked list (e.g., O(1) insertion, space efficiency for an unknown number of elements) for representing the edges without its drawbacks (O(n) search and random access).
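As a minimal sketch of that combination (names invented): an array indexed by vertex id, each cell holding a growable list of neighbors.

```java
import java.util.*;

class AdjListDemo {
    // Classic array-of-lists: vertices 0..n-1 live in an array (O(1) index),
    // neighbors live in per-vertex lists (O(1) append).
    @SuppressWarnings("unchecked")
    static List<Integer>[] build(int n, int[][] edges) {
        List<Integer>[] adj = new List[n];
        for (int v = 0; v < n; v++) adj[v] = new ArrayList<>();
        for (int[] e : edges) {       // undirected: record both directions
            adj[e[0]].add(e[1]);
            adj[e[1]].add(e[0]);
        }
        return adj;
    }
}
```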

But ... how about hashing?

Sure, hashing has comparable asymptotic efficiency for the required operations, but the constant factors are worse, and it introduces unpredictability, since performance depends on a good hash function and well-distributed data.

That is not to say you should never use hashing: if your problem calls for it, go for it.

+1

We probably don't usually see this representation because checking whether an arbitrary edge exists in the graph is rarely required (I can't think of any everyday graph algorithm that relies on it), and where it is needed, you can use just a single hash set for the whole graph, storing pairs (v1, v2) to represent the edges. That seems more efficient.
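A hedged sketch of that idea (names invented): one hash set for the entire graph, keyed by the two endpoints encoded into a single value, with the endpoints sorted so that an undirected edge has one canonical key.

```java
import java.util.*;

class EdgeSet {
    // Single hash set answering "does edge (u, v) exist?" in expected O(1).
    private final Set<Long> edges = new HashSet<>();

    // Canonical key: smaller endpoint in the high 32 bits, larger in the low.
    private static long key(int u, int v) {
        int a = Math.min(u, v), b = Math.max(u, v);
        return ((long) a << 32) | (b & 0xffffffffL);
    }

    void add(int u, int v)          { edges.add(key(u, v)); }
    boolean contains(int u, int v)  { return edges.contains(key(u, v)); }
}
```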

(Most common graph algorithms say something like "for each neighbor of vertex v, do ...", and for those an adjacency list is perfect.)

0
