Java library for storing and processing large graphs (up to 600 thousand vertices)

I am working on a project that involves running algorithms on large graphs. The two largest have about 300k and 600k vertices (fairly sparse, I think). I am hoping to find a Java library that can handle graphs this large as well as smaller trees, since one of the algorithms I will use decomposes the graph into a tree. Ideally, the library would also include breadth-first search and Dijkstra's or other shortest-path algorithms.

Based on another question, I looked at several libraries (JGraphT, JUNG, jdsl, yworks), but I am having a hard time finding out how many vertices they can actually handle. Looking through their documentation, all I could find was a note in the JUNG FAQ saying that it can easily handle graphs of over 150 thousand vertices, which is still smaller than my graphs. I am hoping that someone here has used one or more of these libraries and can tell me whether it will handle the graph sizes I need, or whether some other library would be better.

For the record, I do not need visualization tools; this is strictly about representing graphs and trees in data structures and running algorithms on them.
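For a sense of scale, here is a minimal sketch of a plain-array (compressed sparse row) representation with a breadth-first search. It is not taken from any of the libraries mentioned; the class `CompactGraph` and its members are hypothetical names used only for illustration. It assumes vertices are numbered 0..n-1 and edges are undirected. With this layout, 600k vertices need one `int` array of about 600k entries (roughly 2.4 MB) plus two `int` entries per edge.

```java
import java.util.Arrays;

/** Illustrative compressed-sparse-row (CSR) graph; all names here are hypothetical. */
final class CompactGraph {
    final int[] offsets;  // neighbours of v are targets[offsets[v] .. offsets[v + 1] - 1]
    final int[] targets;

    /** Builds an undirected graph from edge pairs {u, v}. */
    CompactGraph(int vertexCount, int[][] edges) {
        offsets = new int[vertexCount + 1];
        for (int[] e : edges) {                    // count the degree of each endpoint
            offsets[e[0] + 1]++;
            offsets[e[1] + 1]++;
        }
        for (int v = 0; v < vertexCount; v++) {    // prefix sums turn counts into start offsets
            offsets[v + 1] += offsets[v];
        }
        targets = new int[offsets[vertexCount]];
        int[] next = Arrays.copyOf(offsets, vertexCount);  // next free slot per vertex
        for (int[] e : edges) {
            targets[next[e[0]]++] = e[1];
            targets[next[e[1]]++] = e[0];
        }
    }

    /** Returns hop distances from source; -1 marks unreachable vertices. */
    int[] bfs(int source) {
        int n = offsets.length - 1;
        int[] dist = new int[n];
        Arrays.fill(dist, -1);
        int[] queue = new int[n];                  // each vertex is enqueued at most once
        int head = 0, tail = 0;
        dist[source] = 0;
        queue[tail++] = source;
        while (head < tail) {
            int v = queue[head++];
            for (int i = offsets[v]; i < offsets[v + 1]; i++) {
                int w = targets[i];
                if (dist[w] == -1) {
                    dist[w] = dist[v] + 1;
                    queue[tail++] = w;
                }
            }
        }
        return dist;
    }
}
```

Construction and search both run in O(V + E) time and use O(V + E) space, which makes the complexity easy to state when that has to be reported.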

Background, if anyone cares: for a class, I have to implement the algorithm described in a research paper and reproduce the experiments from the paper as closely as possible. The paper and the datasets I will use can be found here and here. My professor says I can use any library I can find, as long as I can state the time and space complexity of its algorithms and data structures.

+7
3 answers

You should take a look at Neo4j, which is a graph database that could be a good fit for your problem.

+3

Check out JGraph. Note, however, that it focuses on visualization.

Another possibility is Apache Hama, a distributed computing framework for massive scientific computations such as matrix, graph, and network algorithms.

Annas may also be of interest to you: it is an open-source Java framework for developers and researchers working in graph theory, AI, path finding, distributed systems, and so on.

+3

The Cassovary project from Twitter (https://github.com/twitter/cassovary) can handle very large graphs in memory. It is written in Scala and therefore runs on the JVM.

Alternatively, the Java version of GraphChi can handle even larger graphs by using the disk: http://code.google.com/p/graphchi-java/

However, GraphChi will not be efficient for exact shortest-path algorithms, since they require fast random access.
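To illustrate the random-access point, here is a minimal in-memory sketch of Dijkstra's algorithm using only `java.util`. It is not GraphChi or Cassovary code; the class `DijkstraSketch` and its helper methods are hypothetical names, and it assumes directed edges with non-negative integer weights.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

/** Illustrative Dijkstra implementation; all names here are hypothetical. */
final class DijkstraSketch {
    /** Creates an empty adjacency list for n vertices numbered 0..n-1. */
    static List<List<int[]>> newGraph(int n) {
        List<List<int[]>> adj = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            adj.add(new ArrayList<>());
        }
        return adj;
    }

    /** Adds a directed edge from -> to with a non-negative weight. */
    static void addEdge(List<List<int[]>> adj, int from, int to, int weight) {
        adj.get(from).add(new int[] {to, weight});
    }

    /** Returns shortest distances from source; Long.MAX_VALUE marks unreachable vertices. */
    static long[] shortestPaths(List<List<int[]>> adj, int source) {
        long[] dist = new long[adj.size()];
        Arrays.fill(dist, Long.MAX_VALUE);
        dist[source] = 0;
        // Queue entries are {distance, vertex}, ordered by distance.
        PriorityQueue<long[]> pq = new PriorityQueue<>((a, b) -> Long.compare(a[0], b[0]));
        pq.add(new long[] {0, source});
        while (!pq.isEmpty()) {
            long[] top = pq.poll();
            int v = (int) top[1];
            if (top[0] > dist[v]) {
                continue;  // stale entry: a shorter path to v was already settled
            }
            // The next vertex is picked by distance, not by storage order,
            // so this adjacency lookup can land anywhere in the graph.
            for (int[] edge : adj.get(v)) {
                long candidate = top[0] + edge[1];
                if (candidate < dist[edge[0]]) {
                    dist[edge[0]] = candidate;
                    pq.add(new long[] {candidate, edge[0]});
                }
            }
        }
        return dist;
    }
}
```

Because the vertex taken from the queue is determined by its current distance, consecutive adjacency lookups are scattered across the graph. That is cheap when the whole graph sits in memory and costly when adjacency data has to be fetched from disk.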

+1
