Counting Super Nodes on Titan

Question

In my system, I have a requirement that the number of edges on a node be stored as an internal property on the vertex, along with a vertex-centric index on a specific outgoing edge. This, of course, requires me to count the number of edges per node after all the data has finished loading. I do it like this:

long edgeCount = graph.getGraph().traversal().V(vertexId).bothE().count().next();

However, when I scale my tests up to the point where some of my nodes are “super” nodes, I get the following exception on the line above:

Caused by: com.netflix.astyanax.connectionpool.exceptions.TransportException: TransportException: [host=127.0.0.1(127.0.0.1):9160, latency=4792(4792), attempts=1]org.apache.thrift.transport.TTransportException: Frame size (70936735) larger than max length (62914560)!
    at com.netflix.astyanax.thrift.ThriftConverter.ToConnectionPoolException(ThriftConverter.java:197) ~[astyanax-thrift-3.8.0.jar!/:3.8.0]
    at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:65) ~[astyanax-thrift-3.8.0.jar!/:3.8.0]
    at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:28) ~[astyanax-thrift-3.8.0.jar!/:3.8.0]
    at com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection.execute(ThriftSyncConnectionFactoryImpl.java:153) ~[astyanax-thrift-3.8.0.jar!/:3.8.0]
    at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:119) ~[astyanax-core-3.8.0.jar!/:3.8.0]
    at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:352) ~[astyanax-core-3.8.0.jar!/:3.8.0]
    at com.netflix.astyanax.thrift.ThriftColumnFamilyQueryImpl$4.execute(ThriftColumnFamilyQueryImpl.java:538) ~[astyanax-thrift-3.8.0.jar!/:3.8.0]
    at com.thinkaurelius.titan.diskstorage.cassandra.astyanax.AstyanaxKeyColumnValueStore.getNamesSlice(AstyanaxKeyColumnValueStore.java:112) ~[titan-cassandra-1.0.0.jar!/:na]

What is the best way to fix this? Should I just increase the frame size or is there a better way to count the number of edges per node?

+6

titan tinkerpop

Filipe teixeira Mar 24 '16 at 8:15

2 answers

Such a task, being OLAP in nature, should be performed with a distributed system rather than with an ordinary traversal.

TinkerPop 3 has the concept of a GraphComputer, which can be used for such a task.

It basically lets you run Gremlin queries that are evaluated across multiple machines.

For example, you can use SparkGraphComputer to run your queries on top of Apache Spark.
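To make the difference concrete, here is a minimal plain-Java sketch of what a bulk job of this kind computes. It does not use the Titan, TinkerPop or Spark APIs; the class name DegreeCount and the edge-list representation are hypothetical. Instead of asking each vertex for bothE().count(), it scans the edges once and credits both endpoints, which is the kind of work a GraphComputer can split across machines.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a single pass over an in-memory edge list.
// A real OLAP job would read the edges from the storage backend instead.
class DegreeCount {

    /** Counts incident edges (incoming plus outgoing) per vertex id. */
    static Map<String, Long> degrees(List<String[]> edges) {
        Map<String, Long> counts = new HashMap<>();
        for (String[] edge : edges) {
            counts.merge(edge[0], 1L, Long::sum); // source vertex of the edge
            counts.merge(edge[1], 1L, Long::sum); // target vertex of the edge
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String[]> edges = new ArrayList<>();
        edges.add(new String[] {"a", "b"});
        edges.add(new String[] {"a", "c"});
        edges.add(new String[] {"a", "d"});
        // Vertex "a" touches all three edges; every other vertex touches one.
        System.out.println(degrees(edges));
    }
}
```

The resulting per-vertex counts could then be written back as vertex properties in a separate step.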

+3

imriqwe Mar 28 '16 at 5:20

Jason pllurad · Accepted Answer · 2016-03-28T15:34:26+0000

Yes, you need to increase the frame size. When you have a supernode, there is a really big row that has to be read from the storage backend, and this is true even in the OLAP case. I agree that if you plan to calculate this for every vertex in the graph, it is best done as an OLAP operation.

This and a few other useful tips can be found in this Titan mailing list thread. Keep in mind that the link is quite old: the concepts are still valid, but some Titan configuration property names may be different.
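As a rough guide to how far the limit has to move, the two numbers in the exception from the question can be converted to megabytes. The sketch below only does that arithmetic; the class name FrameSizeEstimate is hypothetical. The property names in the comments (storage.cassandra.thrift.frame-size on the Titan side, thrift_framed_transport_size_in_mb in cassandra.yaml) are assumptions that should be checked against the versions in use.

```java
// Illustrative arithmetic only; it does not read or change any configuration.
class FrameSizeEstimate {
    static final long BYTES_PER_MB = 1024L * 1024L;

    /** Smallest whole number of megabytes that can hold a frame of the given size. */
    static long minFrameSizeMb(long frameBytes) {
        return (frameBytes + BYTES_PER_MB - 1) / BYTES_PER_MB; // integer ceiling
    }

    public static void main(String[] args) {
        long observedFrame = 70_936_735L; // "Frame size" from the exception
        long currentLimit = 62_914_560L;  // "max length" from the exception

        // Assumed settings to raise (verify the names for your versions):
        //   Titan:     storage.cassandra.thrift.frame-size (in MB)
        //   Cassandra: thrift_framed_transport_size_in_mb  (cassandra.yaml)
        System.out.println("current limit (MB): " + currentLimit / BYTES_PER_MB);
        System.out.println("minimum needed (MB): " + minFrameSizeMb(observedFrame));
    }
}
```

For the values above this prints 60 and 68, so the limit would have to exceed 67 MB just to read this one row, and the requirement keeps growing as the supernode gains edges.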
