How can I optimize a Neo4j MERGE query on a node with many relationships?

I have a graph with a node that has many outgoing relationships. Adding new outgoing relationships to that node gets slower as the number of existing relationships grows. The slowdown seems to come from the time spent verifying that the relationship does not already exist (I use MERGE to add the relationship).

The destination nodes have very few relationships. Is there a way to make Neo4j check for an existing relationship from the destination node instead of from the source node?

Here is a script to reproduce the problem. It creates a single node with id 0, followed by 1000 nodes connected to node 0 by a HAS relationship. As more nodes are added, the run time increases linearly.

    CREATE CONSTRAINT ON (n:Node) ASSERT n.id IS UNIQUE;

    UNWIND RANGE(1,1000) AS i
    MERGE (from:Node { id: 0 })
    MERGE (to:Node { id: i })
    MERGE (from)-[:HAS]->(to)

Added 1001 labels, created 1001 nodes, set 1001 properties, created 1000 relationships, statement executed in 3496 ms.

    UNWIND RANGE(1001,2000) AS i
    MERGE (from:Node { id: 0 })
    MERGE (to:Node { id: i })
    MERGE (from)-[:HAS]->(to)

Added 1000 labels, created 1000 nodes, set 1000 properties, created 1000 relationships, statement executed in 7030 ms.

    UNWIND RANGE(2001,3000) AS i
    MERGE (from:Node { id: 0 })
    MERGE (to:Node { id: i })
    MERGE (from)-[:HAS]->(to)

Added 1000 labels, created 1000 nodes, set 1000 properties, created 1000 relationships, statement executed in 10489 ms.

    UNWIND RANGE(3001,4000) AS i
    MERGE (from:Node { id: 0 })
    MERGE (to:Node { id: i })
    MERGE (from)-[:HAS]->(to)

Added 1000 labels, created 1000 nodes, set 1000 properties, created 1000 relationships, statement executed in 14390 ms.

If CREATE is used instead of MERGE, performance is much better. I can't use CREATE, though, because I want the relationship to be unique.

    UNWIND RANGE(4001,5000) AS i
    MERGE (from:Node { id: 0 })
    MERGE (to:Node { id: i })
    CREATE (from)-[:HAS]->(to)

Added 1000 labels, created 1000 nodes, set 1000 properties, created 1000 relationships, statement executed in 413 ms.

Note: tested with Neo4j v2.2.2.

2 answers

This happens because Cypher is not smart enough to take the degree of the nodes into account when performing a MERGE. The COST optimizer, which is used for reads, is already smarter, but updates still go through the old RULE optimizer.
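One way to look at where the time goes is to profile a single relationship MERGE between two existing nodes. The following is only a sketch, assuming the constraint and data from the question are already loaded; the id value 1 is an arbitrary choice for illustration, and the operators and db hit counts in the output will differ between Neo4j versions.

```cypher
// PROFILE executes the statement and reports the plan with db hits per operator.
// Note that it really runs the MERGE, so the relationship is created if missing.
PROFILE
MATCH (from:Node { id: 0 }), (to:Node { id: 1 })
MERGE (from)-[:HAS]->(to);
```

If the db hits of the relationship check grow with the number of HAS relationships already attached to node 0, the existence check is being expanded from the dense side.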

I played with it a bit, without success:

* changing the order of from and to
* using CREATE UNIQUE instead of MERGE
* trying to use path expressions that use get-degree in COST

Then I remembered that shortestPath does take the degree into account, and that it also goes from left to right.

So I tried combining this with CREATE, and it worked very well. Here is an example with 100,000 nodes:

    neo4j-sh (?)$ CREATE CONSTRAINT ON (n:Node) ASSERT n.id IS UNIQUE;
    +-------------------+
    | No data returned. |
    +-------------------+
    Constraints added: 1
    1054 ms
    neo4j-sh (?)$
    neo4j-sh (?)$ UNWIND RANGE(0,100000) AS i CREATE (to:Node { id: i});
    +-------------------+
    | No data returned. |
    +-------------------+
    Nodes created: 100001
    Properties set: 100001
    Labels added: 100001
    2375 ms
    neo4j-sh (?)$
    neo4j-sh (?)$
    neo4j-sh (?)$ MATCH (from:Node { id: 0 })
    > UNWIND RANGE(1,100000) AS i
    > MATCH (to:Node { id: i})
    > WHERE shortestPath((to)<-[:HAS]-(from)) IS NULL
    > CREATE (from)-[:HAS]->(to);
    +-------------------+
    | No data returned. |
    +-------------------+
    Relationships created: 100000
    2897 ms
    neo4j-sh (?)$
    neo4j-sh (?)$
    neo4j-sh (?)$ MATCH (from:Node { id: 0 })
    > UNWIND RANGE(1,100000) AS i
    > MATCH (to:Node { id: i})
    > WHERE shortestPath((to)<-[:HAS]-(from)) IS NULL
    > CREATE (from)-[:HAS]->(to);
    +--------------------------------------------+
    | No data returned, and nothing was changed. |
    +--------------------------------------------+
    2360 ms
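The same idea can probably be applied to the batch statements from the question, where the destination nodes are created in the same query. This is an untested adaptation, not something taken from the session above: the id range 5001 to 6000 simply continues the question's numbering, and the WITH clause is there because a WHERE cannot directly follow a MERGE.

```cypher
// Node MERGEs stay as in the question; they are backed by the unique constraint.
UNWIND RANGE(5001,6000) AS i
MERGE (from:Node { id: 0 })
MERGE (to:Node { id: i })
WITH from, to
// Existence check written with the low-degree node (to) on the left.
WHERE shortestPath((to)<-[:HAS]-(from)) IS NULL
CREATE (from)-[:HAS]->(to);

// Sanity check: the number of HAS relationships on node 0 should not
// change when the statement above is run a second time.
MATCH (:Node { id: 0 })-[r:HAS]->()
RETURN count(r);
```

One design caveat, stated as an assumption rather than a measured result: a check followed by CREATE does not give the uniqueness guarantee MERGE is meant to provide, so concurrent writers could presumably still create duplicate relationships.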

Many thanks! It looks like Neo4j still has this problem with merging relationships in version 3.5!
