Creating nodes and links simultaneously in neo4j

Question

I am trying to create a database in Neo4j with a structure that contains seven different types of nodes, about 4,000-5,000 nodes in total, and about 40,000 relationships between them. In my current Cypher code, I first create the nodes like this:

Create (node1:type {name:'example1', type:'example2'})

There are about 4,000 statements like this one, each creating a unique node.

Then I declare the relationships like this:

 Create (node1)-[:r]-(node51), (node2)-[:r]-(node5), (node3)-[:r]-(node2);

There are about 40,000 of these unique relationships.

With smaller graphs this was not a problem at all. With this one, however, the query never finishes executing; it just keeps loading.

Any suggestions on how I can get this type of query to work? Or what should I do instead?

Edit: What I'm trying to build is a large graph of a product, with its releases, release versions, features, etc., built the same way as the movie graph example.

The product has about 6 releases, and each release has about 20 release versions. In total there are 371 features, and for those 371 features there are also 438 feature versions. Each release version (120 in total) then has about 200-300 feature versions. These feature versions are mapped to their feature, which can have dependencies on anything in the db. I also included HW dependencies, such as the possible hw these features and releases can run on. So I am basically using Cypher code such as:

 Create (Product1:Product {name:'ABC', type:'Product'})
 Create (Release1:Release {name:'12A', type:'Release'})
 Create (Release2:Release {name:'13A', type:'Release'})
 Create (ReleaseVersion1:ReleaseVersion {name:'12.0.1', type:'ReleaseVersion'})
 Create (ReleaseVersion2:ReleaseVersion {name:'12.0.2', type:'ReleaseVersion'})

and below those I structure the relationships using

 Create (Product1)<-[:Is_Version_Of]-(Release1), (Product1)<-[:Is_Version_Of]-(Release2), (Release2)<-[:Is_Version_Of]-(ReleaseVersion21),

and so on, all the way down to the features. Then I also add dependencies between them, for example:

 (Feature1)-[:Requires]->(Feature239), (Feature239)-[:Requires]->(Feature51);

Since I had to collect all this information from many different Excel sheets, etc., I wrote the code this way, thinking that I could just combine it into one massive Cypher query and run it in the /browser on localhost. It worked really well as long as I ran no more than 4,000-5,000 statements at a time; in that case it created the whole database in about 5-10 seconds. But now that I am trying to run about 45,000 statements at once, it has been working for almost 24 hours and it still loads and says "query execution ...". Is there any way to reduce the time needed to create the database? Can I use smarter indexes or other things to improve performance? Because of the way my Cypher is written now, I can't split it into parts, since everything in the database is connected to the product in some way. Do I need to rewrite the code, or is there an easier way?

neo4j cypher

ErikOstergren Apr 28 '14 at 9:02

3 answers

FrobberOfBits · Answer 1 · 2014-04-28T15:24:34+0000

You can create multiple nodes and the relationships between them in a single CREATE statement, for example:

 create (a { name: "foo" })-[:HELLO]->(b {name : "bar"}), (c {name: "Baz"})-[:GOODBYE]->(d {name:"Quux"});

So that is one approach, rather than creating each node individually with one statement and then each relationship with another.

You can also create multiple relationships between existing nodes by matching them first and then creating:

 match (a {name: "foo"}), (d {name:"Quux"}) create (a)-[:BLAH]->(d);

Of course, you can have multiple MATCH clauses and multiple CREATE clauses.

You could try matching a given type of node and then creating all the necessary relationships from that type of node. You have enough relationships that this will take many queries. Make sure you have indexed the property that you use to match the nodes. As your database grows large, that will be important for fast lookup of the nodes you are trying to create new relationships from.

You haven't specified which query you are running that never "stops loading". Update your question with specifics and let us know what you have tried, and it may be possible to help further.
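As a rough illustration of the match-then-create approach, the statements can be generated rather than written by hand. The sketch below is only one possible way to do it: the helper names, the sample data, and the choice of `name` as the lookup property are assumptions, and the `CREATE INDEX ON` form is the Neo4j 2.x syntax (later versions spell it differently).

```python
def cypher_string(value):
    """Render a Python string as a single-quoted Cypher string literal."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def index_statement(label, prop="name"):
    """Index the lookup property (Neo4j 2.x syntax; later versions differ)."""
    return "CREATE INDEX ON :{}({});".format(label, prop)


def relationship_statement(src, rel_type, dst):
    """Match two existing nodes by label and name, then link them."""
    (src_label, src_name), (dst_label, dst_name) = src, dst
    return (
        "MATCH (a:{} {{name: {}}}), (b:{} {{name: {}}}) "
        "CREATE (a)-[:{}]->(b);"
    ).format(src_label, cypher_string(src_name),
             dst_label, cypher_string(dst_name), rel_type)


# Hypothetical sample data; the real names would come from the spreadsheets.
edges = [
    (("Release", "12A"), "Is_Version_Of", ("Product", "ABC")),
    (("Release", "13A"), "Is_Version_Of", ("Product", "ABC")),
]

# Index statements first, so the later MATCH lookups can use them.
statements = [index_statement("Product"), index_statement("Release")]
statements += [relationship_statement(s, r, d) for s, r, d in edges]
print("\n".join(statements))
```

Because each generated statement looks its nodes up by label and property instead of relying on variables from an earlier CREATE, the statements can be run separately or in small groups.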

Peter Neubauer · Answer 2 · 2014-04-29T08:53:20+0000

Another interesting approach could be to create your statements directly in Excel, see http://blog.bruggen.com/2013/05/reloading-my-beergraph-using-in-graph.html?view=sidebar for an example. You can run many CREATE statements in a single transaction, so this should not be overly complex.
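The same idea can be sketched outside Excel as well. The example below assumes the spreadsheet has been exported to CSV with the hypothetical columns `label`, `name` and `type`; it builds one CREATE statement per row and groups the statements into batches, each intended to be submitted as one transaction. How the batches are sent to Neo4j is deliberately left out.

```python
import csv
import io

# Hypothetical CSV export of one spreadsheet; column names are assumptions.
SHEET = """label,name,type
Product,ABC,Product
Release,12A,Release
Release,13A,Release
ReleaseVersion,12.0.1,ReleaseVersion
ReleaseVersion,12.0.2,ReleaseVersion
"""


def quote(value):
    """Render a Python string as a single-quoted Cypher string literal."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def node_statement(row):
    """Build one CREATE statement for a spreadsheet row."""
    return "CREATE (:{} {{name: {}, type: {}}});".format(
        row["label"], quote(row["name"]), quote(row["type"]))


def batches(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


rows = list(csv.DictReader(io.StringIO(SHEET)))
node_statements = [node_statement(row) for row in rows]

# Each batch is meant to be submitted as one transaction.
for number, batch in enumerate(batches(node_statements, 2), start=1):
    print("// batch {}".format(number))
    print("\n".join(batch))
```

A batch size of 2 is used only to keep the output short; a real import would use a much larger value.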

Ben Butler-Cole · Answer 3 · 2014-04-29T09:32:37+0000

If you are able to use the pre-release builds of Neo4j 2.1, try the new LOAD CSV and PERIODIC COMMIT features. They are intended for exactly this kind of use.

LOAD CSV allows you to describe the structure of your data with one or more Cypher patterns, while supplying the values in CSV to avoid duplication.

PERIODIC COMMIT can help make large imports more reliable, as well as improve performance by reducing the amount of memory needed.
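A minimal sketch of how the two features combine, under some assumptions: the feature dependencies are written to a CSV file with the hypothetical columns `source` and `target`, the `Feature` nodes already exist with a `name` property, and the statement uses the `USING PERIODIC COMMIT` form introduced with Neo4j 2.1 (much later versions replaced it). The script only prepares the file and prints the statement; running it against the database is a separate step.

```python
import csv
import os
import tempfile
from pathlib import Path


def write_edges_csv(path, edges):
    """Write (source, target) name pairs to a CSV file with a header row."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["source", "target"])
        writer.writerows(edges)


def load_csv_statement(csv_path, batch_size=1000):
    """Build a LOAD CSV statement that commits every `batch_size` rows."""
    url = Path(csv_path).resolve().as_uri()
    return "\n".join([
        "USING PERIODIC COMMIT {}".format(batch_size),
        "LOAD CSV WITH HEADERS FROM '{}' AS row".format(url),
        "MATCH (a:Feature {name: row.source}), (b:Feature {name: row.target})",
        "CREATE (a)-[:Requires]->(b);",
    ])


# Hypothetical dependency pairs standing in for the real spreadsheet data.
edges = [("Feature1", "Feature239"), ("Feature239", "Feature51")]
csv_path = os.path.join(tempfile.mkdtemp(), "requires.csv")
write_edges_csv(csv_path, edges)
print(load_csv_statement(csv_path))
```

The pattern is written once and the 40,000 or so relationships live in the CSV file, so the query text stays small regardless of how much data is imported.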
