Neo4j: Replace multiple nodes with the same property with one node

Question

Say my nodes in Neo4j have a name property. I want to ensure that there is at most one node for any given name by merging all nodes that share the same name. More precisely: if there are three nodes whose name is “dog”, I want them replaced by a single node named “dog”, which:

Collects all properties from the three source nodes.
Has all relationships that were attached to the three source nodes.

The background is as follows: in my graph there are often several nodes with the same name that should be treated as "equal" (although some carry richer property information than others). Putting a.name = b.name in the WHERE clause is very slow.

EDIT: I forgot to mention that my Neo4j is currently version 2.3.7 (I cannot upgrade it).

SECOND EDIT: There is a known list of labels for the nodes and of possible relationship types. The type of each node is known.

THIRD EDIT: I want to call this “node collapse” procedure from Java, so a solution that mixes Cypher with procedural code would also be helpful.

+8

java graph neo4j cypher

Jf meier Aug 10 '16 at 11:12


2 answers

I think you need something like a synonym for nodes.

1) Go through all the nodes and create a synonym node for each name:

 MATCH (N)
 WITH N
 MERGE (S:Synonym {name: N.name})
 MERGE (S)<-[:hasSynonym]-(N)
 RETURN count(S);

2) Remove synonyms with only one node:

 MATCH (S:Synonym)
 WITH S
 MATCH (S)<-[:hasSynonym]-(N)
 WITH S, count(N) AS count
 WITH S WHERE count = 1
 DETACH DELETE S;

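3) Transfer the properties and relationships for the remaining synonyms (using APOC, which requires Neo4j 3.0 or later):

 MATCH (S:Synonym)
 WITH S
 MATCH (S)<-[:hasSynonym]-(N)
 WITH S, collect(N) AS nodes
 WITH [S] + nodes AS nodesForMerge
 // a CALL inside a larger query needs YIELD
 CALL apoc.refactor.mergeNodes( nodesForMerge ) YIELD node
 RETURN count(node);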

4) Remove the Synonym label:

 MATCH (S:Synonym)<-[:hasSynonym]-(N)
 CALL apoc.create.removeLabels( [S], ['Synonym'] ) YIELD node
 RETURN count(node);
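
To make the grouping in steps 1 and 2 concrete, here is a minimal in-memory sketch in Java (Java 8 assumed). It is not Neo4j code: the class `DuplicateNameGroups` and its `find` method are hypothetical names introduced for illustration, and the node ids and names are assumed to have been read from the database beforehand. It groups ids by name and drops every name that occurs only once, which mirrors creating synonyms and then deleting those with a single node.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of any Neo4j API.
class DuplicateNameGroups {

    // Maps each name that occurs more than once to the ids of its nodes.
    static Map<String, List<Long>> find(Map<Long, String> nameById) {
        Map<String, List<Long>> groups = new LinkedHashMap<>();
        for (Map.Entry<Long, String> e : nameById.entrySet()) {
            if (e.getValue() == null) {
                continue; // assumption: nodes without a name are left alone
            }
            groups.computeIfAbsent(e.getValue(), k -> new ArrayList<>())
                  .add(e.getKey());
        }
        // Equivalent of step 2: discard "synonyms" with only one node.
        groups.values().removeIf(ids -> ids.size() < 2);
        return groups;
    }

    public static void main(String[] args) {
        Map<Long, String> nameById = new LinkedHashMap<>();
        nameById.put(0L, "A");
        nameById.put(1L, "B");
        nameById.put(2L, "B");
        nameById.put(3L, "B");
        nameById.put(4L, "C");
        System.out.println(find(nameById)); // prints {B=[1, 2, 3]}
    }
}
```

Only the groups returned here would need merging; names that occur once can be skipped entirely.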

+4

stdob-- Aug 10 '16 at 11:33


Ke · Accepted Answer · 2017-01-12T13:35:56+0000

I made a test file with the following schema:

 CREATE (n1:TestX {name:'A', val1:1})
 CREATE (n2:TestX {name:'B', val2:2})
 CREATE (n3:TestX {name:'B', val3:3})
 CREATE (n4:TestX {name:'B', val4:4})
 CREATE (n5:TestX {name:'C', val5:5});

 MATCH (n6:TestX {name:'A', val1:1})
 MATCH (m7:TestX {name:'B', val2:2})
 CREATE (n6)-[:TEST]->(m7);

 MATCH (n8:TestX {name:'C', val5:5})
 MATCH (m10:TestX {name:'B', val3:3})
 CREATE (n8)<-[:TEST]-(m10);

Which leads to the following result:

Here the B nodes are really the same node. And here is my solution:

 //copy all properties
 MATCH (n:TestX), (m:TestX)
 WHERE n.name = m.name AND ID(n)<ID(m)
 WITH n, m
 SET n += m;

 //copy all outgoing relations
 MATCH (n:TestX), (m:TestX)-[r:TEST]->(endnode)
 WHERE n.name = m.name AND ID(n)<ID(m)
 WITH n, collect(endnode) as endnodes
 FOREACH (x in endnodes | CREATE (n)-[:TEST]->(x));

 //copy all incoming relations
 MATCH (n:TestX), (m:TestX)<-[r:TEST]-(endnode)
 WHERE n.name = m.name AND ID(n)<ID(m)
 WITH n, collect(endnode) as endnodes
 FOREACH (x in endnodes | CREATE (n)<-[:TEST]-(x));

 //delete duplicates
 MATCH (n:TestX), (m:TestX)
 WHERE n.name = m.name AND ID(n)<ID(m)
 DETACH DELETE m;

The result obtained is as follows:

Note that you need to know the types of the various relationships in advance.

All properties are copied from nodes with "higher" identifiers to nodes with "lower" identifiers.
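
Since the question also asks for procedural code, the net effect of the four statements can be sketched in plain Java (Java 8 assumed). This is an in-memory model only: `NodeCollapse`, `props` and `collapse` are hypothetical names, nodes are represented as property maps keyed by id, and relationships as `{from, to}` id pairs of a single type. The node with the lowest id per name survives, properties of higher ids are copied onto it, and relationship endpoints are redirected to it. One assumption differs from the Cypher: on conflicting property keys this sketch lets the highest id win, whereas the queries above do not guarantee an order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory illustration only; not Neo4j's Java API.
class NodeCollapse {

    // Convenience builder for a node with a name and one extra property.
    static Map<String, Object> props(String name, String key, int value) {
        Map<String, Object> p = new LinkedHashMap<>();
        p.put("name", name);
        p.put(key, value);
        return p;
    }

    // Collapses nodes sharing a name into the one with the lowest id.
    // Returns a map from each removed id to the id of its survivor.
    static Map<Long, Long> collapse(TreeMap<Long, Map<String, Object>> nodes,
                                    List<long[]> rels) {
        Map<Object, Long> survivorByName = new HashMap<>();
        Map<Long, Long> redirect = new LinkedHashMap<>();
        // TreeMap iterates ids in ascending order, so the first node
        // seen for a name is the one with the lowest id.
        for (Map.Entry<Long, Map<String, Object>> e : nodes.entrySet()) {
            Object name = e.getValue().get("name");
            if (name == null) {
                continue; // assumption: unnamed nodes are never merged
            }
            Long survivor = survivorByName.putIfAbsent(name, e.getKey());
            if (survivor != null) {
                nodes.get(survivor).putAll(e.getValue()); // like SET n += m
                redirect.put(e.getKey(), survivor);
            }
        }
        // Like DETACH DELETE m, after the relationships have been moved.
        nodes.keySet().removeAll(redirect.keySet());
        for (long[] r : rels) {
            r[0] = redirect.getOrDefault(r[0], r[0]);
            r[1] = redirect.getOrDefault(r[1], r[1]);
        }
        return redirect;
    }

    public static void main(String[] args) {
        TreeMap<Long, Map<String, Object>> nodes = new TreeMap<>();
        nodes.put(0L, props("A", "val1", 1));
        nodes.put(1L, props("B", "val2", 2));
        nodes.put(2L, props("B", "val3", 3));
        nodes.put(3L, props("B", "val4", 4));
        nodes.put(4L, props("C", "val5", 5));
        List<long[]> rels = new ArrayList<>();
        rels.add(new long[] {0L, 1L}); // A -> B(val2)
        rels.add(new long[] {2L, 4L}); // B(val3) -> C
        collapse(nodes, rels);
        System.out.println(nodes.keySet()); // prints [0, 1, 4]
        System.out.println(nodes.get(1L));
    }
}
```

In a real application the same loop would be driven by query results and write back through Cypher or the database API; that wiring is deliberately left out here.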
