How to create nodes in RNeo4j using vectors or data frames

The popular Neo4j graph database can be used in R thanks to the RNeo4j package/driver (https://github.com/nicolewhite/Rneo4j).

The author of the package, @NicoleWhite, provides some excellent examples of its use on GitHub.

Unfortunately for me, the examples provided by @NicoleWhite and the documentation are a bit oversimplified: they manually create each node and its associated labels and properties, such as:

    mugshots = createNode(graph, "Bar", name = "Mugshots", location = "Downtown")
    parlor = createNode(graph, "Bar", name = "The Parlor", location = "Hyde Park")
    nicole = createNode(graph, name = "Nicole", status = "Student")
    addLabel(nicole, "Person")

That is all well and good when you are dealing with a tiny sample data set, but this approach is not feasible for something like a large social graph with thousands of users, where each user is a node (such graphs may not touch every node in every query, but the nodes still need to be loaded into Neo4j).

I am trying to figure out how to do this using vectors or data frames. Is there a solution, perhaps involving an apply statement or a for loop?

This basic attempt:

    for (i in 1:length(df$user_id)){
      paste(df$user_id[i]) = createNode(graph, "user", name = df$name[i], email = df$email[i])
    }

leads to Error: 400 Bad Request.

1 answer

As a first option, you should look at the functionality that I just added for the transactional endpoint:

http://nicolewhite.imtqy.com/RNeo4j/docs/transactions.html

    library(RNeo4j)

    graph = startGraph("http://localhost:7474/db/data/")
    clear(graph)

    # Keep strings as character vectors rather than factors.
    data = data.frame(Origin = c("SFO", "AUS", "MCI"),
                      FlightNum = c(1, 2, 3),
                      Destination = c("PDX", "MCI", "LGA"),
                      stringsAsFactors = FALSE)

    query = "
    MERGE (origin:Airport {name:{origin_name}})
    MERGE (destination:Airport {name:{dest_name}})
    CREATE (origin)<-[:ORIGIN]-(:Flight {number:{flight_num}})-[:DESTINATION]->(destination)
    "

    t = newTransaction(graph)

    # Queue one parameterized query per row; nothing is committed yet.
    for (i in 1:nrow(data)) {
      origin_name = data[i, ]$Origin
      dest_name = data[i, ]$Destination
      flight_num = data[i, ]$FlightNum

      appendCypher(t,
                   query,
                   origin_name = origin_name,
                   dest_name = dest_name,
                   flight_num = flight_num)
    }

    commit(t)

    cypher(graph, "MATCH (o:Airport)<-[:ORIGIN]-(f:Flight)-[:DESTINATION]->(d:Airport)
                   RETURN o.name, f.number, d.name")

Here I build a Cypher query, then loop over the data frame and pass the values of each row as parameters to that query. Your current approach will be slow, because you are sending a separate HTTP request for each node created. With the transactional endpoint, you create many things under a single transaction. If your data frame is very large, I would split it up into roughly 1000 rows per transaction.
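To illustrate the "roughly 1000 rows per transaction" suggestion, here is a minimal sketch. The helper names `split_into_batches` and `load_in_batches` are hypothetical (they are not part of RNeo4j), and the column names `Origin`, `Destination` and `FlightNum` are assumed to match the flights data frame from the example above. Only `newTransaction`, `appendCypher` and `commit` come from the package.

```r
# Hypothetical helper (not part of RNeo4j): split a data frame into a list
# of data frames with at most `batch_size` rows each.
split_into_batches = function(df, batch_size = 1000) {
  if (nrow(df) == 0) return(list())
  batch_id = ceiling(seq_len(nrow(df)) / batch_size)
  unname(split(df, batch_id))
}

# Hypothetical helper: open, fill and commit one transaction per batch.
# Assumes `df` has the columns Origin, Destination and FlightNum, and that
# `query` uses the parameters origin_name, dest_name and flight_num.
load_in_batches = function(graph, df, query, batch_size = 1000) {
  for (batch in split_into_batches(df, batch_size)) {
    t = RNeo4j::newTransaction(graph)
    for (i in seq_len(nrow(batch))) {
      RNeo4j::appendCypher(t,
                           query,
                           origin_name = batch$Origin[i],
                           dest_name = batch$Destination[i],
                           flight_num = batch$FlightNum[i])
    }
    RNeo4j::commit(t)
  }
}
```

With a running Neo4j instance, this would be called as `load_in_batches(graph, data, query)`. The batch size is a tuning knob rather than a fixed rule, so it is worth trying a few values against your own data.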

As a second option, you could use LOAD CSV in the neo4j-shell.
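A rough sketch of the LOAD CSV route, staying in R for the export step: write the data frame to a CSV file, then build a Cypher statement that can be pasted into the neo4j-shell. The file location, the column names and the shape of the statement are assumptions for illustration. Depending on the Neo4j version, the file may need to live in a specific import directory, and LOAD CSV reads every field as a string unless it is converted explicitly.

```r
# Example data, mirroring the flights data frame used earlier.
data = data.frame(Origin = c("SFO", "AUS", "MCI"),
                  FlightNum = c(1, 2, 3),
                  Destination = c("PDX", "MCI", "LGA"),
                  stringsAsFactors = FALSE)

# Write the data frame to a CSV file (a temporary file here; in practice,
# choose a path the Neo4j server is allowed to read).
csv_path = tempfile(fileext = ".csv")
write.csv(data, csv_path, row.names = FALSE)

# Build a file URL from the absolute path.
abs_path = normalizePath(csv_path, winslash = "/")
file_url = paste0("file:///", sub("^/", "", abs_path))

# Cypher statement to run in the neo4j-shell.
# Note: row.FlightNum arrives as a string in this sketch.
load_query = paste0(
  "LOAD CSV WITH HEADERS FROM '", file_url, "' AS row\n",
  "MERGE (origin:Airport {name: row.Origin})\n",
  "MERGE (destination:Airport {name: row.Destination})\n",
  "CREATE (origin)<-[:ORIGIN]-(:Flight {number: row.FlightNum})",
  "-[:DESTINATION]->(destination);"
)

cat(load_query, "\n")
```

This avoids the per-row HTTP overhead entirely, at the cost of an intermediate file on disk.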
