After working with neo4j for a while, I am now considering writing my own object manager to work with the fetched data in the application, and that makes me wonder about neo4j's output format.
When I run a query, the result always comes back as tabular data. Why is this? Of course, tables have an important place in data processing, but it seems strange that a graph database can only output in this format.
Now, when I want to create an object graph in my application, I have to deduplicate all the objects myself, which is bad for performance and does not make use of the graph nature of the data.
Consider MATCH (A)-->(B) RETURN A, B. When there is one A and three Bs, it returns:

    A  B
    1  1
    1  2
    1  3

The same A is transmitted three times over the connection from the database, although I need it only once, and I know this before the data is retrieved.
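
The client-side work this forces can be sketched in a few lines. This is a minimal Python illustration, assuming each row arrives as an (A, B) pair of node identifiers; the row shape and the identifiers `a1`, `b1`, ... are made up for the example and are not what any particular neo4j driver returns.

```python
from collections import defaultdict

# Hypothetical rows for MATCH (A)-->(B) RETURN A, B with one A and three Bs.
# The identifiers are illustrative only.
rows = [("a1", "b1"), ("a1", "b2"), ("a1", "b3")]


def group_rows(rows):
    """Collapse (A, B) rows into a mapping A -> list of Bs."""
    grouped = defaultdict(list)
    for a, b in rows:
        grouped[a].append(b)
    return dict(grouped)


grouped = group_rows(rows)
transmitted = len(rows)  # how many times an A value was sent
unique = len(grouped)    # how many distinct As there actually are
print(grouped, transmitted, unique)
```

The gap between `transmitted` and `unique` is the redundancy described above; it grows with the number of Bs per A.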
Something like this seems great: http://nigelsmall.com/geoff. load2neo is good, and loading from neo would be nice too, either in the geoff format or in any of the other formats listed at https://gephi.org/users/supported-graph-formats/
Each language can then implement its own functions for directly creating objects.
To clarify:
- Relations between nodes are lost in tabular data
- Redundant (non-optimal) format for graphs
- Edges (relationships) and vertices (nodes) are usually not in the same table. (makes queries more complicated?)
Another consideration (which may deserve its own post): what is a good way to model relationships in an object graph? As objects of their own? Or as data / methods inside the node objects?
@Kikohs
Q: What do you mean by "Each language can then implement its own functions for directly creating objects."?
A: Using the (partial) graph provided by the database (as the result of a query), PHP can provide a factory method (preferably written in C) for constructing an object graph (usually this is an expensive operation). But this only works if the object graph is well defined in a standard format (because this function should be simple and universal).
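
To illustrate the kind of factory meant here, the following is a rough sketch in Python rather than PHP or C. The input layout (a dict with `nodes` and `edges` lists) and the names `Node` and `build_object_graph` are assumptions made for this example; they do not correspond to geoff or to any existing neo4j output format.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    id: int
    labels: list
    properties: dict
    outgoing: list = field(default_factory=list)  # list of (rel_type, Node)


def build_object_graph(data):
    """Build linked Node objects from a {'nodes': [...], 'edges': [...]} dict."""
    nodes = {
        n["id"]: Node(n["id"], n.get("labels", []), n.get("properties", {}))
        for n in data["nodes"]
    }
    for e in data["edges"]:
        # Edges refer to their endpoints by id, so no node data is repeated.
        nodes[e["start"]].outgoing.append((e["type"], nodes[e["end"]]))
    return nodes


# Hypothetical input: every node appears once, every edge appears once.
data = {
    "nodes": [
        {"id": 1, "labels": ["Person"], "properties": {"name": "A"}},
        {"id": 2, "labels": ["Person"], "properties": {"name": "B"}},
    ],
    "edges": [{"start": 1, "end": 2, "type": "KNOWS"}],
}
graph = build_object_graph(data)
print(graph[1].outgoing[0][1].properties["name"])
```

Because the format lists each node exactly once, the factory needs a single pass over the nodes and a single pass over the edges, which is what keeps it simple and universal.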
Q: Do you want to export the full graph or just the query result?
A: The result of the query. However, a query such as MATCH (n) OPTIONAL MATCH (n)-[r]-() RETURN n, r should return the full graph.
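
With today's tabular output, turning the rows of that query into a graph still needs a client-side pass. A minimal sketch, assuming each row is an (n, r) pair of plain dicts with an `id` key (a made-up shape, not a real driver API): because the pattern (n)-[r]-() is undirected, every relationship is matched once from each endpoint, and OPTIONAL MATCH yields a null r for nodes without relationships.

```python
def collect_graph(rows):
    """Collect unique nodes and relationships from (n, r) rows."""
    nodes = {}
    rels = {}
    for n, r in rows:
        nodes[n["id"]] = n
        if r is not None:  # OPTIONAL MATCH gives None for isolated nodes
            rels[r["id"]] = r
    return nodes, rels


# Hypothetical rows: relationship 10 shows up once per endpoint,
# node 3 has no relationships at all.
rows = [
    ({"id": 1}, {"id": 10, "start": 1, "end": 2}),
    ({"id": 2}, {"id": 10, "start": 1, "end": 2}),
    ({"id": 3}, None),
]
nodes, rels = collect_graph(rows)
print(len(nodes), len(rels))
```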
Q: Do you want to flush a subgraph created from the query result to disk?
A: No, existing interfaces such as REST are preferred for receiving the query result.
Q: Do you want to create a subgraph that comes from a query in memory and then query it in another language?
A: No. I want the query result in a format other than tabular (examples were provided).
Q: Say you run a query that returns the name of a node. In that case, would you like to get the full node attached or just the name? The same goes for the edges.
A: Nodes have no names. They have properties, labels, and relationships. I would like to get enough information to obtain A) the identifier of the node, its labels, and its properties, and B) its relationships to other nodes that are in the same result.
Note that the first part of the question is not a specific "how" question, but rather "why is this not possible?" (or, if it is possible, I would be glad to be wrong about that). The second is a real "practical" question, namely "how to model relationships". What the two have in common is that both try to answer the question "how to efficiently get graph data into PHP."
@Michael Hunger
You have a point when you say that not all result data can be expressed as an object graph. It is reasonable to say that an alternative output format would only supplement the tabular format, not replace it.
As far as I understand from your answer, the natural (raw) output format of the database is the result format with duplicates in it ("data streams are deleted as they arrive"). In that case, I understand that it is currently up to the client program (dev stack) to perform the mapping. So my conclusions about neo4j implementing something like this are:
Pro - no need to do this in every implementation language (application)
Con - 1) application-specific mapping is not possible, 2) no performance benefit if the implementation language is already fast
"Even if you use geoff, graphml, or the gephi format, you must keep all the data in memory to deduplicate the results."
I don't quite understand this point. Are you saying that these formats cannot hold deduplicated results (in some cases)? So there is in fact no possible text format in which a graph can be described without duplication?
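
One possible reading of the quoted remark, sketched in Python: the format itself can be free of duplicates, but whoever writes it has to remember what has already been emitted. The record layout (`("node", id)` and `("edge", a, b)`) and the function name `dedupe_stream` are invented for this example, and the sketch is only my interpretation, not necessarily what was meant.

```python
def dedupe_stream(rows):
    """Yield each node once, followed by one edge record per input row."""
    seen = set()  # grows with the number of distinct nodes in the result
    for a, b in rows:
        for node_id in (a, b):
            if node_id not in seen:
                seen.add(node_id)
                yield ("node", node_id)
        yield ("edge", a, b)


# Hypothetical (A, B) rows with illustrative identifiers.
rows = [("a1", "b1"), ("a1", "b2"), ("a1", "b3")]
records = list(dedupe_stream(rows))
for record in records:
    print(record)
```

In this sketch only the identifiers are kept in memory, not the full node data, but the `seen` set still grows with the size of the result.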
"There are also questions about what you want to include in your release?"
I was under the assumption that the Cypher language is powerful enough to specify this in the query. The output format would then contain whatever the database can provide as a result.
"You can simply return the paths you receive that are unique paths through the graph itself."
Useful suggestion, I will play with this idea :)
"The neo4j shell dump command takes an approach whereby the results of cypher are output to an in-memory structure, enriching it."
Does the enrichment process fetch additional data from the database, or does it only use data already contained in the original result?