Creating edges programmatically in ArangoDB

What is the easiest way to quickly create edges in ArangoDB?

I would like to create relationships between documents based on a common attribute. I would like to be able to select an attribute, and for each document in collection A, create an edge for each document in collection B that has the same value in the equivalent attribute.

For example, if I imported email messages into one collection and people into another, I would like to generate edges between messages and people. An email document might look like this:

{ "_key": "subject": "body": "from": "to": } 

And a person document might look like this:

 { "_key": "name": "email": } 

Let's say that the values in the from and to fields of the email messages correspond to email addresses that can be found in the people collection.

I would like to be able to accept the collections, the attributes, and the edge parameters as input, and then, for each document in the people collection, create an edge to each document in the email collection whose from attribute contains the same address as the email attribute of the current person document.

So far, I believe Foxx may be the best tool for this, but I'm a bit overwhelmed by the documentation.

In the end, I would like to build complete CRUD for edges defined by common attributes between documents, including the equivalent of an "upsert": update the edge if it already exists, and create it if it does not.

I know that doing this with separate API calls with the standard HTTP API would be too slow, since I would need to query Arango for each document in the collection and return a very large number of results.

Is there already a Foxx service that does this? If not, where should I start building one?

1 answer

One AQL query is enough:

 FOR p IN people
   FOR e IN emails
     FILTER p.email == e.from
     INSERT {_from: p._id, _to: e._id} INTO sent

The email attribute of each document in the people vertex collection is compared with the from attribute of each document in the emails vertex collection. For each match, a new edge is inserted into the sent edge collection, linking the person to the email.
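The question also asks for an "upsert" equivalent. A minimal sketch of how the same query could use AQL's UPSERT operation instead of INSERT is shown below. It assumes that an edge is identified by its _from and _to pair; the updatedAt attribute is hypothetical and is only there to show where an update would go.

```aql
FOR p IN people
  FOR e IN emails
    FILTER p.email == e.from
    // Look for an existing edge between this person and this email
    UPSERT { _from: p._id, _to: e._id }
      // No such edge yet: create it
      INSERT { _from: p._id, _to: e._id }
      // Edge already exists: update it (updatedAt is a hypothetical attribute)
      UPDATE { updatedAt: DATE_NOW() }
      IN sent
```

Running this repeatedly should not create duplicate edges for the same person and email, which the plain INSERT version would do.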

If both vertex collections contain only a small number of documents, you can run this query without indexes (for example, 1000 people and 3000 emails took about 2 seconds in my test). For large datasets, be sure to create a hash index on the email attribute in people and a hash index on the from attribute in emails. This reduced the runtime to 30 ms in my test.
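To accept the collections and attributes as input, as the question describes, the query could be generalized with bind parameters. The following is only a sketch: the parameter names (@@sourceCollection, @@targetCollection, @@edgeCollection, @sourceAttribute, @targetAttribute) are made up for illustration, and their values would have to be supplied by whatever runs the query.

```aql
// Collection bind parameters use @@, value and attribute-name parameters use @
FOR a IN @@sourceCollection
  FOR b IN @@targetCollection
    // Compare the chosen attribute of each document pair
    FILTER a.@sourceAttribute == b.@targetAttribute
    INSERT { _from: a._id, _to: b._id } INTO @@edgeCollection
```

With the values people, emails, sent, "email" and "from" bound to these parameters, this should behave like the original query.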
