If you run your query as written, without any indexes, it has to do two nested full collection scans, which you can see in the output of
db._explain(<your query here>);
which shows something like:
1   SingletonNode             1   * ROOT
2   EnumerateCollectionNode   3     - FOR fromItem IN fromCollection
3   EnumerateCollectionNode   9       - FOR toItem IN toCollection
4   CalculationNode           9         - LET
If you do
db.toCollection.ensureIndex({ type: "hash", fields: ["toAttributeValue"], unique: false });
then there will be a single full collection scan of fromCollection, and for each item found there is a hash lookup in toCollection, which is much faster. Everything happens in batches, so this should already improve the situation. db._explain() will show this:
1   SingletonNode             1   * ROOT
2   EnumerateCollectionNode   3     - FOR fromItem IN fromCollection
8   IndexNode                 3       - FOR toItem IN toCollection
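To make the difference between the two plans concrete, here is a minimal in-memory sketch in plain JavaScript. It is not ArangoDB code and does not reflect how the server implements its indexes: the sample documents and the function names `nestedScanJoin` and `hashLookupJoin` are made up for illustration, and a `Map` stands in for the hash index.

```javascript
// Hypothetical sample data standing in for the two collections.
const fromCollection = [
  { _id: "fromCollection/1", fromAttributeValue: "a" },
  { _id: "fromCollection/2", fromAttributeValue: "b" },
  { _id: "fromCollection/3", fromAttributeValue: "c" },
];
const toCollection = [
  { _id: "toCollection/1", toAttributeValue: "b" },
  { _id: "toCollection/2", toAttributeValue: "c" },
  { _id: "toCollection/3", toAttributeValue: "c" },
];

// No index: every fromItem is compared against every toItem.
function nestedScanJoin(fromItems, toItems) {
  const pairs = [];
  let comparisons = 0;
  for (const fromItem of fromItems) {
    for (const toItem of toItems) {
      comparisons++;
      if (fromItem.fromAttributeValue === toItem.toAttributeValue) {
        pairs.push([fromItem._id, toItem._id]);
      }
    }
  }
  return { pairs, comparisons };
}

// Non-unique hash index (a Map here): built once, then one lookup per fromItem.
function hashLookupJoin(fromItems, toItems) {
  const index = new Map();
  for (const toItem of toItems) {
    const bucket = index.get(toItem.toAttributeValue) ?? [];
    bucket.push(toItem);
    index.set(toItem.toAttributeValue, bucket);
  }
  const pairs = [];
  let lookups = 0;
  for (const fromItem of fromItems) {
    lookups++;
    for (const toItem of index.get(fromItem.fromAttributeValue) ?? []) {
      pairs.push([fromItem._id, toItem._id]);
    }
  }
  return { pairs, lookups };
}

const scan = nestedScanJoin(fromCollection, toCollection);
const hashed = hashLookupJoin(fromCollection, toCollection);
console.log(scan.comparisons, hashed.lookups); // 9 3
```

Both functions return the same pairs. The nested scan needs one comparison per combination of documents, while the indexed version needs one lookup per document in fromCollection.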
Working only on recently inserted items in fromCollection is relatively simple: just add an import timestamp to all vertices and use:
FOR fromItem IN fromCollection
  FILTER fromItem.timeStamp > @lastRun
  FOR toItem IN toCollection
    FILTER fromItem.fromAttributeValue == toItem.toAttributeValue
    INSERT { _from: fromItem._id, _to: toItem._id, otherAttributes: {} } INTO edgeCollection
and of course put a skiplist index on the timeStamp attribute in fromCollection.
This should work just fine to discover new vertices in fromCollection. It will “ignore” new vertices in toCollection that are linked to old vertices in fromCollection.
You can discover those by swapping the roles of fromCollection and toCollection in your query (do not forget the index on fromAttributeValue in fromCollection), remembering that you only need to insert an edge if the from vertex is old, as in:
FOR toItem IN toCollection
  FILTER toItem.timeStamp > @lastRun
  FOR fromItem IN fromCollection
    FILTER fromItem.fromAttributeValue == toItem.toAttributeValue
    FILTER fromItem.timeStamp <= @lastRun
    INSERT { _from: fromItem._id, _to: toItem._id, otherAttributes: {} } INTO edgeCollection
The two together should do what you want. Please find a fully worked example here.
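To see why the two queries together create each new edge exactly once, here is a small in-memory sketch in plain JavaScript. Again this is not ArangoDB code: the documents, the numeric `timeStamp` values, and the `lastRun` value are hypothetical. It assumes that the edge between the two old vertices was already created by an earlier run.

```javascript
// Hypothetical sample data; one old and one new vertex per collection.
const lastRun = 100;
const fromCollection = [
  { _id: "fromCollection/old", fromAttributeValue: "x", timeStamp: 50 },
  { _id: "fromCollection/new", fromAttributeValue: "x", timeStamp: 150 },
];
const toCollection = [
  { _id: "toCollection/old", toAttributeValue: "x", timeStamp: 60 },
  { _id: "toCollection/new", toAttributeValue: "x", timeStamp: 160 },
];

const matches = (f, t) => f.fromAttributeValue === t.toAttributeValue;
const edges = [];

// First query: new fromItems against all toItems (old and new).
for (const f of fromCollection.filter((f) => f.timeStamp > lastRun)) {
  for (const t of toCollection.filter((t) => matches(f, t))) {
    edges.push(`${f._id} -> ${t._id}`);
  }
}

// Second query: new toItems against old fromItems only, so that
// new-to-new pairs from the first query are not inserted twice.
for (const t of toCollection.filter((t) => t.timeStamp > lastRun)) {
  for (const f of fromCollection.filter(
    (f) => matches(f, t) && f.timeStamp <= lastRun
  )) {
    edges.push(`${f._id} -> ${t._id}`);
  }
}

console.log(edges);
```

With this data the first pass yields the edges new -> old and new -> new, and the second pass adds old -> new. Without the `fromItem.timeStamp <= @lastRun` filter, the second pass would insert new -> new a second time.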