MongoDB - how can I find all documents that are not referenced by a document from another collection

So here is the problem:

I have a document in collection A. When it is first created, no other documents refer to it. At some point, a document in collection B will be created that references the ObjectId of the document in collection A.

What is the best way to find all the documents in collection A that are not referenced by any document in collection B?

I understand that MongoDB does not support joins, but I wonder if there is a solution to this problem besides getting all the referenced ObjectIds from collection B and finding documents in collection A that are not on this list, since this solution probably will not scale well.

Would it be possible to simply embed the document from collection A in the document from collection B and then delete it from collection A? Is this the best solution?

Thanks for your help and comments.

3 answers

Lots of options:

1) Add the _id of each referencing B document to an array in document A (a back-reference). You can then search for A documents that have no elements in this array. Problem: the array may grow too large for the document size limit if you have many cross-references.

2) Add a C collection that tracks links between A and B. It behaves like a join table.

3) Keep a simple 'referenced' flag in A. When you add a B, mark every A it refers to as referenced. When you delete a B, check each A it referred to against the remaining B documents, and clear the flag on any A that no longer has a reference. Problem: the flag may get out of sync.

4) Use map-reduce over B to create a collection containing the _ids of all A documents that any B refers to. Use this collection to flag all A documents that are referenced (after first clearing all the flags). This can also be used to repair (3) periodically.

5) Put both types of documents in the same collection and use map-reduce to emit the _id together with a flag saying either "in A" or "referenced by B". In the reduce step, find any groups that have "in A" but not "referenced by B".
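
Options 1 and 3 above can be sketched as plain update and filter documents. This is only a rough illustration, not a definitive implementation: the field names `referencedBy` and `referenced`, the helper function names, and the collection name `a` in the usage comments are all assumptions made for the example.

```javascript
// Hypothetical helpers; field and function names are illustrative assumptions.

// Option 1: keep an array of referencing B ids on each A document.
function addBackReference(bId) {
  // $addToSet avoids storing the same B id twice.
  return { $addToSet: { referencedBy: bId } };
}

function removeBackReference(bId) {
  // $pull removes the id again when the B document is deleted.
  return { $pull: { referencedBy: bId } };
}

// Matches A documents whose back-reference array is missing or empty.
function noBackReferencesFilter() {
  return {
    $or: [
      { referencedBy: { $exists: false } },
      { referencedBy: { $size: 0 } }
    ]
  };
}

// Option 3: a boolean flag on A, set when a B is added.
function markReferenced(aIds) {
  return {
    filter: { _id: { $in: aIds } },
    update: { $set: { referenced: true } }
  };
}

// Matches A documents that were never flagged or whose flag was cleared
// ($ne also matches documents that lack the field).
function notFlaggedFilter() {
  return { referenced: { $ne: true } };
}

// Usage in the mongo shell (not executed here):
//   db.a.updateOne({ _id: aId }, addBackReference(bId));
//   db.a.find(noBackReferencesFilter());
//   const m = markReferenced([aId1, aId2]);
//   db.a.updateMany(m.filter, m.update);
//   db.a.find(notFlaggedFilter());
```

Both variants move the cost to write time: every insert or delete in B has to touch A as well, which is where the synchronization problem mentioned in (3) comes from.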

...


Since MongoDB 3.2, the $lookup aggregation stage makes this possible:

    db.a.aggregate([
        {
            $lookup: {
                from: "b",              // secondary collection containing references to _id of 'a'
                localField: "_id",      // the _id field of the 'a' collection
                foreignField: "a_id",   // the referencing field of the 'b' collection
                as: "references"
            }
        },
        { $match: { references: [] } }
    ]);

The above query returns all documents in collection a that have no references in collection b.

Be careful with that. Performance can be a problem with large collections.
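
One common way to keep the $lookup affordable is to index the referencing field in the secondary collection, so that each lookup can use the index instead of scanning the collection. A sketch, assuming the same names as the query above (`b`, `a_id`, `references`); the helpers `unreferencedPipeline` and `referenceIndexSpec` are hypothetical and only build the documents to pass to the shell:

```javascript
// Hypothetical helper that builds the same pipeline for any pair of collections.
function unreferencedPipeline(from, foreignField) {
  return [
    {
      $lookup: {
        from: from,
        localField: "_id",
        foreignField: foreignField,
        as: "references"
      }
    },
    // Keep only documents for which the lookup found nothing.
    { $match: { references: [] } },
    // Drop the (empty) helper array from the output.
    { $project: { references: 0 } }
  ];
}

// Index specification for the referencing field, e.g. { a_id: 1 }.
function referenceIndexSpec(foreignField) {
  return { [foreignField]: 1 };
}

// Usage in the mongo shell (not executed here):
//   db.b.createIndex(referenceIndexSpec("a_id"));
//   db.a.aggregate(unreferencedPipeline("b", "a_id"));
```

The pipeline still visits every document in the primary collection, so whether this is fast enough depends on the data sizes involved.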



Since there are no joins, the only options are the ones you mention: either use embedded documents, or put up with two-step queries.
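
For reference, the two-step approach looks roughly like the sketch below. It runs over in-memory arrays so it can be tried without a server; the sample data, the referencing field name `a_id`, and the helper names are assumptions, and the shell equivalents are given in the comments.

```javascript
// In-memory stand-ins for the two collections (sample data is made up).
const a = [{ _id: "a1" }, { _id: "a2" }, { _id: "a3" }];
const b = [
  { _id: "b1", a_id: "a1" },
  { _id: "b2", a_id: "a1" },
  { _id: "b3", a_id: "a3" }
];

// Step 1: collect the distinct referenced ids.
// Shell equivalent: const ids = db.b.distinct("a_id");
function referencedIds(bDocs) {
  return [...new Set(bDocs.map((doc) => doc.a_id))];
}

// Step 2: build the filter for A documents that are not in that list.
// Shell equivalent: db.a.find({ _id: { $nin: ids } });
function notInFilter(ids) {
  return { _id: { $nin: ids } };
}

// Tiny evaluator for this one filter shape, so the sketch runs standalone.
function applyNotIn(aDocs, filter) {
  const excluded = new Set(filter._id.$nin);
  return aDocs.filter((doc) => !excluded.has(doc._id));
}

const ids = referencedIds(b);
const unreferenced = applyNotIn(a, notInFilter(ids));
console.log(unreferenced); // only a2 is left: nothing in b refers to it
```

The list of ids has to travel to the client and back, which is the scaling concern raised in the question.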

It depends on your implementation, but embedding the B document in the corresponding A document sounds like the best option. That way you can find every A without a B using a simple query (the $exists operator):

 A.find( { B: { $exists: false } }) 
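
For the query above to work, the embedded field has to be absent, not merely null, on documents that have no B. A small sketch of the matching updates, assuming the field is called `B` as in the query; the helper names and the collection handle `db.A` in the comments are hypothetical:

```javascript
// Hypothetical helpers; only the field name "B" comes from the query above.

// Embed a B document in its A document.
function embedB(bDoc) {
  return { $set: { B: bDoc } };
}

// Detach it again by removing the field entirely.
// Setting B to null would not work: { $exists: false } does not match null fields.
function detachB() {
  return { $unset: { B: "" } };
}

// A documents without an embedded B.
function withoutBFilter() {
  return { B: { $exists: false } };
}

// Usage in the mongo shell (not executed here):
//   db.A.updateOne({ _id: aId }, embedB({ name: "example" }));
//   db.A.updateOne({ _id: aId }, detachB());
//   db.A.find(withoutBFilter());
```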

Source: https://habr.com/ru/post/1413872/

