Is Mongoose not scalable with document array editing and version control?

I am developing a web application with Node.js and MongoDB / Mongoose. Our most heavily used model, Record, has many subdocument arrays; some of them, for example, are Comments, Orders, and Subscribers.

In the client-side application, whenever the user clicks the delete button on a comment, it fires an AJAX request to the delete route for that particular comment. The problem I am facing is that when many of these AJAX calls arrive at once, Mongoose fails with a "Document not found" error for some (but not all) of them.

This only happens when many calls are made in quick succession. I believe it is caused by Mongoose's document versioning producing conflicts. Our current removal process is:

  • Fetch the document with Record.findById()
  • Remove the subdocument from the corresponding array (with, say, comment.remove() )
  • Call record.save()
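The steps above can be sketched as follows. This is a minimal illustration, not the actual application code: the Mongoose calls are shown in comments, and a plain object stands in for a real Mongoose document so the in-memory part of the flow can be run on its own.

```javascript
// Sketch of the current delete flow. In real Mongoose code this is roughly:
//   Record.findById(recordId, function (err, record) {
//     record.comments.id(commentId).remove();
//     record.save(callback); // <-- the conflict error surfaces here
//   });
// Below, a plain object stands in for the Mongoose document.
function removeCommentInMemory(record, commentId) {
  record.comments = record.comments.filter(function (c) {
    return c._id !== commentId;
  });
  return record;
}

var record = {
  comments: [{ _id: "a", text: "first" }, { _id: "b", text: "second" }]
};
removeCommentInMemory(record, "a");
// record.comments now holds only the comment with _id "b"
```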

I found a workaround in which I manually update the collection using Record.findByIdAndUpdate with a $pull operator. However, this means we cannot use any Mongoose middleware and completely lose versioning. And the more I think about it, the more situations I find where I would have to fall back on Mongoose wrapper functions like findByIdAndUpdate or findAndRemove . The only other solution I can think of is to wrap the delete attempt in a retry loop and hope it eventually succeeds, which seems like a very bad fix.
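For what it's worth, the retry idea can at least be made bounded rather than an open-ended loop. A minimal sketch, with hypothetical names throughout; a real version would re-fetch the document inside op() before each attempt, and here a fake operation stands in for the actual delete:

```javascript
// Bounded retry: re-run the operation when it fails with a version
// conflict, up to maxAttempts times; any other error (or exhaustion
// of attempts) is passed through to the callback.
function retryOnVersionError(op, maxAttempts, done) {
  op(function (err, result) {
    if (err && err.name === "VersionError" && maxAttempts > 1) {
      return retryOnVersionError(op, maxAttempts - 1, done);
    }
    done(err, result);
  });
}

// Fake operation that conflicts twice, then succeeds:
var attempts = 0;
function fakeDelete(cb) {
  attempts += 1;
  if (attempts < 3) return cb({ name: "VersionError" });
  cb(null, "deleted");
}

var outcome;
retryOnVersionError(fakeDelete, 5, function (err, result) {
  outcome = result; // "deleted" after three attempts
});
```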

Using the Mongoose wrapper methods does not really solve my problem either, since they bypass all middleware and hooks, which is one of the huge benefits of using Mongoose in the first place.

Does this mean Mongoose is practically useless for anything with rapid edits, and I should just use the native MongoDB driver? Am I misunderstanding Mongoose's limitations? How can I solve this problem?

+4
4 answers

Versioned array editing in Mongoose does not scale, for the simple reason that it is not an atomic operation. As a result, the more array-editing operations you issue, the more likely it is that two of them will collide, and you will pay the overhead of retrying or recovering from the conflict in your code.

For scalable manipulation of document arrays, you should use update with the atomic array operators: $pull[All] , $push[All] , $pop , $addToSet , and the positional $ . Of course, you can also use these operators with the atomic methods findAndModify , findByIdAndUpdate , and findOneAndUpdate if you also need the original or resulting document.
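For example, the delete described in the question collapses to a single atomic update. A sketch, where Record, recordId, and commentId are placeholders taken from the question's setup; the Mongoose call is shown in a comment and the testable part just builds the update document:

```javascript
// Build the atomic update that removes one comment from the array.
function pullCommentUpdate(commentId) {
  return { $pull: { comments: { _id: commentId } } };
}

// In a route handler this would be used roughly as:
//   Record.findByIdAndUpdate(recordId, pullCommentUpdate(commentId), callback);
// MongoDB applies the $pull server-side in one step, so concurrent
// deletes cannot conflict the way findById + save does.

var update = pullCommentUpdate("c1");
```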

As you already mentioned, the big downside of using update instead of findOne + save is that none of your Mongoose middleware or validation runs during an update . But I don't see that you have a choice if you want a scalable system. I would rather manually duplicate some middleware and validation logic for the update case than pay the scalability penalty of versioned Mongoose array editing. Hey, at least you still get the benefit of type casting based on your Mongoose schema on updates!

+6

Speaking from our own experience, I think the answer to your question is yes: Mongoose does not scale for fast array-based updates.

Background

We faced the same problem at HabitRPG . After recent user growth (which brought our database to 6 GB), we began seeing VersionError for many array-based updates ( background on VersionError ). ensureIndex({_id:1,__v:1}) helped a little, but the errors came back as even more users joined. It seems to me that Mongoose really does not scale for array-based updates. You can see the whole investigation in that thread.

Solution

If you can afford to move from an array to an object, do it. For example, comments: Schema.Types.Array => comments: Schema.Types.Mixed , then sort by post.comments.{ID}.date , or even maintain post.comments.{ID}.position manually if necessary.
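With that Mixed layout, ordering is recovered at read time. A sketch, assuming a comment shape with a date field (the field names are illustrative, not HabitRPG's actual schema):

```javascript
// Comments stored as a Mixed object keyed by id:
var comments = {
  c1: { text: "first", date: 100 },
  c2: { text: "second", date: 50 }
};

// Recover the ordering at read time by sorting on the stored date.
function commentsByDate(comments) {
  return Object.keys(comments)
    .map(function (id) { return comments[id]; })
    .sort(function (a, b) { return a.date - b.date; });
}

var sorted = commentsByDate(comments);
// sorted[0].text === "second" (the earlier date comes first)
```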

If you are stuck in arrays:

  • db.collection.ensureIndex({_id:1,__v:1})
  • Use the update methods described above. You will not benefit from middleware and validation, but there are worse things.
+2

I would strongly suggest splitting these arrays out into new collections. For example, a comments collection in which each document carries a record id indicating which record it belongs to. This is a much more scalable solution.
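A sketch of what that looks like; the Comment model and field names here are hypothetical, with the Mongoose calls shown in comments:

```javascript
// Each comment lives in its own collection and carries a reference back
// to its parent record:
//   { _id: ..., recordId: ..., text: ..., date: ... }
// Deleting is then a single-document operation with no array editing:
//   Comment.remove({ _id: commentId }, callback);
// and fetching a record's comments becomes an ordinary query:
function commentsForRecordQuery(recordId) {
  return { recordId: recordId };
}

var query = commentsForRecordQuery("record123");
```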

You are right: Mongoose array operations are not atomic and therefore do not scale well.

0

I thought of another idea that I'm not sure about, but it seems worth suggesting: soft delete.

Mongoose is very sensitive to changes in array structure, because they make subsequent positional edits ambiguous. But if you simply mark the comment subdocument with comment.deleted = true , you can perform more of these operations without running into conflicts. A cron job can then come through later and actually delete those comments.
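A sketch of the soft-delete idea; the deleted field name is an assumption, the write is shown as a comment, and the testable part is the read-side filter:

```javascript
// Marking instead of removing: a $set on a field does not reorder the
// array, so it sidesteps the positional ambiguity that versioning
// guards against. The write might look like:
//   Record.update(
//     { _id: recordId, "comments._id": commentId },
//     { $set: { "comments.$.deleted": true } },
//     callback);
// Readers then filter out soft-deleted comments:
function visibleComments(comments) {
  return comments.filter(function (c) { return !c.deleted; });
}

var visible = visibleComments([
  { _id: "a", deleted: true },
  { _id: "b" }
]);
// visible holds only the comment with _id "b"
```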

Oh, another idea: use some kind of in-memory cache, so that if a record was fetched or edited in the last few minutes, it is available without having to pull it from the database, meaning two requests coming in at about the same time will modify one and the same object.
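That cache might look like this minimal sketch; the names and the five-minute TTL are assumptions, and it deliberately ignores eviction and multi-process deployments:

```javascript
// Tiny TTL cache: a record fetched within the last ttlMs milliseconds is
// served from memory, so near-simultaneous requests share one object.
// The current time is passed in explicitly to keep the sketch testable.
function RecordCache(ttlMs) {
  this.ttlMs = ttlMs;
  this.entries = {};
}
RecordCache.prototype.get = function (id, now) {
  var e = this.entries[id];
  return e && now - e.at < this.ttlMs ? e.value : null;
};
RecordCache.prototype.put = function (id, value, now) {
  this.entries[id] = { value: value, at: now };
};

var cache = new RecordCache(5 * 60 * 1000); // five-minute window
cache.put("r1", { comments: [] }, 0);
// cache.get("r1", 1000)            -> the cached object
// cache.get("r1", 10 * 60 * 1000)  -> null (expired)
```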

Note: I'm not sure either of these is a good idea in general or that they will solve your problem, so go ahead and edit / comment / downvote if they are bad :)

0
