What can be done with MongoDB Aggregation / MongoDB Aggregation performance

I have set up MongoDB. I want to run an aggregation with a specific grouping. I found the documentation that describes how to do this. Everything is in order, but some restrictions are listed:

  • The output of the pipeline can only contain 16 megabytes. If your result set exceeds this limit, the aggregate command throws an error.

  • If any aggregation operation consumes more than 10% of the system RAM, the operation will result in an error.

  • The aggregation system currently stores $group operations in memory, which can cause problems when processing a large number of groups.

How many rows / documents can be processed with MongoDB aggregation? I am afraid to use it. Can anyone help me with this?
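To make it concrete, this is roughly the kind of grouping aggregation I have in mind; the collection and field names below are just placeholders, not my real schema:

    // Group documents by a key and compute a per-group sum and count.
    db.orders.aggregate([
        { $match: { status: "shipped" } },          // optional filter before grouping
        { $group: { _id: "$customerId",             // the grouping key
                    total: { $sum: "$amount" },     // per-group sum
                    count: { $sum: 1 } } }          // per-group document count
    ])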

+7
2 answers

I got a valid and helpful answer on Google Groups, and I would like to share it with you all.

The limits do not depend on the number of documents: they apply to the amount of memory used by the final result (or by an intermediate result).

So: if you aggregate over 200,000 documents but the result fits within the 16 MB limit, you are fine. If you aggregate over 100 documents and the result does not fit in 16 MB, you will get an error.

Similarly, if you sort() or group() an intermediate result and that operation needs more than 10% of the available RAM, you will get an error. It is not really a question of how many documents you have: it is a function of how large a particular pipeline stage is.
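A sketch of what that means in practice (collection and field names here are hypothetical):

    // Fine: millions of input documents but only a handful of groups,
    // so the final result stays far below 16 MB.
    db.events.aggregate([
        { $group: { _id: "$type", count: { $sum: 1 } } }
    ])

    // Risky: grouping on a nearly unique key yields close to one group per
    // input document, so the result (and the in-memory $group state) can hit
    // the 16 MB / 10% RAM limits even with far fewer documents.
    db.events.aggregate([
        { $group: { _id: "$sessionId", payloads: { $push: "$payload" } } }
    ])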

Can I increase the 16 MB limit with any setting?

Is the 16 MB limit only for the final result, OR is it for this specific aggregation as a whole (meaning intermediate results + any temporary storage + the final result)?

The 16 MB limit is not configurable. It is the maximum document size in MongoDB. Since the aggregation framework is currently implemented as a command, the result of the aggregation must be returned in a single document: hence the 16 MB limit.
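For reference, that 16 MB figure is the server's BSON document size limit, which the shell will report for you (it is not a setting you can raise):

    // The aggregate command returns a single BSON document, so it is bound
    // by the same maximum document size the server reports here.
    db.isMaster().maxBsonObjectSize    // 16777216 bytes == 16 MB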

see this post

+16

The amount of processing that the aggregation framework can do depends on your design.

The aggregation framework can currently only output the equivalent of a single document (for more on this you will want to watch https://jira.mongodb.org/browse/SERVER-3253 ), and it is returned in the form:

 { result: { /* the result */ }, ok: 1/0 } 

So you must make sure that what you return from your $group / $project is not so large that you cannot get the result you want. In most cases this is not a problem, and a simple $group even over millions of rows can produce a response of less than 16 megabytes.
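If you want to sanity-check that, a rough sketch: with the shell versions from that era, aggregate() returns the raw command document rather than a cursor, so you can measure the response directly (collection and field names are made up):

    // Keep only the fields you need, group, then check how big the reply is.
    var res = db.events.aggregate([
        { $project: { type: 1, amount: 1 } },
        { $group: { _id: "$type", total: { $sum: "$amount" } } }
    ]);
    Object.bsonsize(res)    // must stay under 16 * 1024 * 1024 bytes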

We have no idea about the size of your documents or the aggregation queries you want to run, so we cannot say for sure.

If any aggregation operation consumes more than 10 percent of system RAM, the operation will result in an error.

This is fairly self-explanatory. If the working set for the operation is so large that it takes up more than 10 percent of RAM ($group / computed fields / $sort on computed or grouped fields), then it will not work.

Unless you try to misuse the aggregation framework for your application logic, you should never run into this problem.

The aggregation system currently stores $group operations in memory, which can cause problems when processing a large number of groups.

Since $group is really hard not to do in memory (it "groups" a field), the operations on that group also happen in memory, which means a $sort here can start to eat into that 10% if you are not careful.
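As a rough illustration of where that memory goes (field names are made up), the usual mitigation is to shrink the working set with an early $match before grouping and sorting:

    db.events.aggregate([
        // Filter first, ideally on an indexed field, to keep the stage small.
        { $match: { day: { $gte: ISODate("2013-01-01") } } },
        // $group holds every distinct _id and its accumulators in memory.
        { $group: { _id: "$userId", total: { $sum: "$amount" } } },
        // This $sort runs over the in-memory grouped result, on a computed field.
        { $sort: { total: -1 } },
        { $limit: 10 }
    ])

Grouping on a high-cardinality key, or sorting a large grouped result, is what pushes you toward that 10% ceiling.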

+1
