TL;DR: mongoengine spends ages converting all the returned data to dicts
To test this, I created a document collection with a DictField holding a large nested dict. The document is roughly in your range of 5-10 MB.
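As a rough sanity check on that size, the nested dict can be built and measured with the stdlib alone (this snippet is my own illustration, not part of the benchmark; the JSON length is only a crude proxy that underestimates the BSON size):

```python
import itertools
import json
import random
from collections import defaultdict

# Autovivifying nested dict, same trick as in the full listing below.
tree = lambda: defaultdict(tree)
data = tree()
for d1, d2, d3 in itertools.product(['foo', 'bar'],
                                    ['spam', 'eggs', 'ham'],
                                    ["subf{}".format(f) for f in range(5)]):
    data[d1][d2][d3] = random.sample(range(50000), 20000)

# JSON length as a crude stand-in for the BSON document size.
size_mb = len(json.dumps(data)) / 1e6
print("approx. size: {:.1f} MB".format(size_mb))
```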
We can then use timeit.timeit to compare the read times of pymongo and mongoengine.
We can also use pycallgraph and GraphViz to see where mongoengine is spending its time.
Here is the complete code:
    import datetime
    import itertools
    import random
    import timeit
    from collections import defaultdict

    import mongoengine as db
    from pycallgraph.output.graphviz import GraphvizOutput
    from pycallgraph.pycallgraph import PyCallGraph

    db.connect("test-dicts")


    class MyModel(db.Document):
        date = db.DateTimeField(required=True, default=datetime.date.today)
        data_dict_1 = db.DictField(required=False)


    MyModel.drop_collection()
    data_1 = ['foo', 'bar']
    data_2 = ['spam', 'eggs', 'ham']
    data_3 = ["subf{}".format(f) for f in range(5)]
    m = MyModel()
    tree = lambda: defaultdict(tree)  # http://stackoverflow.com/a/19189366/3271558
    data = tree()
    for _d1, _d2, _d3 in itertools.product(data_1, data_2, data_3):
        data[_d1][_d2][_d3] = list(random.sample(range(50000), 20000))
    m.data_dict_1 = data
    m.save()


    def pymongo_doc():
        return db.connection.get_connection()["test-dicts"]['my_model'].find_one()


    def mongoengine_doc():
        return MyModel.objects.first()


    if __name__ == '__main__':
        print("pymongo took {:2.2f}s".format(timeit.timeit(pymongo_doc, number=10)))
        print("mongoengine took", timeit.timeit(mongoengine_doc, number=10))
        with PyCallGraph(output=GraphvizOutput()):
            mongoengine_doc()
And the result shows that mongoengine is very slow compared to pymongo:
    pymongo took 0.87s
    mongoengine took 25.81118331072267
The resulting call graph shows quite clearly where the bottleneck is:

Essentially mongoengine calls the to_python method on every piece of data returned from the db. to_python is pretty slow, and in our example it gets called an insane number of times.
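To get a feel for why that hurts, here is a stdlib-only sketch of a generic per-value converter in the spirit of to_python (an illustration of the pattern, not mongoengine's actual code), timed against simply handing back the raw data:

```python
import timeit

def convert(value):
    """Recursively dispatch on every value, roughly mimicking a
    generic to_python-style pass (illustrative, not mongoengine code)."""
    if isinstance(value, dict):
        return {k: convert(v) for k, v in value.items()}
    if isinstance(value, list):
        return [convert(v) for v in value]
    return value  # leaf: one Python-level call per element

raw = {"a": {"b": list(range(100000))}}  # 100k leaf values

raw_time = timeit.timeit(lambda: raw, number=10)
convert_time = timeit.timeit(lambda: convert(raw), number=10)
print("per-value conversion is {:.0f}x slower".format(convert_time / raw_time))
```

Even with a do-nothing conversion at each leaf, the sheer number of Python-level calls dominates, which is exactly what the call graph above shows.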
Mongoengine is designed to elegantly map your document structure to Python objects. If you have very large unstructured documents (which mongodb is great for), then mongoengine isn't really the right tool and you should just use pymongo.
However, if you know the structure, you can use EmbeddedDocument fields to get slightly better performance from mongoengine. I've run a similar but not equivalent test in this gist, and the output is:
    pymongo with dict took 0.12s
    pymongo with embed took 0.12s
    mongoengine with dict took 4.3059175412661075
    mongoengine with embed took 1.1639373211854682
So you can make mongoengine faster, but pymongo is much faster still.
UPDATE
A good shortcut to the pymongo interface here is to use the aggregation framework:
    def mongoengine_agg_doc():
        return list(MyModel.objects.aggregate({"$limit": 1}))[0]