There isn't enough information here to say much, but what I can say won't fit in a comment, so I'll post it here ;-)
First and foremost, CPython's garbage collection mostly uses reference counting. gc.collect() won't do anything for you (except burn time) unless the garbage objects are involved in reference cycles (an object A is in a cycle if A can be reached from itself by following a chain of pointers). You don't create any reference cycles in the code you showed, but perhaps the database layer does.
So, after you run gc.collect(), does memory use go down at all? If not, running it is pointless.
I expect the most likely cause is that the database layer holds references to objects longer than necessary, but confirming that requires digging into the details of how the database layer is implemented.
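To make the distinction concrete, here is a minimal sketch, assuming CPython. The `Node` class and all variable names are hypothetical; they only illustrate that reference counting frees acyclic garbage immediately, while garbage in a cycle waits for gc.collect().

```python
import gc
import weakref

class Node:
    """Hypothetical stand-in for any object that can hold references."""
    def __init__(self):
        self.other = None

gc.disable()  # keep automatic cyclic collection from interfering with the demo
gc.collect()  # start from a clean slate

# No cycle: reference counting frees the object as soon as `del` runs.
plain = Node()
plain_ref = weakref.ref(plain)
del plain
freed_by_refcount = plain_ref() is None

# Cycle: a -> b -> a keeps both refcounts above zero even after `del`.
a, b = Node(), Node()
a.other, b.other = b, a
cycle_ref = weakref.ref(a)
del a, b
alive_before_collect = cycle_ref() is not None

found = gc.collect()  # returns the number of unreachable objects it found
freed_by_collect = cycle_ref() is None
gc.enable()

print(freed_by_refcount, alive_before_collect, found, freed_by_collect)
```

If gc.collect() keeps returning 0 in your program, it is finding no cyclic garbage, and calling it is not what will bring memory use down.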
One way to get hints is to print the result of sys.getrefcount() applied to various large objects:
    >>> import sys
    >>> bigobj = [1] * 1000000
    >>> sys.getrefcount(bigobj)
    2
As the docs say, the result is generally 1 larger than you might expect, because the refcount of getrefcount()'s argument is temporarily incremented by 1 simply because it is being used (temporarily) as an argument.
So if you see a refcount greater than 2, del won't free the object: something else still holds a reference to it.
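A small sketch of that situation, assuming CPython. The `Big` subclass, the `cache` dict, and `probe` are hypothetical names; `cache` stands in for whatever else (a database layer, say) might be holding on to the object.

```python
import sys
import weakref

class Big(list):
    """Hypothetical list subclass; plain lists cannot be weakly referenced."""

bigobj = Big([1] * 1000000)
probe = weakref.ref(bigobj)   # checks liveness without adding a reference

before = sys.getrefcount(bigobj)
cache = {"rows": bigobj}      # hypothetical extra reference held elsewhere
after = sys.getrefcount(bigobj)
extra_refs = after - before   # the dict entry adds exactly one reference

del bigobj                    # removes the name, not the object
alive_after_del = probe() is not None

cache.clear()                 # drop the last remaining reference
freed_after_clear = probe() is None

print(extra_refs, alive_after_del, freed_after_clear)
```

The object survives `del` for as long as the other container refers to it, and is freed only once that last reference goes away.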
Another way to get hints is to pass the object to gc.get_referrers(). This returns a list of the objects that directly refer to the argument (provided that the referrer is tracked by Python's cyclic gc).
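For instance, a minimal sketch with hypothetical names (`Row`, `cache`), assuming CPython:

```python
import gc

class Row:
    """Hypothetical stand-in for a large object returned by a database layer."""

row = Row()
cache = {"last_row": row}  # hypothetical container holding an extra reference

referrers = gc.get_referrers(row)
cache_is_referrer = any(r is cache for r in referrers)

# Print only the types; the namespace that binds the name `row` is usually
# among the referrers too, so expect more than one entry.
for r in referrers:
    print(type(r).__name__)
print(cache_is_referrer)
```

Inspecting the referrers of an object that refuses to die is often the quickest way to find out which container is keeping it alive.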
By the way, you need to be clearer about what you mean by "doesn't seem to work" and "explodes in the end." I can't guess. What exactly goes wrong? For example, is MemoryError raised? Something else? Tracebacks often yield a world of useful clues.
Tim Peters