Otherwise, objects for the whole queryset are loaded into memory all at once. You need to split your queryset into smaller, digestible chunks. The pattern for doing this is called spoonfeeding. Here is a brief implementation.
def spoonfeed(qs, func, chunk=1000, start=0):
    """Chunk up a large queryset and run func on each item.

    Works with automatic primary key fields.

    chunk -- how many objects to take on at once
    start -- PK to start from

    >>> spoonfeed(Spam.objects.all(), nom_nom)
    """
    end = qs.order_by('pk').last()
    if end is None:  # empty queryset, nothing to do
        return
    while start < end.pk:
        for o in qs.filter(pk__gt=start, pk__lte=start + chunk):
            func(o)
        start += chunk
To use this, you write a function that performs operations on your object:
def set_population_density(town):
    town.population_density = calculate_population_density(...)
    town.save()
and then run that function over your queryset:
spoonfeed(Town.objects.all(), set_population_density)
This can be further improved by using multiprocessing to execute func for multiple objects in parallel.
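A minimal sketch of that parallel variant, using the standard library's multiprocessing.Pool. The names `spoonfeed_parallel` and `square` are hypothetical, and a plain list stands in for a chunk of queryset objects; with a real Django queryset, both `func` and the objects must be picklable, and each worker needs its own database connection.

```python
from multiprocessing import Pool

def square(x):
    # Hypothetical stand-in for func; a real one would mutate and save an object.
    return x * x

def spoonfeed_parallel(items, func, chunk=1000, processes=2):
    """Apply func to items one chunk at a time, fanning each chunk
    out across a pool of worker processes. Returns func's results."""
    results = []
    with Pool(processes) as pool:
        for i in range(0, len(items), chunk):
            results.extend(pool.map(func, items[i:i + chunk]))
    return results
```

Usage mirrors the serial version: `spoonfeed_parallel(list_of_objects, set_population_density)`.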
F. Malina, Apr 04 '15 at 20:22