I am running a conversion script that stores large amounts of data in a database using the Django ORM. I use manual transaction commits to speed up the process. I have hundreds of files to process, and each file will create more than a million objects.
I am using Windows 7 64-bit. I noticed that the Python process keeps growing until it consumes more than 800 MB, and that is only for the first file!
The script iterates over the entries in the text file, reuses the same variables, and does not accumulate lists or tuples.
I read here that this is a common problem with Python (and possibly with any program), but I was hoping that Django or Python has an explicit way to reduce the memory footprint of the process ...
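To make that concrete, the kind of explicit cleanup I have in mind is something like the snippet below. It assumes that clearing Django's query log with django.db.reset_queries() (the log grows without bound while settings.DEBUG is True) and forcing a garbage collection are applicable here; I am not sure either of them actually returns memory to the OS in my case.

import gc
from django import db

def release_memory():
    # Clear Django's per-connection query log, which keeps every executed
    # query in memory while settings.DEBUG is True.
    db.reset_queries()
    # Ask Python's garbage collector to reclaim unreachable objects.
    gc.collect()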
Here is a code overview:
import sys, os
sys.path.append(r'D:\MyProject')
os.environ['DJANGO_SETTINGS_MODULE'] = 'my_project.settings'
from django.core.management import setup_environ
from convert_to_db import settings
from convert_to_db.convert.models import Model1, Model2, Model3
setup_environ(settings)
from django.db import transaction

@transaction.commit_manually
def process_file(filename):
    data_file = open(filename, 'r')
    model1, created = Model1.objects.get_or_create([some condition])
    if created:
        option.save()
    input_row_i = 0
    while 1:
        line = data_file.readline()
        if line == '':
            break
        input_row_i += 1
        # commit in batches of 5000 rows instead of once per object
        if not (input_row_i % 5000):
            transaction.commit()
        line = line[:-1]  # strip the trailing newline
        # ... parsing of the line and creation of Model2/Model3 objects omitted
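For context, the hundreds of input files mentioned above are handled one at a time; the driver is essentially just a loop like the hypothetical sketch below (the path pattern is only illustrative, not my actual layout).

import glob

# Hypothetical driver loop: each input file gets one call to process_file().
for path in sorted(glob.glob(r'D:\MyProject\data\*.txt')):
    process_file(path)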
python memory-management windows django
Jonathan Nov 27 '10 at 17:34