I have the following code that loops over a large Datastore table (~100k rows, ~30 GB):
    import logging

    from google.appengine.ext import deferred
    from google.appengine.runtime import DeadlineExceededError

    def updateEmailsInLoop(cursor=None, stats={}):
        BATCH_SIZE = 10
        try:
            # Fetch the first page, then keep paging with the returned cursor.
            rawEmails, next_cursor, more = RawEmailModel.query().fetch_page(
                BATCH_SIZE, start_cursor=cursor)
            for index, rawEmail in enumerate(rawEmails):
                stats = process_stats(rawEmail, stats)
            i = 0
            while more and next_cursor:
                rawEmails, next_cursor, more = RawEmailModel.query().fetch_page(
                    BATCH_SIZE, start_cursor=next_cursor)
                for index, rawEmail in enumerate(rawEmails):
                    stats = process_stats(rawEmail, stats)
                # Persist the stats every 100 batches.
                i = (i + 1) % 100
                if i == 99:
                    logging.info("foobar: Finished 100 more %s", str(stats))
                    write_stats(stats)
        except DeadlineExceededError:
            logging.info("foobar: Deadline exceeded")
            # Finish the batch that was interrupted, then re-queue the job
            # from the current cursor on the adminStats queue.
            for index, rawEmail in enumerate(rawEmails[index:], start=index):
                stats = process_stats(rawEmail, stats)
            if more and next_cursor:
                deferred.defer(updateEmailsInLoop, cursor=next_cursor,
                               stats=stats, _queue="adminStats")
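For completeness, the job is kicked off with a plain deferred call (a sketch; the admin handler around it is omitted, and the queue name matches the one in the code above):

    from google.appengine.ext import deferred

    # Initial kickoff: no cursor yet and stats start empty; on a deadline
    # the function re-defers itself with the latest cursor (see above).
    deferred.defer(updateEmailsInLoop, _queue="adminStats")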
However, I keep getting the following error:
While processing this request, the process processing this request found that it was using too much memory and was interrupted. This will likely cause the new process to be used for the next request to your application. If you often see this message, you may have a memory leak in your application.
...and sometimes:
Exceeding the limited private memory limit of 128 MB with 154 MB after serving only 9 requests
I changed my code so that only 10 entries are fetched at any given time, so why am I still running out of memory?
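To see where the memory goes, I was planning to log instance memory after each page, roughly like this (a sketch; assuming google.appengine.api.runtime is available, as on the Python 2 standard runtime):

    import logging

    from google.appengine.api import runtime

    # Hypothetical diagnostic: log the instance's memory usage after each
    # fetch_page call so per-batch growth shows up in the logs.
    logging.info("foobar: memory after batch: %s", runtime.memory_usage())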