I have a fairly simple task. After a migration added a new field (a repeated, compound property) to an existing NDB entity kind (~100K entities), I need to set a default value for it.
I tried this code first:
    q = dm.E.query(ancestor=dm.E.root_key)
    for user in q.iter(batch_size=500):
        user.field1 = [dm.E2()]
        user.put()
But it fails with errors like these:
    2015-04-25 20:41:44.792 /**** 500 599830ms 0kb AppEngine-Google; (+http://code.google.com/appengine) module=default version=1-17-0
    W 2015-04-25 20:32:46.675 suspended generator run_to_queue(query.py:938) raised Timeout(The datastore operation timed out, or the data was temporarily unavailable.)
    W 2015-04-25 20:32:46.676 suspended generator helper(context.py:876) raised Timeout(The datastore operation timed out, or the data was temporarily unavailable.)
    E 2015-04-25 20:41:44.475 Traceback (most recent call last):
      File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/runtime/wsgi.py", line 267, in
The job runs in a separate task queue, so it has at least 10 minutes to complete, but that apparently isn't enough. The NDB warnings are also strange. Perhaps there is contention caused by concurrent (user-initiated) updates to the same entities from other instances, but I'm not sure.
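For reference, one pattern I've seen suggested is to process a small page of entities per task and re-enqueue the next page with a query cursor, so no single request comes near the deadline. This is only a sketch, not a verified fix: dm.E, dm.E2, field1 and root_key come from my code above, while fetch_page, put_multi and deferred.defer are standard App Engine NDB / deferred APIs.

```python
from google.appengine.ext import deferred, ndb

import dm  # the module holding the E / E2 models from the question

BATCH_SIZE = 100  # small enough that one batch finishes well within a request


def set_defaults(cursor=None):
    """Set field1 on one page of entities, then chain the next page."""
    q = dm.E.query(ancestor=dm.E.root_key)
    entities, next_cursor, more = q.fetch_page(BATCH_SIZE, start_cursor=cursor)
    for e in entities:
        e.field1 = [dm.E2()]
    ndb.put_multi(entities)  # one RPC per batch instead of one per entity
    if more and next_cursor:
        # Re-enqueue as a fresh task rather than looping in this request,
        # so each task stays far under the task-queue time limit.
        deferred.defer(set_defaults, next_cursor)
```

Each task then does a bounded amount of work, and a Timeout in one batch only retries that batch rather than restarting the whole scan.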
In any case, I want to know the best (and simplest) practices for such a task. I know about MapReduce, but it currently looks too complex for this job.
UPDATE:
I also tried put_multi, collecting all the entities into a list first, but GAE kills the instance as soon as it exceeds ~600 MB of memory (with a limit of 500 MB). There is simply not enough memory to hold all ~100K objects at once.
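A middle ground that might avoid both the timeout and the memory blow-up is to fetch only keys, then load and rewrite the entities in fixed-size batches. The sketch below assumes a small hypothetical chunks() helper; the commented usage refers to dm.E, dm.E2 and ndb from my code above.

```python
def chunks(seq, size):
    """Yield successive slices of `seq` containing at most `size` items."""
    for i in range(0, len(seq), size):
        yield seq[i:i + size]

# Hypothetical usage inside an App Engine task (dm.E / dm.E2 are from the
# question; keys_only fetches are much cheaper than full entities):
#
#   keys = dm.E.query(ancestor=dm.E.root_key).fetch(keys_only=True)
#   for key_batch in chunks(keys, 500):
#       entities = ndb.get_multi(key_batch)
#       for e in entities:
#           e.field1 = [dm.E2()]
#       ndb.put_multi(entities)
```

Only one batch of 500 entities is ever in memory at a time, so the instance should stay well under the 500 MB limit.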