Google AppEngine - Reading a Large Datastore

I need to read all the entries in the Google AppEngine datastore in order to do some initialization work. There are currently a lot of objects (around 80,000), and the number keeps growing. I'm starting to hit the 30-second datastore query timeout.

Are there any recommendations on how to handle these kinds of huge reads from the datastore? Any examples?

2 answers

You can solve this in several ways:

  • Run your code on the Task Queue, which has a 10-minute deadline instead of 30 seconds (which is closer to 60 seconds in practice). The easiest way to do this is a DeferredTask; see the sketch after this list.

    Warning: a DeferredTask must be serializable, so it is hard to pass complex data into it. Also, do not make it an inner class.

  • Look into backends. Requests served by a backend instance have no time limit.

  • Finally, if you need to break up a large task and execute it in parallel, then look at mapreduce.
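
For the task-queue option, here is a minimal sketch that pairs the deferred library with query cursors so each task only processes one batch before re-enqueuing itself. MyModel, init_entity and BATCH_SIZE are placeholders (not from the original answer) standing in for your own model, per-entity initialization and batch size:

import logging

from google.appengine.ext import db
from google.appengine.ext import deferred

BATCH_SIZE = 200


class MyModel(db.Model):
    initialized = db.BooleanProperty(default=False)


def init_entity(entity):
    # Placeholder for whatever initialization work each entity needs.
    entity.initialized = True
    entity.put()


def process_all(cursor=None):
    query = MyModel.all()
    if cursor:
        query.with_cursor(cursor)

    batch = query.fetch(BATCH_SIZE)
    for entity in batch:
        init_entity(entity)

    if len(batch) == BATCH_SIZE:
        # More entities may remain: re-enqueue with the current cursor so
        # each task stays well inside its deadline.
        deferred.defer(process_all, query.cursor())
    else:
        logging.info("Finished processing all entities")


# Kick it off from any request handler with: deferred.defer(process_all)

Note that process_all is defined at module level (not as an inner class or closure), in line with the warning above.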


This StackExchange answer served me well:

Expired requests and appengine

I had to modify it slightly to make it work for me:

import logging


def loop_over_objects_in_batches(batch_size, object_class, callback):
    # object_class is expected to support count() and slicing,
    # e.g. a db.Query such as MyModel.all().
    num_els = object_class.count()
    num_loops = num_els / batch_size
    remainder = num_els - num_loops * batch_size
    logging.info("Calling batched loop with batch_size: %d, num_els: %s, num_loops: %s, "
                 "remainder: %s, object_class: %s, callback: %s" %
                 (batch_size, num_els, num_loops, remainder, object_class, callback))
    offset = 0
    # Process the full batches first.
    while offset < num_loops * batch_size:
        logging.info("Processing batch (%d:%d)" % (offset, offset + batch_size))
        query = object_class[offset:offset + batch_size]
        for q in query:
            callback(q)
        offset = offset + batch_size
    # Then process whatever is left over.
    if remainder:
        logging.info("Processing remainder batch (%d:%d)" % (offset, num_els))
        query = object_class[offset:num_els]
        for q in query:
            callback(q)
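
As a usage sketch, assuming the hypothetical MyModel kind from the example above, the helper can be driven with a db.Query and a simple callback:

def mark_initialized(entity):
    # Example callback; replace with your own initialization work.
    entity.initialized = True
    entity.put()

# MyModel.all() returns a db.Query, which supports both count() and slicing,
# matching what loop_over_objects_in_batches expects of object_class.
loop_over_objects_in_batches(100, MyModel.all(), mark_initialized)

Keep in mind that fetching by large offsets still makes the datastore skip over every preceding result, so for very large kinds the cursor-based pattern from the first answer scales better.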
