I need to load all of the objects in a table into memory up front (not lazily on demand), because my high-speed graph-traversal algorithms need them resident in RAM.
To speed up the download I want to parallelize it: run several queries in parallel threads, each pulling roughly 800 entities from the datastore.
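To be concrete about the parallel part, here is a minimal sketch of what I have in mind, with a hypothetical `fetchChunk` stub standing in for the real datastore query (the stub just fabricates 800 ids per chunk so the skeleton is self-contained):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelLoader {
    // Hypothetical stub: in the real code this would execute one datastore
    // query and return the ~800 entities belonging to chunk i.
    static List<Long> fetchChunk(int chunk) {
        List<Long> out = new ArrayList<>();
        for (long k = chunk * 800L; k < (chunk + 1) * 800L; k++) out.add(k);
        return out;
    }

    public static void main(String[] args) throws Exception {
        int chunks = 4;
        ExecutorService pool = Executors.newFixedThreadPool(chunks);
        List<Future<List<Long>>> futures = new ArrayList<>();
        for (int i = 0; i < chunks; i++) {
            final int chunk = i;
            futures.add(pool.submit(() -> fetchChunk(chunk)));
        }
        // Merge all chunks into the single in-memory collection the
        // graph algorithms will traverse.
        List<Long> all = new ArrayList<>();
        for (Future<List<Long>> f : futures) all.addAll(f.get());
        pool.shutdown();
        System.out.println(all.size()); // 3200
    }
}
```

The open question is entirely inside `fetchChunk`: how to formulate the per-chunk queries.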
QuerySplitter would serve exactly this purpose, but we work in the flexible environment and therefore use the App Engine SDK, not the client libraries.
MapReduce comes up in searches, but it is not aimed at simply loading data into memory. Memcache is somewhat close, but for high-speed access I need all of these objects as a dense network in the RAM of my own JVM application.
MultiQueryBuilder might do this: it offers parallelism by executing the parts of a query in parallel.
Whichever of these three approaches (or some other one) is used, the hardest part is defining the filters, or some other form of splitting, that divide the table (view) into chunks of roughly 800 entities each. I could create filters saying "entities 1 to 800", "801 to 1600", and so on, but I know that offset-based paging like this is impractical. So how should the split be done?
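Just to illustrate the naive scheme I mean (and that I know is impractical), here is a self-contained sketch that only computes the (offset, limit) windows; in Datastore an offset still causes the skipped entities to be read server-side, which is why I am looking for a better splitting criterion:

```java
import java.util.ArrayList;
import java.util.List;

public class Windows {
    // Compute the naive (offset, limit) pairs covering `total` entities
    // in chunks of `chunk`. This is the scheme described above, not a
    // recommendation: Datastore offsets scale poorly.
    static List<int[]> windows(int total, int chunk) {
        List<int[]> out = new ArrayList<>();
        for (int off = 0; off < total; off += chunk)
            out.add(new int[]{off, Math.min(chunk, total - off)});
        return out;
    }

    public static void main(String[] args) {
        for (int[] w : windows(2000, 800))
            System.out.println("offset=" + w[0] + " limit=" + w[1]);
        // offset=0 limit=800
        // offset=800 limit=800
        // offset=1600 limit=400
    }
}
```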