I am currently pushing a long-running job onto the TaskQueue to compute the relationships between NDB entities in a data warehouse.
Basically, this task processes several lists of entity keys that have to be matched against another query, using the node_in_connected_nodes function of the GetConnectedNodes class:
    class GetConnectedNodes(object):
        """Class for getting the connected nodes from a list of nodes in a paged way"""
        def __init__(self, list, query):
Here, Node has a repeated connections property that holds a list with the key identifiers of other Nodes, and a corresponding sources list for each connection.
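For reference, the model has roughly this shape (the property names come from the description above; the concrete property types are just an illustration and may differ from my real model):

    from google.appengine.ext import ndb

    class Node(ndb.Model):
        # Key identifiers of the Nodes this one is connected to.
        connections = ndb.StringProperty(repeated=True)
        # Source of each connection, parallel to the connections list.
        sources = ndb.StringProperty(repeated=True)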
The results are stored in the blobstore.
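Since the snippet above got cut off, here is a minimal illustration of the kind of paged lookup I mean. Only the class name, the node_in_connected_nodes name and the paging idea come from the real code; the parameter names, page_size and the method bodies are simplified stand-ins:

    class GetConnectedNodes(object):
        """Gets the connected nodes for a list of nodes in a paged way."""

        def __init__(self, node_list, query, page_size=100):
            self.node_list = node_list   # already-fetched Node entities
            self.query = query           # an ndb.Query over candidate Nodes
            self.page_size = page_size

        def node_in_connected_nodes(self):
            """Yield every entity from the query whose key id appears in the
            connections of any node in node_list, fetching the query in pages."""
            connected_ids = set()
            for node in self.node_list:
                connected_ids.update(node.connections)

            cursor, more = None, True
            while more:
                page, cursor, more = self.query.fetch_page(
                    self.page_size, start_cursor=cursor)
                for candidate in page:
                    if candidate.key.id() in connected_ids:
                        yield candidate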
The problem I am running into is that after each iteration of that matching function, the memory is somehow not freed. The following log shows the memory used by App Engine just before a new GetConnectedNodes instance is created:
    I 2012-08-23 16:58:01.643 Prioritizing HGNC:4839 - mem 32
    I 2012-08-23 16:59:21.819 Prioritizing HGNC:3003 - mem 380
    I 2012-08-23 17:00:00.918 Prioritizing HGNC:8932 - mem 468
    I 2012-08-23 17:00:01.424 Prioritizing HGNC:24771 - mem 435
    I 2012-08-23 17:00:20.334 Prioritizing HGNC:9300 - mem 417
    I 2012-08-23 17:00:48.476 Prioritizing HGNC:10545 - mem 447
    I 2012-08-23 17:01:01.489 Prioritizing HGNC:12775 - mem 485
    I 2012-08-23 17:01:46.084 Prioritizing HGNC:2001 - mem 564
    C 2012-08-23 17:02:18.028 Exceeded soft private memory limit with 628.609 MB after servicing 1 requests total
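For context, the mem value in each line is the instance memory in MB, read just before the new GetConnectedNodes is created. The sketch below shows how such a number can be obtained from the App Engine runtime API; the exact call is from memory and my actual logging code may differ:

    import logging
    from google.appengine.api import runtime

    def log_memory(label):
        # memory_usage().current() reports the instance's current memory
        # footprint in MB.
        logging.info('%s - mem %d', label, runtime.memory_usage().current())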
Apart from some fluctuations, the memory just keeps growing, even though none of the previous values are reachable any more. I found it pretty hard to debug this or to figure out whether there is a memory leak somewhere, but I seem to have traced it to this class. I would appreciate any help.
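For what it is worth, one thing I still want to rule out is NDB's in-context cache, which keeps a reference to every entity fetched during a request. The sketch below (the run_all helper and the batches structure are made up for illustration) shows what I mean by clearing that cache between iterations; I have not confirmed that this is the culprit:

    from google.appengine.ext import ndb

    def run_all(batches):
        ctx = ndb.get_context()
        for node_list, query in batches:
            worker = GetConnectedNodes(node_list, query)
            # ... consume worker.node_in_connected_nodes() and store results ...

            # Drop the entities the in-context cache still references so they
            # can be garbage collected before the next iteration.
            ctx.clear_cache()

    # Alternatively, switch the in-context cache off before doing any fetches:
    # ndb.get_context().set_cache_policy(False)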
Francisco Roque