Google App Engine - task queue takes too long to start running arbitrary tasks

Our customers are having problems with our App Engine Python application, which relies on task queues to generate reports and display them as soon as they are finished. This workaround for GAE's well-known request delays and timeouts has worked well for us until recently.
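For context, the report flow looks roughly like the sketch below. This is not the actual application code; the handler URL, queue name and parameters are hypothetical placeholders, only the general pattern (deferring report generation to a push queue) comes from the description above.

```python
# Minimal sketch: defer report generation to a push queue so the user-facing
# request stays short. URL, queue name and params are placeholders.
from google.appengine.api import taskqueue

def request_report(report_id):
    # Enqueue the heavy work; a worker handler at /tasks/generate_report
    # builds the report and marks it ready so the UI can display it.
    taskqueue.add(
        url='/tasks/generate_report',
        queue_name='reports',
        params={'report_id': report_id},
    )
```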

Last week, users started complaining about how long they had to wait for their reports. It used to be no more than a minute, but now it can take more than 10 minutes.

I have not been able to reproduce the problem myself, but looking at the task queue, I can see that these tasks simply do not start.

Below is a screenshot of one of the queues (not the one that generates reports, but the problem occurs in all queues).

http://www.clipular.com/c/4829223501430784.png?k=QaP2kedZm6NcvrKzwVSJqq2YI1g

You can see that there are no running tasks, yet the only task in the queue still had not started after 7 minutes of waiting. Also note the ETA: it says the task should have started in the past. It did run eventually, but why didn't it start earlier?

Causes I have already ruled out:

  • Not enough resources or instances: this happens even after midnight, when we receive only a few requests.
  • Bad queue configuration: it happens across all the different queue configurations we have. For example: max rate = 350/s, bucket size = 400, max concurrent = 400 (see the queue.yaml sketch below).
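For reference, those values correspond to a queue.yaml entry like the following. The queue name is a placeholder; only the rate, bucket size and concurrency figures come from the question.

```yaml
# Sketch of the kind of queue.yaml entry described above.
queue:
- name: reports
  rate: 350/s
  bucket_size: 400
  max_concurrent_requests: 400
```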
1 answer

The problem stopped by itself, without any action on our part. Apparently it was caused by some kind of failure on the GAE servers. It lasted about two weeks.

However, one thing that can mitigate the problem is to distribute your tasks across separate queues, if possible.
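For example, instead of sending everything to a single queue, the work can be spread over several named queues defined in queue.yaml. This is only an illustrative sketch; the queue names and the selection scheme are assumptions, not the asker's actual setup.

```python
# Illustrative sketch: spread tasks over several queues so one stuck queue
# does not delay everything. Queue names are placeholders.
from google.appengine.api import taskqueue

REPORT_QUEUES = ['reports-1', 'reports-2', 'reports-3']

def enqueue_report(report_id):
    # Pick a queue based on the report id to spread the load.
    queue_name = REPORT_QUEUES[hash(report_id) % len(REPORT_QUEUES)]
    taskqueue.add(
        url='/tasks/generate_report',
        queue_name=queue_name,
        params={'report_id': report_id},
    )
```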

--- edit ---

It is happening again. The only thing we could do to work around the problem was to write a script that keeps requesting that overdue tasks be run whenever it finds them. It runs on https://console.cloud.google.com/appengine/taskqueues via Tampermonkey.
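The script itself is a browser userscript and is not reproduced here. As a rough server-side equivalent of the same idea, if the queues were managed through the newer Cloud Tasks API (a different API from the console the script drives), overdue tasks could be force-dispatched with RunTask. Project, location and queue names below are placeholders.

```python
# Hypothetical sketch of the "keep pressing Run now" workaround using the
# newer Cloud Tasks API (google-cloud-tasks >= 2.0). Names are placeholders.
from datetime import datetime, timezone
from google.cloud import tasks_v2

client = tasks_v2.CloudTasksClient()
parent = client.queue_path('my-project', 'us-central1', 'reports')

now = datetime.now(timezone.utc)
for task in client.list_tasks(parent=parent):
    # schedule_time is a timezone-aware datetime; anything scheduled in the
    # past but still sitting in the queue is overdue.
    if task.schedule_time and task.schedule_time < now:
        # Ask Cloud Tasks to dispatch the overdue task immediately.
        client.run_task(name=task.name)
```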
