Asynchronous background processes with web2py

I need to run a heavy (time- and memory-intensive) process asynchronously in a web2py application, started from inside a controller method.

My specific use case is to call a process via stdlib.subprocess and wait for it to exit without blocking the web server, but I am open to alternative methods.

  • Practical examples are a plus.
  • Third party library recommendations are welcome.
  • Cron scheduling is not required or wanted.
3 answers

Assuming you need to run several, possibly simultaneous, instances of the background task, the solution is a task queue. I have heard good things about Celery and RabbitMQ if you are looking for third-party options, and web2py includes its own task queue, which may be sufficient for your needs.

With either tool, you define a function that encapsulates the operation you want the background process to perform. Then bring the task queue workers online. The web2py manual and forums indicate that this can be done with an @reboot statement in the web2py cron system, which is triggered whenever the web server starts. There are probably other ways to start the workers if that is unsuitable.

In your controller, insert a task into the task queue, passing any necessary parameters as inputs to the function. The background function does not run in the same environment as the controller, so it has no access to the session, the database, and so on unless you explicitly pass the appropriate values to the task function.

Next, get the output of the background operation back to the user. When you insert a task into the task queue, you should get back a unique identifier for that task. Then implement controller logic (either an action that answers an AJAX call, or a page that keeps refreshing until the task completes) that calls the task queue API to check the status of the specified task. If the task status is "completed", return the data to the user. If not, keep waiting.

You might want to look at the book's section on running tasks in the background. You can use the new scheduler or build a homemade task queue (see the email example). There is also a web2py-celery plugin, although I'm not sure what state it is in.


This is harder than you might expect. Note the warnings about deadlocks in the stdlib.subprocess documentation. It's easy if you don't mind blocking: use Popen.communicate. To avoid blocking, you can manage the process with stdlib.subprocess from a separate thread.
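A minimal sketch of the thread approach, assuming Python 3.7+ (for `text=True`). The helper name `run_in_background` and the `on_done` callback are hypothetical, introduced only for illustration. The blocking `Popen.communicate` call happens inside the worker thread, so the caller returns right away, and because `communicate` drains both pipes it avoids the pipe-buffer deadlock the documentation warns about.

```python
import subprocess
import sys
import threading


def run_in_background(cmd, on_done):
    """Run cmd in a worker thread; call on_done(returncode, stdout, stderr) at exit."""

    def _target():
        proc = subprocess.Popen(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
        )
        # communicate() reads stdout and stderr to the end, then waits for exit.
        stdout, stderr = proc.communicate()
        on_done(proc.returncode, stdout, stderr)

    thread = threading.Thread(target=_target, daemon=True)
    thread.start()
    return thread  # the caller is not blocked; join() only if it wants to wait
```

For example, `run_in_background([sys.executable, "-c", "print('done')"], callback)` returns immediately, and `callback` later receives the exit code and captured output. Note that the callback runs in the worker thread, so anything it touches (a results dict, a database row) needs to be safe to access from there.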

My favorite way to handle subprocesses is Twisted's spawnProcess. But it's not easy to get Twisted to play well with other frameworks.
