Long-running tasks with Django

My goal is to create an application that can perform lengthy, mainly system-level, tasks, for example:

  • checking out code from repositories,
  • copying directories between different locations,
  • etc.

The problem is that I need to control these tasks from a web browser, but they must keep running independently of it. For example, after starting a checkout/copy action, closing the web browser should not interrupt the action. So when I return to the site, I should see that the copying is still in progress, or that another action started while the browser was closed...

I have looked at various tools like RabbitMQ + Celery, Twisted, Pyro, and XML-RPC, but I don't know whether any of them suits my needs. Has anyone encountered similar requirements when building a Django app? Please let me know if there are any methods/packages I should know about. Code samples would also be more than welcome!

Thanks in advance for your suggestions!

(And sorry for my poor English. I'm working on it.)

+8
django process
3 answers

Basically, you need a process that runs outside of the request. The absolute easiest way to do this (at least on a Unix-like operating system) is fork():

     import os, sys

     if os.fork() == 0:
         do_long_thing()
         sys.exit(0)
     # ... continue with the request ...

This approach has some drawbacks (for example, if the server crashes, the "long thing" will be lost)... This is where something like Celery can come in handy. It will keep track of the tasks that need to run and of their results (success/failure/something else), and it makes it easy to run tasks on other machines.

Using Celery with a Redis backend (see the Kombu Redis transport) is very simple, so I would recommend looking there first.

+6

You probably need a process outside of the request/response cycle. If so, Celery with a Redis backend is what I would consider, as it works well with Django (as David Wolever suggested).

Another option is to create Django management commands and then use cron to run them at scheduled intervals.

+3

Well, I managed to set up the queue mechanism in both configurations:

  • Django database tables (using django-celery + django-kombu) with:

     import djcelery
     djcelery.setup_loader()
     BROKER_TRANSPORT = "django"
  • Redis (using django-celery) with:

     import djcelery
     djcelery.setup_loader()
     BROKER_URL = "redis://localhost:6379/0"
     CELERY_RESULT_BACKEND = "redis"
     CELERY_REDIS_HOST = "localhost"
     CELERY_REDIS_PORT = 6379
     CELERY_REDIS_DB = 0

and I have three more questions:

  • HOW should I use this queue mechanism in my project, where I need to emit log output as the process runs, carry the process through several steps, and show the generated logs (from the very beginning) to the user, even if they reopen the web browser during the process? Sorry, the answer is not obvious to me...

  • Should I use redis config with:

     CELERY_RESULT_BACKEND = "redis" 

    or rather

     BROKER_BACKEND = "djkombu.transport.pyredis.Transport" 

    in my case?

  • Is there any difference between using:

     BROKER_TRANSPORT = "django" 

    and

     BROKER_BACKEND = "djkombu.transport.DatabaseTransport" 

    in the first case? I don't see any difference...

It looks like in questions 2 and 3 I don't understand the difference between RESULT, TRANSPORT, and BROKER. Thanks in advance for your suggestions!
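Regarding the first question, one common pattern (a sketch under my own assumptions, not something from this thread) is to have the task append its output to a per-task log that the Django view re-reads on every page load; a file-based version looks like this, and a Redis list (RPUSH/LRANGE) works the same way:

```python
import os
import tempfile

# Assumption: any directory shared by the worker and the web process will do.
LOG_DIR = tempfile.gettempdir()


def log_path(task_id):
    return os.path.join(LOG_DIR, "task-%s.log" % task_id)


def append_log(task_id, line):
    # Called by the long-running task after each step it completes.
    with open(log_path(task_id), "a") as f:
        f.write(line + "\n")


def read_log(task_id):
    # Called by the Django view; returns everything logged so far,
    # so the user sees the full output even after reopening the browser.
    try:
        with open(log_path(task_id)) as f:
            return f.read()
    except FileNotFoundError:
        return ""
```

The view only needs the task id (stored in the session or the URL) to show the complete log, no matter when the browser was closed and reopened.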

+1
