Python framework for task execution and dependency handling

I need a structure that will allow me to do the following:

  • Dynamically define tasks (I will read an external configuration file and create the tasks from it; a task might be, for example, calling an external command)

  • Provide a way to specify dependencies between tasks (for example, task A will be executed only after task B has completed)

  • Be able to run tasks in parallel in several processes, wherever the order of execution (i.e. the interdependencies between tasks) allows it

  • Allow a task to depend on some external event (I don't know exactly how to describe it, but some tasks run for a while before they produce results, for example a background job; I need to be able to specify that certain tasks should depend on that event, i.e. on the background job finishing)

  • Support undo/rollback: if one of the tasks fails, try undoing everything that was done earlier (I do not expect this to be implemented in every framework, but I think it is worth asking about...)

So obviously this looks more or less like a build system, but I cannot seem to find one that lets me create tasks dynamically; most of the tools I have found expect everything to be predefined in a Makefile.

Any ideas?

python build-process build-automation build jobs
3 answers

I did a bit more research and came across doit, which provides the basic functionality I need without being overkill (not that Celery could not have solved the problem, but doit is a better fit for my use case).
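
For the record, here is a minimal sketch of the kind of dynamic task creation doit supports; the tasks.json format and the task/field names are my own invented example, not something doit prescribes:

    # dodo.py -- minimal sketch, assuming a hypothetical tasks.json such as:
    # [{"name": "a", "cmd": "echo A", "deps": []},
    #  {"name": "b", "cmd": "echo B", "deps": ["a"]}]
    import json

    def task_configured():
        """Yield one doit sub-task per entry in the config file."""
        with open("tasks.json") as f:
            entries = json.load(f)
        for entry in entries:
            yield {
                "name": entry["name"],
                "actions": [entry["cmd"]],  # shell command to run
                # sub-tasks are addressed as "<basename>:<subname>"
                "task_dep": ["configured:" + dep for dep in entry["deps"]],
            }

Running `doit -n 4` then executes independent tasks in parallel processes while still respecting the declared dependencies.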


Another option is to use make.

  • Write the Makefile by hand, or let a Python script generate it (a sketch of the generator approach follows below)
  • Use meaningful intermediate output files for each step
  • Run make, which then spawns the processes. Each process would be a Python (build) script invoked with parameters that tell it which files to work on and what to do.
  • Parallel execution is supported with -j
  • It can also delete the output file of a failed task (see the .DELETE_ON_ERROR directive)

This sidesteps some Python parallelization issues (the GIL, serialization), since each task runs in its own process. Obviously, this is easiest on *nix platforms.
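
To illustrate the generator approach from the first bullet, a hypothetical script; the tasks.json format (name/cmd/deps fields) is an invented example, and .DELETE_ON_ERROR is the GNU make directive that removes the target of a failed recipe:

    # generate_makefile.py -- sketch: emit a Makefile from a task config,
    # assuming entries like {"name": "a", "cmd": "echo A", "deps": []}.
    import json

    with open("tasks.json") as f:
        entries = json.load(f)

    with open("Makefile", "w") as mk:
        mk.write(".DELETE_ON_ERROR:\n\n")  # drop targets of failed recipes
        stamps = " ".join(e["name"] + ".done" for e in entries)
        mk.write("all: " + stamps + "\n\n")
        for e in entries:
            deps = " ".join(d + ".done" for d in e["deps"])
            mk.write("%s.done: %s\n" % (e["name"], deps))
            mk.write("\t%s\n" % e["cmd"])  # recipe lines need a leading tab
            mk.write("\ttouch $@\n\n")     # stamp file marks completion

`python generate_makefile.py && make -j4` then builds the whole graph, running independent tasks in up to four parallel processes.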


AFAIK, there is no framework in Python that does exactly what you are describing. So your options are either to build something yourself, or to bend some of your requirements and model them with an existing tool. This smells like a job for Celery.

  • You could have a Celery task that reads a configuration file containing the source code of Python functions, then use eval or exec to run them (note that ast.literal_eval only evaluates literals, so it cannot execute function code).

  • Celery provides ways to compose tasks into subtasks and workflows (chains, groups, chords), so if you know your dependencies you can model them accordingly (see the sketch after this list).

  • Provided the interdependencies allow it, you can distribute the tasks across as many worker processes and machines as you want.

  • For the external-event requirement, you can periodically poll the background job's result and then run the tasks that depend on it.

  • Undo/rollback: this can be tricky and depends on what exactly you want to roll back. Results? State?
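
For the dependency point above, a minimal sketch of what this could look like with Celery's canvas primitives (chain and group); the broker URL, task body, and commands are placeholder assumptions, not anything from the question:

    # pipeline.py -- hypothetical sketch of modeling the dependencies with
    # Celery's canvas primitives; broker URL and commands are assumptions.
    import subprocess

    from celery import Celery, chain, group

    app = Celery("pipeline",
                 broker="redis://localhost:6379/0",
                 backend="redis://localhost:6379/0")

    @app.task
    def run_command(cmd):
        """Run one external command (one 'task' in the question's sense)."""
        subprocess.run(cmd, shell=True, check=True)
        return cmd

    # B1 and B2 both depend on A but not on each other, so a group lets
    # workers execute them in parallel once the chain reaches them.
    workflow = chain(
        run_command.si("echo A"),
        group(run_command.si("echo B1"),
              run_command.si("echo B2")),
    )

    result = workflow.apply_async()

Polling result.ready() periodically (or chaining on a callback task) would cover the requirement that other tasks wait for a long-running background job.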

