I have a number of tasks to complete; there are no dependencies between them. I am looking for a tool that will help me distribute these tasks across machines. The only restriction is that each machine should run only one task at a time. I am trying to maximize throughput because the jobs are not well balanced. My current hacked-together shell scripts are inefficient because I pre-build a queue for each machine and cannot move jobs from the queue of a heavily loaded machine to one that is sitting idle, having already finished everything in its own queue.
Previous suggestions have included SLURM, which seems like overkill, and LoadLeveler, which is even more so.
GNU Parallel looks like almost exactly what I want, but the remote machines do not speak SSH; jobs are started through a custom launcher (which has no queueing capabilities). What I would like is GNU Parallel where the machine name can simply be substituted into a shell script on the fly, right before the job is dispatched.
So in short:
- List of jobs + list of machines that can accept them: maximize throughput. The closer to the shell, the better.
In the worst case, something can be hacked together with lock files in bash, but I feel there must be a better solution out there.
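For reference, the lock-based fallback I have in mind would look roughly like the sketch below: one worker loop per machine, all pulling from a single shared queue file, so an idle machine simply takes the next job. This is only a sketch under assumptions. `launch` is a hypothetical stand-in for the custom launcher (here it just runs the job locally), the function names are made up, the queue file is assumed to hold one command per line with no blank lines, and the lock is a plain `mkdir` directory that would be left behind if a worker were killed while holding it.

```shell
#!/usr/bin/env bash
# Sketch: shared job queue, one worker per machine, one job at a time each.

# Hypothetical stand-in for the custom (non-SSH) launcher.
# Usage: launch MACHINE COMMAND [ARGS...]; here it only runs the command locally.
launch() {
    local machine=$1; shift
    "$@"
}

# Atomically remove the first line of the queue file and print it.
# Returns non-zero when the queue is empty.
pop_job() {
    local queue=$1 lock="$1.lock" job
    # mkdir either creates the directory or fails, so it works as a lock.
    until mkdir "$lock" 2>/dev/null; do sleep 0.1; done
    job=$(head -n 1 "$queue")
    if [ -n "$job" ]; then
        tail -n +2 "$queue" > "$queue.tmp" && mv "$queue.tmp" "$queue"
    fi
    rmdir "$lock"
    [ -n "$job" ] && printf '%s\n' "$job"
}

# Keep taking jobs for one machine until the queue is empty.
worker() {
    local machine=$1 queue=$2 job
    while job=$(pop_job "$queue"); do
        launch "$machine" bash -c "$job"
    done
}

# Start one worker per machine in the background and wait for all of them.
run_all() {
    local queue=$1; shift
    local machine
    for machine in "$@"; do
        worker "$machine" "$queue" &
    done
    wait
}
```

It would be called as something like `run_all jobs.txt nodeA nodeB nodeC` (file and machine names are placeholders). It works, but it is exactly the kind of hand-rolled queue I would rather replace with an existing tool.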