Linux bash: running many small jobs in multiple threads / processes

I have a script that executes some data processing command 10K times.

 foreach f (folderName/input*.txt)
   mycmd $f
 end

I measured the runtime of each "mycmd $f" at 0.25 s. With 10K runs, that adds up to more than 1 hour. I am running it on a 16-core Nehalem machine. It is a huge waste not to use the remaining 15 cores.

I have tried & and sleep, but the script just dies with a warning or error at around 3900 iterations, see below. The shorter the sleep, the faster it dies.

 foreach f (folderName/input*.txt)
   mycmd $f & ; sleep 0.1
 end

There must be a better way. Note: I would prefer a shell script solution; let's not wander into C/C++ territory.

thanks

Hi

+4
4 answers

Pipe the file list to

 xargs -n 1 -P 16 mycmd 

For instance:

 echo folderName/input*.txt | xargs -n 1 -P 16 mycmd 
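
If any file name contains spaces or quotes, the echo form above splits or mangles it. A minimal sketch of a NUL-delimited variant follows. It assumes an xargs that supports -0 and -P (GNU and BSD xargs do; neither flag is required by POSIX). The scratch directory, the sample file names, and the use of wc -c in place of mycmd are assumptions made only so the example can run anywhere.

```shell
#!/usr/bin/env bash
# Sketch only: "wc -c" is a stand-in for mycmd, and the files below are
# made-up sample data created in a scratch directory.
set -eu

workdir=$(mktemp -d)
mkdir "$workdir/folderName"
for i in 1 2 3 4 5; do
  # The space in the name is deliberate: it would break the echo version.
  echo "data $i" > "$workdir/folderName/input $i.txt"
done

# printf '%s\0' emits each name followed by a NUL byte.
# -0   : xargs reads NUL-separated names
# -n 1 : one file per invocation
# -P 16: up to 16 invocations run at the same time
results=$(printf '%s\0' "$workdir"/folderName/input*.txt \
  | xargs -0 -n 1 -P 16 wc -c)

# One output line per processed file; $(( )) strips any padding from wc -l.
processed=$(( $(printf '%s\n' "$results" | wc -l) ))
echo "processed $processed files"

rm -rf "$workdir"
```

With the real command, only the last part of the pipeline changes: replace wc -c with mycmd and point the glob at the real folderName.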
+5

There are several other solutions, each using one of the following applications:

xjobs

Parallel

PPSS - Script Parallel Processing Shell

runpar.sh

+1

Submit the jobs with batch; that should take care of load balancing and resource starvation.

 for f in folderName/input.*; do
 batch <<____HERE
 mycmd "$f"
 ____HERE
 done

(Not 100% sure the quotes are correct and/or useful.)

0

With GNU Parallel you can do:

 parallel mycmd ::: folderName/input*.txt 
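
To match the 16 cores mentioned in the question, the number of simultaneous jobs can be capped with -j. The sketch below is an illustration under assumptions: it first checks that the parallel on the PATH is really GNU Parallel (some systems ship a different tool under the same name), it uses wc -c in place of mycmd, and it creates made-up sample files in a scratch directory.

```shell
#!/usr/bin/env bash
# Sketch only: "wc -c" is a stand-in for mycmd; the sample files are made up.
set -eu

workdir=$(mktemp -d)
mkdir "$workdir/folderName"
for i in 1 2 3 4; do
  echo "data $i" > "$workdir/folderName/input$i.txt"
done

# Capture the version banner first, so a missing binary does not abort the script.
version=$(parallel --version 2>/dev/null || true)

case $version in
  *"GNU parallel"*)
    # -j 16 runs at most 16 jobs at once; ::: introduces the argument list.
    count=$(( $(parallel -j 16 wc -c ::: "$workdir"/folderName/input*.txt | wc -l) ))
    echo "processed $count files"
    ;;
  *)
    echo "GNU Parallel not found; nothing was run" >&2
    count=skipped
    ;;
esac

rm -rf "$workdir"
```

GNU Parallel groups each job's output by default, so lines from different jobs do not interleave.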

From: http://git.savannah.gnu.org/cgit/parallel.git/tree/README

= Full installation =

A complete installation of GNU Parallel is simple:

 ./configure && make && make install 

If you are not root, you can add ~/bin to your path and install in ~/bin and ~/share:

 ./configure --prefix=$HOME && make && make install 

Or, if your system lacks "make", you can simply copy src/parallel src/sem src/niceload src/sql to a directory in your path.

= Minimum installation =

If you just need parallel and do not have "make" installed (the system may be old or Microsoft Windows):

 wget http://git.savannah.gnu.org/cgit/parallel.git/plain/src/parallel
 chmod 755 parallel
 cp parallel sem
 mv parallel sem dir-in-your-$PATH/bin/

Watch the video for a quick introduction: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1

0

Source: https://habr.com/ru/post/1415672/

