I often find that I write simple loops to perform operations on many files, for example:
for i in `find . | grep '\.xml$'`; do bzip2 "$i"; done
It seems a little depressing that my 4-core machine uses only one core... is there an easy way to add parallelism to my shell scripts?
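(For illustration, a minimal sketch of one common approach, assuming GNU or BSD xargs, whose -P flag runs up to N commands in parallel:)

find . -name '*.xml' -print0 | xargs -0 -n 1 -P 4 bzip2

Here -print0 and -0 keep filenames containing spaces safe, -n 1 passes one file per bzip2 invocation, and -P 4 keeps four invocations running at once.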
EDIT: To give a little more context for my problem; I'm sorry I wasn't clearer to start with!
I often want to run simple(ish) scripts that, for example, plot, compress or decompress, or run some program on data sets of a reasonable size (usually 100 to 10,000 files). The scripts I use to solve such problems look like the one above, but may use a different command, or even a sequence of commands.
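(For the "sequence of commands" case, a hedged sketch using plain bash background jobs, batched four at a time; process_one and plot_tool are hypothetical stand-ins, not real commands:)

# process_one is a hypothetical helper bundling the per-file steps
process_one() {
    bunzip2 -k "$1"            # step 1: decompress, keeping the original
    plot_tool "${1%.bz2}"      # step 2: plot_tool is a made-up example command
}
n=0
for f in *.xml.bz2; do
    process_one "$f" &         # run this file's pipeline in the background
    n=$((n + 1))
    # every 4th job, wait for the whole batch to finish
    if [ $((n % 4)) -eq 0 ]; then wait; fi
done
wait                           # wait for the final partial batch

Batch-style wait is simple but idles cores at the end of each batch; tools like xargs -P or GNU parallel keep all workers busy instead.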
For example, just now I ran:
for i in `find . | grep '\.xml\.bz2$'`; do find_graph -build_graph "$i.graph" "$i"; done
So my problems are by no means bzip2-specific! (Although parallel bzip2 looks cool; I intend to use it in the future.)
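(The same xargs pattern from above covers this case too; a sketch assuming GNU xargs, where -I{} substitutes each filename wherever {} appears, even embedded in a larger argument like {}.graph:)

find . -name '*.xml.bz2' -print0 | xargs -0 -P 4 -I{} find_graph -build_graph {}.graph {}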
bash parallel-processing
Chris Jefferson