Most of the wall clock time spent unpacking a file with gunzip or gzip -d goes to I/O (reading from and writing to disk), often more than the time spent actually decompressing the data. You can take advantage of this by running several gzip jobs in the background: while one job is blocked on I/O, another can keep running instead of waiting in a queue.
So you can speed up the decompression of a whole set of files by running several gunzip processes in the background, each handling its own subset of the files.
This is easy to hack together in BASH: split the file list across several commands, start each command as a background job with &, then wait for all of the jobs to finish.
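For instance, a minimal sketch of the pattern (the .gz names here are just placeholders):

    # two decompression jobs run concurrently in the background
    gzip -d part1.gz part2.gz &
    gzip -d part3.gz part4.gz &
    # block until both background jobs have exited
    wait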
I would recommend running somewhere between 2 and 2*N tasks at the same time, where N is the number of cores or logical processors on your machine. Experiment to find the right number.
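If you want to pick the job count automatically rather than hard-code it, one option (assuming GNU coreutils' nproc is available, with getconf _NPROCESSORS_ONLN as a common fallback) is:

    # detect the number of logical processors and aim for 2*N jobs
    N=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)
    JOBS=$((2 * N))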
Here is something easy you can hack up in BASH:
#!/bin/bash

argarray=( "$@" )
len=${#argarray[@]}

# declare 4 empty array sets
set1=()
set2=()
set3=()
set4=()

# enumerate over each argument passed to the script
# and round-robin add it to one of the above arrays
i=0
while [ $i -lt $len ]
do
    if [ $i -lt $len ]; then
        set1+=( "${argarray[$i]}" )
        ((i++))
    fi
    if [ $i -lt $len ]; then
        set2+=( "${argarray[$i]}" )
        ((i++))
    fi
    if [ $i -lt $len ]; then
        set3+=( "${argarray[$i]}" )
        ((i++))
    fi
    if [ $i -lt $len ]; then
        set4+=( "${argarray[$i]}" )
        ((i++))
    fi
done

# for each non-empty set, start a background gzip job
# (the guard keeps gzip from reading stdin when a set is empty)
[ ${#set1[@]} -gt 0 ] && gzip -d "${set1[@]}" &
[ ${#set2[@]} -gt 0 ] && gzip -d "${set2[@]}" &
[ ${#set3[@]} -gt 0 ] && gzip -d "${set3[@]}" &
[ ${#set4[@]} -gt 0 ] && gzip -d "${set4[@]}" &

# wait for all background jobs to finish
wait
In the above example, the file names passed on the command line are distributed round-robin into four sets, and a separate background gzip job is started for each set. You can easily expand the script to use more jobs or to divide the files among them differently.
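If your xargs supports the -P option (GNU and BSD versions do), you can get a similar effect without hard-coding the number of sets; this sketch runs at most 4 gzip processes at a time over whatever file names you pass in:

    # null-separate the names so spaces are handled safely,
    # then run up to 4 gzip -d processes in parallel, one file each
    printf '%s\0' "$@" | xargs -0 -n 1 -P 4 gzip -d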