I have an 8-core processor with 8 GB of RAM, and I created a batch file that automates the 7-Zip CLI by iterating over most of its parameters and their values, compressing the same set of files each time, with the ultimate goal of finding the combination that results in the smallest archive.
This is very time-consuming by nature, especially when the set of files to be processed is gigabytes in size. I need a way not only to automate the process, but also to speed it up.
7-Zip works with various compression algorithms. Some are single-threaded and some are multi-threaded; some need little memory, while others need huge amounts and can even exceed the 8 GB limit. I have already written a batch file that runs the combinations in sequence and skips those that require more than 8 GB of memory.
I divided the compression algorithms into several batches to simplify the whole process. For example, PPMd compression into a 7z archive uses 1 thread and up to 1024 MB. This is my current batch:

    @echo off
    echo mem=1m 2m 3m 4m 6m 8m 12m 16m 24m 32m 48m 64m 96m 128m 192m 256m 384m 512m 768m 1024m
    echo o=2 3 4 5 6 7 8 10 12 14 16 20 24 28 32
    echo s=off 1m 2m 4m 8m 16m 32m 64m 128m 256m 512m 1g 2g 4g 8g 16g 32g 64g on
    echo x=1 3 5 7 9
    for %%x IN (9) DO for %%d IN (1024m 768m 512m 384m 256m 192m 128m 96m 64m 48m 32m 24m 16m 12m 8m 6m 4m 3m 2m 1m) DO for %%w IN (32 28 24 20 16 14 12 10 8 7 6 5 4 3 2) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.ppmd.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -m0=PPMd:mem=%%d:o=%%w -ms=%%s
    exit

x, s, o and mem are parameters, and the values listed after each of them are what 7z.exe will work with. x and s are not of interest in this case; they are the compression level and the solid block size for the archive.
This batch works fine, but it can only run one instance of 7z.exe at a time. I'm now looking for a way to run more instances of 7z.exe in parallel without exceeding 8 GB of RAM or 8 threads at a time, whichever limit is reached first, before proceeding with the next one in the sequence.
How can I improve this? I have some ideas, but I don’t know how to make them work in the batch. I was thinking of 2 additional variables that will not interact with the 7z processes, but will control when the next 7z instance starts. One variable would keep track of how many threads are currently in use, and the other would keep track of how much memory is being used. Could this work?
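A minimal sketch of that two-counter idea, assuming the thread count and memory need of each job are known before it starts. The names `usedThreads`, `usedMem`, `:tryReserve` and `:release` are hypothetical, and the sketch only does the bookkeeping: it does not launch 7z.exe or detect when an instance ends.

```batch
@echo off
setlocal
rem Limits taken from the question: 8 threads and 8192 MB.
set /a "maxThreads=8, maxMem=8192"
rem Running totals for the instances that are currently active.
set /a "usedThreads=0, usedMem=0"

rem Example: three 1-thread jobs needing 5407 MB, 1055 MB and 5378 MB.
call :tryReserve 1 5407 && echo reserved 5407 MB
call :tryReserve 1 1055 && echo reserved 1055 MB
call :tryReserve 1 5378 || echo 5378 MB must wait
rem When an instance ends, give its share back and try again.
call :release 1 5407
call :tryReserve 1 5378 && echo reserved 5378 MB
echo in use: %usedThreads% threads, %usedMem% MB
exit /b

:tryReserve threads memMB
rem Returns errorlevel 0 only if both budgets still fit after adding the job.
set /a "t=usedThreads+%~1, m=usedMem+%~2"
if %t% gtr %maxThreads% exit /b 1
if %m% gtr %maxMem% exit /b 1
set /a "usedThreads=t, usedMem=m"
exit /b 0

:release threads memMB
rem Subtracts a finished job's threads and memory from the running totals.
set /a "usedThreads-=%~1, usedMem-=%~2"
exit /b 0
```

Run as written, the third request is refused because 5407 + 1055 + 5378 MB would exceed 8192 MB, and it succeeds once the first job's share has been released.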
Edit: Sorry, I need to add details; I'm new to this posting style. Following this answer - https://stackoverflow.com/a/166148/2127- I mentioned that 8 batches were created, and one of them is the 7z.PPMd batch. Perhaps listing all the batches and how 7z deals with their parameters will give a better understanding of the whole problem. I'll start with the simple ones:
- 7z.PPMd - 1 fully used thread and dictionary-dependent memory usage of 32m-1055m for each instance.
- 7z.BZip2 - 8 fully used threads and a fixed memory usage of 109m for each instance.
- zip.Bzip2 - 8 partially used threads and a fixed memory usage of 336m for each instance.
- zip.Deflate - 8 partially used threads and a fixed memory usage of 260m for each instance.
- zip.PPMd - 8 partially used threads and dictionary-dependent memory usage of 280m-2320m for each instance.
What I mean by partially used threads is that although I assign 8 threads to each instance of 7z.exe, the algorithm's actual CPU usage varies unpredictably and is outside my control; the only restriction is that no more than 8 threads are used. In the case of 8 fully used threads, each instance uses 100% of my 8-core processor.
The most difficult ones - 7z.LZMA, 7z.LZMA2, zip.LZMA - need to be explained in detail, but I only have time for a quick note now. I will come back to edit in the LZMA part when I have more free time.
Thanks again.
EDIT: adding to the LZMA part.
7z.LZMA - each instance is n-threaded, with n from 1 to 2:
- 1 fully used thread, dictionary dependent, from 64k to 512m:
    - 64k dictionary uses 32m memory
    - ...
    - 512m dictionary uses 5407m memory
    - excluded range: 768m to 1024m (above the 8192m limit of available memory)
- 2 partially used threads, dictionary dependent, from 64k to 512m:
    - 64k dictionary uses 38m memory
    - ...
    - 512m dictionary uses 5413m memory
    - excluded range: 768m to 1024m (above the 8192m limit of available memory)
7z.LZMA2 - each instance is n-threaded, with n from 1 to 8:
- 1 fully used thread, dictionary dependent, from 64k to 512m:
    - 64k dictionary uses 32m memory
    - ...
    - 512m dictionary uses 5407m memory
    - excluded range: 768m to 1024m (above the 8192m limit of available memory)
- 2 or 3 partially used threads, dictionary dependent, from 64k to 512m:
    - 64k dictionary uses 38m memory
    - ...
    - 512m dictionary uses 5413m memory
    - excluded range: 768m to 1024m (above the 8192m limit of available memory)
- 4 or 5 partially used threads, dictionary dependent, from 64k to 256m:
    - 64k dictionary uses 51m memory
    - ...
    - 256m dictionary uses 5677m memory
    - excluded range: 384m to 1024m (above the 8192m limit of available memory)
- 6 or 7 partially used threads, dictionary dependent, from 64k to 192m:
    - 64k dictionary uses 62m memory
    - ...
    - 192m dictionary uses 6965m memory
    - excluded range: 256m to 1024m (above the 8192m limit of available memory)
- 8 partially used threads, dictionary dependent, from 64k to 128m:
    - 64k dictionary uses 72m memory
    - ...
    - 128m dictionary uses 6717m memory
    - excluded range: 192m to 1024m (above the 8192m limit of available memory)
zip.LZMA - each instance is n-threaded, with n from 1 to 8:
- 1 fully used thread, dictionary dependent, from 64k to 512m:
    - 64k dictionary uses 3m memory
    - ...
    - 512m dictionary uses 5378m memory
    - excluded range: 768m to 1024m (above the 8192m limit of available memory)
- 2 or 3 partially used threads, dictionary dependent, from 64k to 512m:
    - 64k dictionary uses 9m memory
    - ...
    - 512m dictionary uses 5384m memory
    - excluded range: 768m to 1024m (above the 8192m limit of available memory)
- 4 or 5 partially used threads, dictionary dependent, from 64k to 256m:
    - 64k dictionary uses 82m memory
    - ...
    - 256m dictionary uses 5456m memory
    - excluded range: 384m to 1024m (above the 8192m limit of available memory)
- 6 or 7 partially used threads, dictionary dependent, from 64k to 256m:
    - 64k dictionary uses 123m memory
    - ...
    - 256m dictionary uses 8184m memory (very close to the limit, so I may treat it as excluded)
    - excluded range: 384m to 1024m (above the 8192m limit of available memory)
- 8 partially used threads, dictionary dependent, from 64k to 128m:
    - 64k dictionary uses 164m memory
    - ...
    - 128m dictionary uses 5536m memory
    - excluded range: 192m to 1024m (above the 8192m limit of available memory)
I am trying to understand the behavior of the commands that have nul in them. I don’t quite understand what happens in this part, or what characters such as ^> , ^&1 and "" mean.

        2>nul del %lock%!nextProc!
        %= Redirect the lock handle to the lock file. The CMD process will     =%
        %= maintain an exclusive lock on the lock file until the process ends. =%
        start /b "" cmd /c %lockHandle%^>"%lock%!nextProc!" 2^>^&1 !cpu%%N! !cmd!
      )
      set "launch="

Then later, in the :wait code:

        ) 9>>"%lock%%%N"
      ) 2>nul
      if %endCount% lss %startCount% (
        1>nul 2>nul ping /n 2 ::1
        goto :wait
      )
    2>nul del %lock%*

EDIT 2 (29-10-2013): adding the current state of things.
After some trial and error, complemented by step-by-step notes of what was happening, I was able to understand the behavior above. I simplified the line with the start command to:

    start /b /low cmd /c !cmd!>"%lock%!nextProc!"

Although it works, I still do not understand the meaning of 1^>"filename" 2^>^&1 'command' . I know it has to do with writing to the named file the text that would otherwise have been shown to me. In this case, all of the 7z.exe output is written to the file instead of being displayed. Until the 7z.exe instance finishes its work, nothing is written to the file: the file already exists, yet at the same time it behaves as if it does not. When 7z.exe actually ends, the file is completed, and from then on it exists for the next part of the script.
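To make the redirection syntax concrete, here is a small sketch that uses `echo` in place of 7z.exe; the file names `out.txt` and `child.txt` are made up for the example. As far as I understand it, `1>"file"` sends standard output to the file, `2>&1` sends standard error to wherever standard output currently points, and a caret in front of `>` or `&` stops the current batch from applying the redirection itself, so that it is passed on as plain text to the child `cmd /c`.

```batch
@echo off
rem 1>"out.txt" captures stdout; 2>&1 then points stderr at the same file.
(
  echo this line goes to stdout
  echo this line goes to stderr 1>&2
) 1>"out.txt" 2>&1
type "out.txt"

rem With carets, this batch does NOT perform the redirection itself.
rem The carets are stripped and the child cmd receives:
rem     echo hello >"child.txt" 2>&1
cmd /c echo hello ^>"child.txt" 2^>^&1
type "child.txt"
```

In the script quoted earlier, `start /b "" cmd /c ...` hands the escaped redirection to the child in the same way. According to the comment in that script, the child then holds an exclusive lock on the lock file until it ends, which would explain why the `9>>"%lock%%%N"` test in the `:wait` code only succeeds after the process has finished.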
Now that I understand how the proposed script works, I am extending it with something of my own: I am trying to merge all the batches into a single “one batch that does everything” script. In simplified form, this is:

    echo 8 threads - maxproc=1
    for %%x IN (9) DO for %%t IN (8) DO for %%d IN (900k) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.bzip2.%%tt.%%dd.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=BZip2:d=%%d:mt=%%t
    for %%x IN (9) DO for %%t IN (8) DO for %%d IN (900k) DO 7z.exe a teste.resultado\%%xx.bzip2.%%tt.%%dd.zip .\teste.original\* -mx=%%x -mm=BZip2:d=%%d -mmt=%%t
    for %%x IN (9) DO for %%t IN (8) DO for %%w IN (257 256 192 128 96 64 48 32 24 16 12 8) DO 7z.exe a teste.resultado\%%xx.deflate64.%%tt.%%ww.zip .\teste.original\* -mx=%%x -mm=deflate64:fb=%%w -mmt=%%t
    for %%x IN (9) DO for %%t IN (8) DO for %%w IN (258 256 192 128 96 64 48 32 24 16 12 8) DO 7z.exe a teste.resultado\%%xx.deflate.%%tt.%%ww.zip .\teste.original\* -mx=%%x -mm=deflate:fb=%%w -mmt=%%t
    for %%x IN (9) DO for %%t IN (8) DO for %%d IN (256m 128m 64m 32m 16m 8m 4m 2m 1m) DO for %%w IN (16 15 14 13 12 11 10 9 8 7 6 5 4 3 2) DO 7z.exe a teste.resultado\%%xx.ppmd.%%tt.%%dd.%%ww.zip .\teste.original\* -mx=%%x -mm=PPMd:mem=%%d:o=%%w -mmt=%%t
    echo 4 threads - maxproc=2
    for %%x IN (9) DO for %%t IN (4) DO for %%d IN (256m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.lzma2.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=lzma2:d=%%d:fb=%%w -mmt=%%t
    echo 2 threads - maxproc=4
    for %%x IN (9) DO for %%t IN (2) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.lzma.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=LZMA:d=%%d:fb=%%w -mmt=%%t
    for %%x IN (9) DO for %%t IN (2) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.lzma2.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=lzma2:d=%%d:fb=%%w -mmt=%%t
    for %%x IN (9) DO for %%t IN (2) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO 7z.exe a teste.resultado\%%xx.lzma.%%tt.%%dd.%%ww.zip .\teste.original\* -mx=%%x -mm=lzma:d=%%d:fb=%%w -mmt=%%t
    echo 1 threads - maxproc=8
    for %%x IN (9) DO for %%t IN (1) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.lzma.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=LZMA:d=%%d:fb=%%w -mmt=%%t
    for %%x IN (9) DO for %%t IN (1) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.lzma2.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=lzma2:d=%%d:fb=%%w -mmt=%%t
    for %%x IN (9) DO for %%d IN (1024m 768m 512m 384m 256m 192m 128m 96m 64m 48m 32m 24m 16m 12m 8m 6m 4m 3m 2m 1m) DO for %%w IN (32 28 24 20 16 14 12 10 8 7 6 5 4 3 2) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.ppmd.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -m0=PPMd:mem=%%d:o=%%w -ms=%%s
    for %%x IN (9) DO for %%t IN (1) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO 7z.exe a teste.resultado\%%xx.lzma.%%tt.%%dd.%%ww.zip .\teste.original\* -mx=%%x -mm=lzma:d=%%d:fb=%%w -mmt=%%t

In short, I want to handle all this in the most efficient way. Determining how many processes can run simultaneously is one approach, but the memory needed by each process must also be taken into account, so that the sum of the memory required by all running processes does not exceed 8192 MB. This is what I have working so far:

    @echo off
    setlocal enableDelayedExpansion
    set "maxMem=8192"
    set "maxThreads=8"

    :cycle1
    set "cycleCount=4"
    set "cycleThreads=1"
    set "maxProc="
    set /a "maxProc=maxThreads/cycleThreads"
    set "cycleFor1=for %%x IN (9) DO for %%t IN (1) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO ("
    set "cycleFor2=for %%x IN (9) DO for %%t IN (1) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO ("
    set "cycleFor3=for %%x IN (9) DO for %%d IN (1024m 768m 512m 384m 256m 192m 128m 96m 64m 48m 32m 24m 16m 12m 8m 6m 4m 3m 2m 1m) DO for %%w IN (32 28 24 20 16 14 12 10 8 7 6 5 4 3 2) DO for %%s IN (on) DO ("
    set "cycleFor4=for %%x IN (9) DO for %%t IN (1) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO ("
    set "cycleCmd1=7z.exe a teste.resultado\%%xx.lzma.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=LZMA:d=%%d:fb=%%w -mmt=%%t"
    set "cycleCmd2=7z.exe a teste.resultado\%%xx.lzma2.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=lzma2:d=%%d:fb=%%w -mmt=%%t"
    set "cycleCmd3=7z.exe a teste.resultado\%%xx.ppmd.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -m0=PPMd:mem=%%d:o=%%w -ms=%%s"
    set "cycleCmd4=7z.exe a teste.resultado\%%xx.lzma.%%tt.%%dd.%%ww.zip .\teste.original\* -mx=%%x -mm=lzma:d=%%d:fb=%%w -mmt=%%t"
    set "tempMem1=5407"
    set "tempMem2=5407"
    set "tempMem3=1055"
    set "tempMem4=5378"
    rem set "tempMem1=5407"
    rem set "tempMem2=5407"
    rem set "tempMem3=1055 799 543 415 287 223 159 127 95 79 63 55 47 43 39 37 35 34 33 32"
    rem set "tempMem4=5378"
    set "memSum=0"
    if not defined memRem set "memRem=!maxMem!"
    for /l %%N in (1 1 %cycleCount%) DO (set "tempProc%%N=")
    for /l %%N in (1 1 %cycleCount%) DO (
      set memRem
      set /a "tempProc%%N=%memRem%/tempMem%%N"
      set /a "memSum+=tempMem%%N"
      set /a "memRem-=tempMem%%N"
      set /a "maxProc=!tempProc%%N!"
      call :executeCycle
      set /a "memRem+=tempMem%%N"
      set /a "memSum-=tempMem%%N"
      set /a "maxProc-=!tempProc%%N!"
    )
    goto :fim

    :executeCycle
    set "lock=lock_%random%_"
    set /a "startCount=0, endCount=0"
    for /l %%N in (1 1 %maxProc%) DO set "endProc%%N="
    set launch=1
    for %%x IN (9) DO for %%t IN (1) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO (
      set "cmd=7z.exe a teste.resultado\%%xx.lzma.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=LZMA:d=%%d:fb=%%w -mmt=%%t"
      if !startCount! lss %maxProc% (
        set /a "startCount+=1, nextProc=startCount"
      ) else (
        call :wait
      )
      set cmd!nextProc!=!cmd!
      echo !time! - proc!nextProc!: starting !cmd!
      2>nul del %lock%!nextProc!
      start /b /low cmd /c !cmd!>"%lock%!nextProc!"
    )
    set "launch="

    :wait
    for /l %%N in (1 1 %startCount%) do (
      if not defined endProc%%N if exist "%lock%%%N" (
        echo !time! - proc%%N: finished !cmd%%N!
        if defined launch (
          set nextProc=%%N
          exit /b
        )
        set /a "endCount+=1, endProc%%N=1"
      ) 9>>"%lock%%%N"
    ) 2>nul
    if %endCount% lss %startCount% (
      1>nul 2>nul ping /n 2 ::1
      goto :wait
    )
    2>nul del %lock%*
    echo ===
    echo Thats all folks!
    exit /b

    :fim
    pause

I am having problems with cycleFor1 and cycleCmd1 in the :cycle1 section. They should replace the for line and the first cmd variable inside :executeCycle so that it works the way I expect. How can I do this?
Another question I have is about tempMem3. I recorded all the memory needed when running cycleCmd3; it depends on the dictionary. tempMem3 and cycleCmd3 are related as follows:

    for %%d IN (1024m 768m 512m 384m 256m 192m 128m 96m 64m 48m 32m 24m 16m 12m 8m 6m 4m 3m 2m 1m) DO
    set "tempMem3=1055 799 543 415 287 223 159 127 95 79 63 55 47 43 39 37 35 34 33 32"

So a 1024m dictionary will use 1055 MB, 768m will use 799 MB, and so on, down to 1m using 32 MB. I don't know how to translate this into a script.
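One possible way to express that pairing, sketched under the assumption that both lists are kept in the same order: walk the two lists in step and create one lookup variable per dictionary size. The names `dicts`, `mems`, `rest` and the `mem_` prefix are hypothetical and not part of the script above.

```batch
@echo off
setlocal enableDelayedExpansion
rem The two lists from the question, in matching order.
set "dicts=1024m 768m 512m 384m 256m 192m 128m 96m 64m 48m 32m 24m 16m 12m 8m 6m 4m 3m 2m 1m"
set "mems=1055 799 543 415 287 223 159 127 95 79 63 55 47 43 39 37 35 34 33 32"

rem For each dictionary, take the first remaining memory value (%%a)
rem and keep the rest of the list (%%b) for the next iteration.
set "rest=%mems%"
for %%d in (%dicts%) do (
  for /f "tokens=1*" %%a in ("!rest!") do (
    set "mem_%%d=%%a"
    set "rest=%%b"
  )
)

rem Look up the memory need for a few dictionary sizes.
for %%d in (1024m 48m 1m) do echo %%d needs !mem_%%d! MB
```

Inside a loop over `%%d`, the value could then be read with `!mem_%%d!` wherever a per-dictionary memory figure is needed. In the figures listed, every value also happens to equal the dictionary size in MB plus 31, but the lookup does not rely on that.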
Any help is appreciated.