ImageMagick: how to achieve low memory usage when resizing a large number of image files?

I would like to resize a large number (about 5200) of image files (PPM format, each 5 MB in size) and save them in PNG format using convert.

Short version:

convert blows through 24 GB of memory, even though I use a syntax that should make convert process the image files sequentially.

Long version:

With more than 25 GB of image data, I figured I should not have convert process all the files at once. I searched the ImageMagick documentation for how to process image files sequentially and found:

It is faster and less resource intensive to resize each image in the following way:

$ convert '*.jpg[120x120]' thumbnail%03d.png

In addition, the manual states:

For example, instead of ...

montage '*.tiff' -geometry 100x100+5+5 -frame 4 index.jpg

which first reads in all the tiff files and then resizes them, you can instead ...

montage '*.tiff[100x100]' -geometry 100x100+5+5 -frame 4 index.jpg

This reads each image in, and resizes it, before proceeding to the next image. As a result, memory usage is significantly lower and, possibly, disk swapping (thrashing) is avoided when memory limits are reached.

Therefore, this is what I am doing:

 $ convert '*.ppm[1280x1280]' pngs/%05d.png 

According to the documentation, this should process each image file one after another: read, resize, write. I am doing this on a machine with 12 real cores and 24 GB of RAM. However, within the first two minutes the memory usage of the convert process grows to about 96%. It stays there. CPU utilization is at its maximum. A bit later the process dies, simply saying:

Killed

No output files had been created by that point. I'm on Ubuntu 10.04 and convert --version says:

 Version: ImageMagick 6.5.7-8 2012-08-17 Q16 http://www.imagemagick.org
 Copyright: Copyright (C) 1999-2009 ImageMagick Studio LLC
 Features: OpenMP

It looks like convert tries to read all the data before starting the conversion. So either there is a bug in convert, a problem with the documentation, or I am not reading the documentation properly.

What's wrong? How can I achieve low memory usage when resizing this large number of image files?

BTW: A quick workaround would be to simply loop over the files in the shell and invoke convert for each file separately. But I would like to understand how to achieve this with pure ImageMagick.
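For reference, that shell-loop workaround might look roughly like this (a minimal sketch; the -resize geometry and zero-padded output names just mirror the convert command above):

 i=0
 for f in *.ppm; do
     convert "$f" -resize 1280x1280 "pngs/$(printf '%05d' "$i").png"
     ((i++))
 done

Each convert invocation then only ever holds a single image in memory.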

Thanks!

linux image image-processing imagemagick imagemagick-convert
3 answers

Without direct access to your system, it is very difficult to help you debug this.

But you can do three things to help narrow down this problem:

  • Add -monitor as the first command line argument to learn more about what is happening.

  • (optional) add -debug all -log "domain: %d +++ event: %e +++ function: %f +++ line: %l +++ module: %m +++ processID: %p +++ realCPUtime: %r +++ wallclocktime: %t +++ userCPUtime: %u \n\r"

  • Temporarily, don't use '*.ppm[1280x1280]' as the argument; use 'a*.ppm[1280x1280]' instead (or some other suitable prefix), so that the wildcard expands to only a few matches instead of all possible matches.

If you do "2.", you will need to do "3." otherwise you will be stunned by the mass of exit. (Also your system does not seem to be able to process the complete template anyway without killing the process ...)

If you cannot find a solution yourself, then ...

  • ... register your username at the official ImageMagick bug report forum.
  • ... report your problem there and see if they can help you (these guys are quite friendly and responsive if you ask politely).

I ran into the same problem. It seems to occur because ImageMagick creates temporary files in the /tmp directory, which is often mounted as tmpfs (i.e. backed by RAM).

Just move your tmp to another location.

For example:

  • create the "tmp" directory on a large external drive

    mkdir -m777 /media/huge_device/tmp

  • make sure permissions are set to 777

    chmod 777 /media/huge_device/tmp

  • as root, bind-mount it in place of your /tmp

    mount -o bind /media/huge_device/tmp /tmp

Note: you can use the TMP environment variable to achieve the same trick.
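As a sketch of that environment-variable variant (note: current ImageMagick documentation names MAGICK_TMPDIR for this purpose; whether this old 6.5.7 build honors it, or only TMP, is an assumption you would need to verify):

 $ MAGICK_TMPDIR=/media/huge_device/tmp convert '*.ppm[1280x1280]' pngs/%05d.png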


Since you have 12 cores, I would go with GNU Parallel, using something like the following, which works very well. Because it processes only 12 images at a time, while still preserving the numbering of your output files, it uses only minimal RAM.

 scene=0
 for f in *.ppm; do
    echo "$f" $scene
    ((scene++))
 done | parallel -j 12 --colsep ' ' --eta convert {1}[1280x1280] -scene {2} pngs/%05d.png

Notes

-scene lets you set the scene counter, which shows up in your %05d part.

--eta predicts when your work will be completed (estimated time of arrival).

-j 12 runs 12 jobs at the same time.
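If sequential numbering isn't essential and keeping the original file names is acceptable, a simpler variant (assuming GNU Parallel's {/.} replacement string, which expands to the input file's basename without its extension) avoids the scene counter entirely:

 $ parallel -j 12 --eta convert {}[1280x1280] pngs/{/.}.png ::: *.ppm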

