Perl Thread Collection

This question is mostly a matter of curiosity, since one of the two programs below works.

I use Image::Magick to resize photos. To save a little time, I process each photo in its own thread and use a semaphore to limit the number of threads working simultaneously. Initially I let every thread start immediately, but the script quickly allocated 3.5 GB for all the photos (I only have 2 GB), and it ran about 5 times slower than usual because of all the swapping to disk.

The semaphore working code looks something like this:

    use threads;
    use Thread::Semaphore;
    use Image::Magick;

    my $s = Thread::Semaphore->new(4);

    foreach ( @photos ) {
        threads->create( \&launch_thread, $s );
    }

    foreach my $thr ( reverse threads->list() ) {
        $thr->join();
    }

    sub launch_thread {
        my $s = shift;
        $s->down();
        my $image = Image::Magick->new();
        # do memory-heavy work here
        $s->up();
    }

This quickly allocates about 500 MB and runs well without ever needing more. (The threads are joined in reverse order to make a point.)

I wondered if there could be overhead from running 80 threads at the same time and blocking most of them, so I changed my script to block the main thread:

    my $s = Thread::Semaphore->new(4);

    foreach ( @photos ) {
        $s->down();
        threads->create( \&launch_thread, $s );
    }

    foreach my $thr ( threads->list() ) {
        $thr->join();
    }

    sub launch_thread {
        my $s = shift;
        my $image = Image::Magick->new();
        # do memory-heavy work here
        $s->up();
    }

This version starts out fine, but gradually accumulates the same 3.5 GB that the original version used. It is faster than starting all the threads at once, but still considerably slower than blocking inside the threads.

My first guess was that the memory used by a thread is not freed until join() is called on it, and since it is the main thread that blocks here, no thread is freed until all of them have been created. However, in the first, working version the threads pass the semaphore in a more or less random order but are joined in reverse order. If my guess were correct, then many more than the four running threads would be waiting to be join()ed at any given time, and that version should be slow as well.
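One way to test that guess would be to have the main thread of the second version join finished threads as it goes, rather than only at the end. The following is a minimal sketch under stated assumptions, not a confirmed fix: `resize_photo` and `run_limited` are hypothetical names, and the Image::Magick work is replaced by a trivial placeholder so the script runs on its own. It relies on `threads->list(threads::joinable)`, which returns the threads that have finished but have not yet been joined.

```perl
use strict;
use warnings;
use threads;
use Thread::Semaphore;

# Hypothetical stand-in for the memory-heavy Image::Magick work.
sub resize_photo {
    my ($photo) = @_;
    return length $photo;
}

sub launch_thread {
    my ($s, $photo) = @_;
    resize_photo($photo);
    $s->up();    # let the main thread start another worker
}

# Start one thread per photo, at most $limit at a time, and join
# finished threads inside the loop instead of all at the end.
# Returns the number of threads that were joined.
sub run_limited {
    my ($limit, @photos) = @_;
    my $s      = Thread::Semaphore->new($limit);
    my $joined = 0;

    for my $photo (@photos) {
        $s->down();    # main thread blocks while $limit workers are busy
        threads->create(\&launch_thread, $s, $photo);

        # Reap whatever has already finished.
        for my $thr (threads->list(threads::joinable)) {
            $thr->join();
            $joined++;
        }
    }

    # Join the workers that were still running when the loop ended.
    for my $thr (threads->list()) {
        $thr->join();
        $joined++;
    }
    return $joined;
}

my $count = run_limited(4, map { "photo$_.jpg" } 1 .. 20);
print "$count threads joined\n";
```

If memory stays flat with this change, that would support the idea that unjoined threads hold on to their memory; if it still climbs, the cause lies elsewhere.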

So why are these two versions different from each other?

1 answer

You do not need to create more than 4 threads. One major benefit is that it means 76 fewer copies of the Perl interpreter. It also makes the reaping order rather moot, since all the threads finish at more or less the same time.

    use threads;
    use Thread::Queue qw( );
    use Image::Magick qw( );

    use constant NUM_WORKERS => 4;

    sub process {
        my ($photo) = @_;
        ...
    }

    {
        my $request_q = Thread::Queue->new();

        # Start a fixed pool of workers that pull photos off the queue.
        my @threads;
        for (1..NUM_WORKERS) {
            push @threads, async {
                # An undef item tells the worker to stop.
                while (my $photo = $request_q->dequeue()) {
                    process($photo);
                }
            };
        }

        $request_q->enqueue($_) for @photos;

        # One undef per worker so that every worker exits its loop.
        $request_q->enqueue(undef) for 1..NUM_WORKERS;

        $_->join() for @threads;
    }
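For reference, here is a self-contained variant of the same worker-pool idea that can be run without Image::Magick. It is only a sketch under a few assumptions: `process` is a trivial placeholder, `resize_all` and the second queue `$result_q` are names introduced here for illustration, and it assumes a version of Thread::Queue that provides the `end()` method, which is used in place of the `undef` sentinels.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

use constant NUM_WORKERS => 4;

# Hypothetical stand-in for the Image::Magick resizing step.
sub process {
    my ($photo) = @_;
    return "resized $photo";
}

# Process all photos with a fixed pool of workers and return the results.
# The order of the results is not guaranteed.
sub resize_all {
    my (@photos) = @_;
    my $request_q = Thread::Queue->new();
    my $result_q  = Thread::Queue->new();

    my @workers = map {
        threads->create(sub {
            # dequeue() returns undef once the queue is ended and empty.
            while (defined(my $photo = $request_q->dequeue())) {
                $result_q->enqueue(process($photo));
            }
        });
    } 1 .. NUM_WORKERS;

    $request_q->enqueue(@photos);
    $request_q->end();    # no more work will be added
    $_->join() for @workers;

    # All workers are done, so drain the results without blocking.
    my @results;
    while (defined(my $r = $result_q->dequeue_nb())) {
        push @results, $r;
    }
    return @results;
}

my @done = resize_all(map { "photo$_.jpg" } 1 .. 20);
print scalar(@done), " photos processed by ", NUM_WORKERS, " workers\n";
```

Testing with `defined` also means a photo name such as "0" would not be mistaken for the end of the queue.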