How to sort an array of hash links by one of the values of the hash function?

Question

How to sort an array of hash links by one of the values of the hash function?

First, please forgive my rusty Perl. I am trying to modify Bugzilla "whine.pl" to generate error lists sorted by severity.

So it gives me an array of hash links. Each hash contains a bunch of information about a specific error (id, assignee, severity, etc.). I want to sort the array by severity. What is the best way to do this?

I came up with a couple of possibilities. One of them is to create five arrays (one for each severity level), then loop over the array and route the hash links to the corresponding severity array. After that, I can collect them and replace the original array with a sorted one.

Another way my friend came up with would be assigning severity levels (stored as text in hashes) to some nubmers and cmp them. Maybe something like this?

sub getVal { my $entry = $_[0]; %lookup = ( "critical" => 0, ... ); return $lookup(entry("bug_severity")); } @sorted = sort { getVal($a) <=> getVal($b) } @unsorted;

+6

sorting perl bugzilla

Allan anderson Oct 28 '09 at 10:05

source share

4 answers

To avoid getting getVal longer than necessary, you can use "decorate, sort, decompose". Decorate gets the information you really care about sorting:

 my @decorated = map { [ $_, getVal($_) ] } @unsorted;

Then sort the decorated list:

 my @sortedDecorate = sort { $a->[1] <=> $b->[1] } @decorated;

Then put the original information back (undecorate):

 my @sorted = map { $_->[0] } @sortedDecorate;

Or a more Perl-ish way to do this:

 @sorted = map { $_->[0] } sort { $a->[1] <=> $b->[1] } map { [ $_, getVal($_) ] } @unsorted;

+6

jamessan Oct 28 '09 at 10:13

source share

You can use Schwartzian Transform :

 my @sorted = map { $_->[1] } sort { $a->[0] <=> $b->[0] } map { [ $lookup{$_->{bug_severity}, $_ ] } @unsorted;

Explanation:

 map { [ $lookup{$_->{bug_severity}, $_ ] } @unsorted;

maps each error to an array reference, whose first element is a numerical error in the lookup table. Using the Shwartzian Transform, you look at the value only once for each error in @unsorted .

Then

 sort { $a->[0] <=> $b->[0] }

sorts this array by first element. Finally,

 @sorted = map { $_->[1] }

extracts the original errors from the array returned by sort .

There is no need for getval when all it does is hash lookup.

To automatically create efficient sorters, the CPAN Sort :: Maker module is excellent:

 use strict; use warnings; use Sort::Maker; my @bugs = ( { name => 'bar', bug_severity => 'severe' }, { name => 'baz', bug_severity => 'noncritical' }, { name => 'foo', bug_severity => 'critical' }, ); my $sorter = make_sorter('ST', name => 'severity_sorter', init_code => 'my %lookup = ( critical => 0, severe => 1, noncritical => -1 );', number => [ code => '$lookup{$_->{bug_severity}}' ], ); use Data::Dumper; print Dumper $_ for severity_sorter( @bugs );

Output:

  $ VAR1 = {
           'name' => 'baz',
           'bug_severity' => 'noncritical'
         };
 $ VAR1 = {
           'name' => 'foo',
           'bug_severity' => 'critical'
         };
 $ VAR1 = {
           'name' => 'bar',
           'bug_severity' => 'severe'
         };

Please note that the number of queries that must be completed when using the naive method depends on the number of elements in @unsorted . We can count them with a simple program:

 #!/usr/bin/perl use strict; use warnings; my ($n_elements) = @ARGV; my @keys = qw(abc); my %lookup = map { $keys[$_-1] => $_ } 1 .. @keys; my @unsorted = map { $keys[rand 3] } 1 .. $n_elements; my $n_lookups; my @sorted = sort { $n_lookups += 2; $lookup{$a} <=> $lookup{$b} } @unsorted; print "It took $n_lookups lookups to sort $n_elements elements\n";

Output:

  C: \ Temp> tzt 10
 It took 38 lookups to sort 10 elements

 C: \ Temp> tzt 100
 It took 978 lookups to sort 100 elements

 C: \ Temp> tzt 1000
 It took 10916 lookups to sort 1000 elements

 C: \ Temp> tzt 10000
 It took 113000 lookups to sort 10000 elements

Therefore, more information would be needed to decide whether naive sorting or the use of the Schwartz transform is a suitable solution.

And here is a simple test that seems to agree with the @Ether argument:

 #!/usr/bin/perl use strict; use warnings; use Benchmark qw( cmpthese ); my ($n_elements) = @ARGV; my @keys = qw(foo bar baz); my %lookup = map { $keys[$_] => $_ } 0 .. $#keys; my @unsorted = map { {v => $keys[rand 3]} } 1 .. $n_elements; cmpthese(-1, { naive => sub { my @sorted = sort { $lookup{$a->{v}} <=> $lookup{$b->{v}} } @unsorted; }, schwartzian => sub { my @sorted = map { $_->[1] } sort { $a->[0] <=> $b->[0] } map { [$lookup{$_->{v}}, $_] } @unsorted; } });

Output:

  C: \ Temp> tzt 10
                Rate schwartzian naive
 schwartzian 18842 / s - -29%
 naive 26357 / s 40% -

 C: \ Temp> tzt 100
               Rate naive schwartzian
 naive 1365 / s - -11%
 schwartzian 1532 / s 12% -

 C: \ Temp> tzt 1000
              Rate naive schwartzian
 naive 121 / s - -11%
 schwartzian 135 / s 12% -

+4

Sinan Ünür Oct 28 '09 at 10:15

source share

You can use the lookup table to determine the severity of bugzilla, for example (using sample data to illustrate):

 use strict; use warnings; use Data::Dumper; my @bugInfo = ( { id => 1, assignee => 'Bob', severity => 'HIGH' }, { id => 2, assignee => 'Anna', severity => 'LOW' }, { id => 3, assignee => 'Carl', severity => 'EXTREME' }, ); my %severity_ordering = ( EXTREME => 0, HIGH => 1, MEDIUM => 2, LOW => 3, ); sub byseverity { $severity_ordering{$a->{severity}} <=> $severity_ordering{$b->{severity}} } my @sortedBugs = sort byseverity @bugInfo; print Dumper(\@sortedBugs);

gives:

 $VAR1 = [ { 'assignee' => 'Carl', 'id' => 3, 'severity' => 'EXTREME' }, { 'assignee' => 'Bob', 'id' => 1, 'severity' => 'HIGH' }, { 'assignee' => 'Anna', 'id' => 2, 'severity' => 'LOW' } ];

0

Ether Oct 28 '09 at 10:25

source share

tster · Accepted Answer · 2009-10-28T22:09:57+0000

I like your proposed solution:

 my %sevs = (critical => 0, high => 1, ...); my @sorted = sort { $sevs{$a->{bug_severity}} <=> $sevs{$b->{bug_severity}} } @unsorted

How to sort an array of hash links by one of the values ​​of the hash function?

More articles:

How to sort an array of hash links by one of the values of the hash function?