How can I cleanly flatten a nested Perl hash into a non-nested one?

Suppose the nested hash structure is %old_hash ..

    my %old_hash;
    $old_hash{"foo"}{"bar"}{"zonk"} = "hello";

.. which we want to "flatten" (sorry if this is the wrong terminology!) into a non-nested hash using a sub &flatten(...), so that ..

    my %h = &flatten(\%old_hash);
    die unless($h{"zonk"} eq "hello");

The following definition of &flatten(...) does the trick:

    sub flatten {
        my $hashref = shift;
        my %hash;
        my %i = %{$hashref};
        foreach my $ii (keys(%i)) {
            my %j = %{$i{$ii}};
            foreach my $jj (keys(%j)) {
                my %k = %{$j{$jj}};
                foreach my $kk (keys(%k)) {
                    my $value = $k{$kk};
                    $hash{$kk} = $value;
                }
            }
        }
        return %hash;
    }

While this code works, it is not very readable or clean.

My question is twofold:

  • In what ways does this code fall short of modern Perl best practices? Be harsh! :-)
  • How would you clean it up?
+7
4 answers

Your method is not the best because it does not scale. What if the nested hash is six or ten levels deep? The repetition should tell you that a recursive routine is probably what you need.

    sub flatten {
        my ($in, $out) = @_;
        for my $key (keys %$in) {
            my $value = $in->{$key};
            if ( defined $value && ref $value eq 'HASH' ) {
                flatten($value, $out);
            }
            else {
                $out->{$key} = $value;
            }
        }
    }
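Note that this version fills a hash supplied by the caller rather than returning one. A minimal usage sketch, assuming the recursive flatten above and the sample data from the question (%flat is just an illustrative name, and the bare return is added here only to make the missing return value explicit):

```perl
use strict;
use warnings;

# Same recursive flatten as above: copies every leaf of %$in into %$out.
sub flatten {
    my ($in, $out) = @_;
    for my $key (keys %$in) {
        my $value = $in->{$key};
        if (defined $value && ref $value eq 'HASH') {
            flatten($value, $out);    # descend into the nested hash
        }
        else {
            $out->{$key} = $value;    # leaf: store under its own key
        }
    }
    return;
}

my %old_hash;
$old_hash{foo}{bar}{zonk} = 'hello';

my %flat;                             # the caller owns the output hash
flatten(\%old_hash, \%flat);

print "$flat{zonk}\n";                # prints "hello"
```

Because the output hash is passed in, the same sub works for any nesting depth without building intermediate copies.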

Alternatively, good modern Perl style is to use CPAN wherever possible. Data::Traverse would do what you need:

    use Data::Traverse;

    sub flatten {
        my %hash = @_;
        my %flattened;
        traverse { $flattened{$a} = $b } \%hash;
        return %flattened;
    }

As a final note, it is usually more efficient to pass hashes by reference, which avoids expanding them into lists and then turning those lists back into hashes.
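A minimal sketch of that advice, assuming a plain recursive walk rather than Data::Traverse. The name flatten_ref is hypothetical; it takes a hash reference and returns a reference to a new flat hash:

```perl
use strict;
use warnings;

# flatten_ref is a hypothetical name. It receives a hash reference and
# returns a hash reference, so no hash is expanded into a list on the
# way in or out. The second argument is only used by the recursion.
sub flatten_ref {
    my ($in, $out) = @_;
    $out //= {};                      # first call: start a fresh result
    for my $key (keys %$in) {
        my $value = $in->{$key};
        if (ref $value eq 'HASH') {
            flatten_ref($value, $out);
        }
        else {
            $out->{$key} = $value;
        }
    }
    return $out;
}

my $flat = flatten_ref({ foo => { bar => { zonk => 'hello' } } });
print $flat->{zonk}, "\n";            # prints "hello"
```

The `//=` operator requires Perl 5.10 or later.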

+10

First, I would use perl -c to check that it compiles cleanly, but it does not. So I would add a final } to make it compile.

Then I would run it through perltidy to improve the code layout (indentation, etc.).

Then I would run perlcritic (in "harsh" mode) to have it tell me automatically what it considers bad practice. It complains that:

Subroutine does not end with "return"

Update: The OP essentially changed every line of code after I posted my answer above, but I believe it still applies. It is not easy shooting at a moving target :)

+3

There are a few issues with your approach that you need to work out. First, what happens if there are two leaf nodes with the same key? Does the second clobber the first, is the second ignored, or should the output contain a list of them? Here is one approach. First, we build a flat list of key/value pairs, using a recursive function to handle the deeper hash levels:

    my %data = (
        foo  => {bar  => {baz  => 'hello'}},
        fizz => {buzz => {bing => 'world'}},
        fad  => {bad  => {baz  => 'clobber'}},
    );

    sub flatten {
        my $hash = shift;
        map {
            my $value = $$hash{$_};
            ref $value eq 'HASH'
                ? flatten($value)
                : ($_ => $value)
        } keys %$hash
    }

    print join( ", " => flatten \%data), "\n";
    # baz, clobber, bing, world, baz, hello

    my %flat = flatten \%data;

    print join( ", " => %flat ), "\n";
    # baz, hello, bing, world    # lost (baz => clobber)

A fix could be something like this, which builds a hash of array refs containing all of the values:

    sub merge {
        my %out;
        while (@_) {
            my ($key, $value) = splice @_, 0, 2;
            push @{ $out{$key} }, $value
        }
        %out
    }

    my %better_flat = merge flatten \%data;

In production code, it would be faster to pass references between the functions, but I have omitted that here for clarity.
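To see what the merged structure looks like, here is a self-contained sketch that combines the flatten and merge subs above with the same sample data. The explicit return statements and the sorted printing loop are additions for illustration; sorting is used because Perl's hash order is not predictable:

```perl
use strict;
use warnings;

my %data = (
    foo  => { bar  => { baz  => 'hello'   } },
    fizz => { buzz => { bing => 'world'   } },
    fad  => { bad  => { baz  => 'clobber' } },
);

# Returns a flat list of key/value pairs, duplicates included.
sub flatten {
    my $hash = shift;
    return map {
        my $value = $hash->{$_};
        ref $value eq 'HASH' ? flatten($value) : ($_ => $value)
    } keys %$hash;
}

# Collects every value seen for a key into an array ref.
sub merge {
    my %out;
    while (@_) {
        my ($key, $value) = splice @_, 0, 2;
        push @{ $out{$key} }, $value;
    }
    return %out;
}

my %better_flat = merge(flatten(\%data));

# Sort keys and values so the output is the same on every run.
for my $key (sort keys %better_flat) {
    my @values = sort @{ $better_flat{$key} };
    print "$key: @values\n";
}
# baz: clobber hello
# bing: world
```

Every key now maps to an array ref, even when only one value was seen, so callers can treat all entries uniformly.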

+2

Is it your intention to end up with a copy of the original hash, or just a reordered result?

Your code starts with one hash (the original hash, which is passed by reference) and makes two copies, %i and %hash.

The statement my %i = %{$hashref} is not necessary. It copies the entire hash into a new hash. In either case (whether you want a copy or not) you can use references to the original hash.
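As a rough sketch of that point, here is the question's three-level flatten rewritten to walk the original structure through references instead of copying each level. It assumes exactly three levels of nesting, as in the question, and the names $level2 and $level3 are illustrative:

```perl
use strict;
use warnings;

# Three fixed levels, as in the question, but each inner hash is
# reached through its reference rather than copied into a new hash.
sub flatten {
    my ($hashref) = @_;
    my %hash;
    for my $level2 (values %$hashref) {
        for my $level3 (values %$level2) {
            for my $key (keys %$level3) {
                $hash{$key} = $level3->{$key};
            }
        }
    }
    return %hash;
}

my %old_hash;
$old_hash{foo}{bar}{zonk} = 'hello';

my %h = flatten(\%old_hash);
die unless $h{zonk} eq 'hello';
```

Only the result hash %hash is new; the nested data is never duplicated.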

You also lose data if a hash nested inside the hash uses the same key as its parent hash. Is this intended?

+1
