Does iterating over a hash reference implicitly copy it in Perl?

Let's say I have a big hash and I want to iterate over its contents. The standard idiom would be something like this:

    while (($key, $value) = each(%{$hash_ref})) {
        # do something
    }

However, if I understand my Perl correctly, this actually does two things. First,

 %{$hash_ref} 

expands the reference in list context, returning something like

 (key1, value1, key2, value2, key3, value3 etc) 

which will be stored on the stack. Then the each function is called, which takes the first two values in memory (key1 and value1) and returns them to my while loop for processing.

If my understanding is correct, this means I actually copy my entire hash onto the stack only to iterate over the new copy. That can be expensive for a large hash, both because the data is traversed twice and because of potential cache misses if both copies cannot be held in memory at once. It seems rather inefficient. I am wondering whether this is true, whether I misunderstand the actual behavior, or whether the compiler optimizes the inefficiency away for me.

Follow-up questions, assuming I am correct about the standard behavior:

  • Is there any syntax that avoids copying the hash by iterating over the values in the original hash? If not for a hash, is there one for a simpler array?

  • Does this mean that in the above example I could get inconsistent values between the copy of my hash and my actual hash if I modify the contents of $hash_ref inside the loop, so that $value differs from $hash_ref->{$key}?

2 answers

No copy is created by each (although you do copy the returned values into $key and $value through the assignment). The hash itself is passed to each.
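A minimal sketch of what this means in practice, assuming a small made-up hash (the names and values are only for illustration): changing $value affects just the local copy, while writing through $hash_ref->{$key} changes the original hash that each is walking. Assigning to existing keys during the loop is safe; adding or deleting keys while iterating is a different matter and is not shown here.

```perl
use strict;
use warnings;

# Illustrative data, not from the question
my $hash_ref = { a => 1, b => 2, c => 3 };

while (my ($key, $value) = each %{$hash_ref}) {
    $value *= 10;              # changes only the local copy
    $hash_ref->{$key} += 1;    # changes the original hash in place
}

# Prints: a=2, b=3, c=4
print join(", ", map { "$_=$hash_ref->{$_}" } sort keys %{$hash_ref}), "\n";
```

The multiplication by 10 never shows up in the output, which is consistent with each operating on the original hash rather than on a flattened copy.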

each is a little special. It supports the following syntaxes:

    each HASH
    each ARRAY

As you can see, it does not accept an arbitrary expression. (That would be written each EXPR or each LIST.) The reason is to allow each(%foo) to pass the hash %foo itself to each, rather than evaluating it in list context. each can do this because it is an operator, and operators can have their own parsing rules. However, you can do something similar with the \% prototype.

    use Data::Dumper;

    sub f     { print(Dumper(@_)); }
    sub g(\%) { print(Dumper(@_)); }   # Similar to each

    my %h = (a=>1, b=>2);
    f(%h);         # Evaluates %h in list context.
    print("\n");
    g(%h);         # Passes a reference to %h.

Output:

    $VAR1 = 'a';       # 4 args, the keys and values of the hash
    $VAR2 = 1;
    $VAR3 = 'b';
    $VAR4 = 2;

    $VAR1 = {          # 1 arg, a reference to the hash
              'a' => 1,
              'b' => 2
            };

%{$h_ref} is the same as %h, so all of the above also applies to %{$h_ref}.
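As a rough check of that equivalence, the sketch below uses a hypothetical helper, hash_arg_ref, declared with the \% prototype. It simply returns the reference it receives, so the result can be compared with the original reference. The helper name and the data are assumptions made for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: the \% prototype makes Perl pass a reference to the hash
sub hash_arg_ref(\%) { return $_[0] }

my %h     = (a => 1, b => 2);
my $h_ref = \%h;

my $from_named = hash_arg_ref(%h);          # named hash
my $from_deref = hash_arg_ref(%{$h_ref});   # dereferenced hash reference

# Comparing references with == compares their addresses
print "same hash\n" if $from_named == $h_ref && $from_deref == $h_ref;
```

Both calls hand the sub a reference to the very same hash, so no flattening or copying takes place in either form.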


Note that the hash is not copied, even if it is flattened. The keys are "copied", but the values are returned directly.

    use Data::Dumper;

    my %h = (abc=>"def", ghi=>"jkl");
    print(Dumper(\%h));
    $_ = uc($_) for %h;
    print(Dumper(\%h));

Output:

    $VAR1 = {
              'abc' => 'def',
              'ghi' => 'jkl'
            };
    $VAR1 = {
              'abc' => 'DEF',
              'ghi' => 'JKL'
            };

Read more about this here.


No, the syntax you quote does not create a copy.

This expression:

 %{$hash_ref} 

is exactly equivalent to:

 %$hash_ref 

and assuming that the scalar variable $hash_ref does indeed contain a hash reference, adding % at the front simply dereferences it; that is, it resolves to the underlying hash (the thing $hash_ref points to).

If you look at the documentation for the each function, you will see that it expects a hash as its argument. Putting % in front is how you supply a hash when you have a hashref.

If you wrote your own subroutine and passed the hash to it like this:

 my_sub(%$hash_ref); 

then at some level you could say that the hash was "copied", since inside the subroutine the special @_ array will contain a list of all key/value pairs from the hash. However, even in this case, the elements of @_ that hold the values are actually aliases to the values in the hash (the keys are copies). You would only really get a copy if you did something like: my @args = @_ .
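The aliasing can be seen with a short sketch. The subs bump_second and bump_copy are hypothetical names invented for this illustration, and a one-key hash is used so that the position of the value in @_ is predictable.

```perl
use strict;
use warnings;

# Hypothetical sub: modifies its second argument directly through @_
sub bump_second { $_[1] += 100; return }

# Hypothetical sub: copies @_ first, so the caller's data is untouched
sub bump_copy { my @args = @_; $args[1] += 100; return @args }

my $hash_ref = { only => 1 };

bump_second(%$hash_ref);        # @_ is ('only', 1); $_[1] aliases the value
print "$hash_ref->{only}\n";    # prints 101

my @copied = bump_copy(%$hash_ref);
print "$hash_ref->{only}\n";    # still prints 101
print "$copied[1]\n";           # prints 201, only the copy changed
```

Writing through @_ reaches the original hash value, while the explicit my @args = @_ is the point at which a real copy is made.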

Perl's builtin each function is declared with the prototype '+', which effectively passes the hash (or array) argument as a reference to the underlying data structure.

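As an aside, starting with Perl 5.14, each could also accept a hash reference directly. (This was an experimental feature and was removed again in Perl 5.24, so it does not work on current versions.) Therefore, instead of:

 ($key, $value) = each(%{$hash_ref}) 

you could simply say: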

 ($key, $value) = each($hash_ref) 