Is there an easy way to extract deeply nested values using XML::Simple?

I am using Perl's XML::Simple to parse deeply nested XML, and I would like to extract a small list of elements from about four levels down:

 A
   B
     C
       D1
       D2
       D3

Ideally, I want to do this in the input step, if possible. Like this:

 my @list = XMLin($xml, { SomeAttribute => 'ButWhat?' }); 

and end up with the equivalent of:

 @list = ('D1', 'D2', 'D3') 

Is this possible? Or is it just not that simple?

+4
5 answers

Thanks for all the suggestions.

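In the end, I sidestepped the problem of checking each level of the data structure by wrapping the access in an eval block.

 use feature 'say';

 my $xml_tree;   # assumed to hold the result of XMLin($xml)
 my @list;
 eval {
     # just go for it
     # (no "my" here: redeclaring @list inside the block would hide the outer one)
     @list = @{ $xml_tree->{A}->{B}->{C} };
 };
 if ( $@ ) {
     say "oops - xml is not in expected format - and that happens sometimes";
 }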
0

Assuming your data in memory looks like this:

 my $parsed = {
     A => {
         B => {
             C => [ qw/here is your list/ ],
         },
     },
 };

You can then get your list with my @list = @{ $parsed->{A}{B}{C} }.

Is this what you are trying to do?

Edit: based on some of the comments, maybe you want Data::Visitor::Callback. You can then extract all the arrays, for example:

 use Data::Visitor::Callback;

 my @arrays;
 my $v = Data::Visitor::Callback->new(
     array => sub { push @arrays, $_ },   # $_ is the array ref being visited
 );
 $v->visit( $parsed_xml );

After that, @arrays will be a list of references to arbitrarily deeply nested arrays.
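If installing a module is not an option, the same idea can be sketched in core Perl. This is only a hand-rolled illustration of "collect every array ref", not how Data::Visitor::Callback works internally; collect_arrays is a made-up name and the sample data is the structure assumed earlier in this answer.

```perl
use strict;
use warnings;

# Hypothetical helper: recursively collect every array reference found
# anywhere inside a nested structure of hash and array references.
sub collect_arrays {
    my ($node) = @_;
    my @found;
    if ( ref $node eq 'ARRAY' ) {
        push @found, $node;
        push @found, collect_arrays($_) for @$node;
    }
    elsif ( ref $node eq 'HASH' ) {
        # sort the keys so the result order is stable between runs
        push @found, collect_arrays( $node->{$_} ) for sort keys %$node;
    }
    return @found;
}

# Assumed sample data, same shape as the example above
my $parsed = { A => { B => { C => [ qw/here is your list/ ] } } };

my @arrays = collect_arrays($parsed);
print "Found ", scalar(@arrays), " array(s): @{ $arrays[0] }\n";
```

For anything beyond plain hashes and arrays (objects, cyclic references), the module is probably the safer choice.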

Finally, if you just have an attribute name and want to match XML nodes, you really want XPath:

 use XML::LibXML;

 my $parser = XML::LibXML->new;
 my $doc    = $parser->parse_string( $xml_string );

 # yeah, I am naming the variable data. so there.
 my @data = map { $_->textContent } $doc->findnodes('//p[@id="foo"]');

Anyway, TMTOWTDI. If you are working with XML and want to do anything complex, XML::Simple is rarely the right answer. I use XML::LibXML for everything, since it is almost always simpler.

One more thing you might want is Data::DPath. It lets you run XPath-like queries against an in-memory Perl data structure.

+3

Building on Jon's answer, here is the basic code I use when I need to do this sort of thing. If I need something fancier, I usually reach for a module, if I am allowed to.

The trick in get_values is to start with the top-level reference, get the next level down, and store it in the same variable. It keeps walking until it reaches the place I want. Most of the code is just assertions that make sure everything is as expected. In most cases I find that it is the data that is messed up rather than the traversal (but then I do a lot of data clean-up work). Adjust the error checking for your situation.

 use Carp qw(croak);

 my $parsed = {
   A => {
     B => {
       C => [ qw/here is your list/ ],
       D => {
         E => [ qw/this is a deeper list/ ],
         },
     },
   },
 };

 # Note: C holds an array ref, so this path croaks at D.
 # Use qw( A B C ) or qw( A B D E ) to reach one of the lists.
 my @keys = qw( A B C D );

 my @values = eval { get_values( $parsed, @keys ) } or die;

 $" = "][";   # list separator used when interpolating @values
 print "Values are [@values]\n";

 sub get_values
     {
     my( $hash, @keys ) = @_;

     my $v = $hash;  # starting reference

     foreach my $key ( @keys )
         {
         croak "Value is not a hash ref [at $key!]\n" unless ref $v eq ref {};
         croak "Key $key does not exist!\n" unless exists $v->{$key};
         $v = $v->{$key};  # replace with ref down one level
         }

     croak "Value is not an array ref!" unless ref $v eq ref [];
     @$v;
     }
+1

The fact that you are using XML::Simple is irrelevant; you are trying to search a structure of hash refs and array refs. Do you know exactly what you are looking for? Will it always be in the same place? If so, then something like what jrockway wrote will do the trick easily. If not, then you will need to walk each piece of the structure until you find what you are looking for.
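For the "walk each piece until you find it" case, a rough core-Perl sketch could look like the following. It assumes you know the name of the hash key you want but not where it sits; find_key and the sample tree are made up for illustration, and it returns only the first match in sorted-key order.

```perl
use strict;
use warnings;

# Hypothetical helper: depth-first search for the first hash key named
# $wanted. Returns its value, or an empty list if the key is absent.
sub find_key {
    my ( $node, $wanted ) = @_;
    if ( ref $node eq 'HASH' ) {
        return $node->{$wanted} if exists $node->{$wanted};
        for my $key ( sort keys %$node ) {
            my @hit = find_key( $node->{$key}, $wanted );
            return @hit if @hit;
        }
    }
    elsif ( ref $node eq 'ARRAY' ) {
        for my $item (@$node) {
            my @hit = find_key( $item, $wanted );
            return @hit if @hit;
        }
    }
    return;
}

# Assumed sample data shaped like the question's A/B/C nesting
my $tree = { A => { B => { C => [ qw/D1 D2 D3/ ] } } };

my ($c)  = find_key( $tree, 'C' );
my @list = ref $c eq 'ARRAY' ? @$c : ();
print "@list\n";
```

If the same key name can appear in several places, you would need to collect all the hits rather than stop at the first one.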

One thing I often do is dump the structure that XML::Simple returns with Data::Dumper to see what it looks like. If it always looks the same, you know where to find what you need; if not, you can work out dynamically how to walk it by testing whether something is a ref and what kind of ref it is. The real question is: what are you looking for?
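As a small sketch of that dump-then-test approach: the two structures below are hand-built stand-ins for what XMLin might return (an assumption, not real parser output), and as_list is a made-up helper. XML::Simple can give you an array ref for a repeated element and a plain value for a single one, which is exactly the case the ref test handles; its ForceArray option is another way to get a consistent shape.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;    # stable key order in the dump

# Hand-built stand-ins for parser output (assumed shapes):
# one document with a repeated child, one with a single child.
my $many = { A => { B => { C => [ qw/D1 D2 D3/ ] } } };
my $one  = { A => { B => { C => 'D1' } } };

print Dumper($many);    # look at the shape before writing accessors

# Hypothetical helper: return a list whether the node is an array ref,
# a single plain value, or missing altogether.
sub as_list {
    my ($node) = @_;
    return @$node if ref $node eq 'ARRAY';
    return defined $node ? ($node) : ();
}

my @from_many = as_list( $many->{A}{B}{C} );
my @from_one  = as_list( $one->{A}{B}{C} );
print scalar(@from_many), " and ", scalar(@from_one), " value(s)\n";
```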

0

Data::Diver provides a nice interface for digging into deep structures.

0