I am trying to select a node using an XPath query, and I do not understand why XML::LibXML does not find the node when it has an xmlns attribute. Here's a script to demonstrate the problem:
#!/usr/bin/perl
use XML::LibXML; # 1.70 on libxml2 from libxml2-dev 2.6.16-7sarge1 (don't ask)
use XML::XPath; # 1.13
use strict;
use warnings;
use v5.8.4; # don't ask

my ($xpath, $libxml, $use_namespace) = @ARGV;

my $xml = sprintf(<<'END_XML', ($use_namespace ? 'xmlns="http://www.w3.org/2000/xmlns/"' : q{}));
<?xml version="1.0" encoding="iso-8859-1"?>
<RootElement>
    <MyContainer %s>
        <MyField>
            <Name>ID</Name>
            <Value>12345</Value>
        </MyField>
        <MyField>
            <Name>Name</Name>
            <Value>Ben</Value>
        </MyField>
    </MyContainer>
</RootElement>
END_XML

my $xml_parser = $libxml
    ? XML::LibXML->load_xml(string => $xml, keep_blanks => 1)
    : XML::XPath->new(xml => $xml);

my $nodecount = 0;
foreach my $node ($xml_parser->findnodes($xpath)) {
    $nodecount ++;
    print "--NODE $nodecount--\n"; #would use say on newer perl
    print $node->toString($libxml && 1), "\n";
}

unless ($nodecount) {
    print "NO NODES FOUND\n";
}
This script lets you choose between the XML::LibXML parser and the XML::XPath parser. It also lets you either define the xmlns attribute on the MyContainer element or omit it, depending on the arguments passed.
I use the XPath expression "RootElement/MyContainer". When I run the query using the XML::LibXML parser without a namespace, it finds the node without problems:
benb@enkidu:~$ ROC/ECG/libxml_xpath.pl 'RootElement/MyContainer' libxml
--NODE 1--
<MyContainer>
        <MyField>
            <Name>ID</Name>
            <Value>12345</Value>
        </MyField>
        <MyField>
            <Name>Name</Name>
            <Value>Ben</Value>
        </MyField>
    </MyContainer>
However, when I run it with the namespace in place, it does not find the nodes:
benb@enkidu:~$ ROC/ECG/libxml_xpath.pl 'RootElement/MyContainer' libxml use_namespace
NO NODES FOUND
Contrast this with the output when using the XML::XPath parser:
benb@enkidu:~$ ROC/ECG/libxml_xpath.pl 'RootElement/MyContainer' 0 # no namespace
--NODE 1--
<MyContainer>
        <MyField>
            <Name>ID</Name>
            <Value>12345</Value>
        </MyField>
        <MyField>
            <Name>Name</Name>
            <Value>Ben</Value>
        </MyField>
    </MyContainer>
benb@enkidu:~$ ROC/ECG/libxml_xpath.pl 'RootElement/MyContainer' 0 1 # with namespace
--NODE 1--
<MyContainer xmlns="http://www.w3.org/2000/xmlns/">
        <MyField>
            <Name>ID</Name>
            <Value>12345</Value>
        </MyField>
        <MyField>
            <Name>Name</Name>
            <Value>Ben</Value>
        </MyField>
    </MyContainer>
Which of these parser implementations does this "correctly"? Why does XML::LibXML treat the element differently when I use the namespace? What can I do to get the node when the namespace is in place?
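For comparison, here is a minimal sketch of a namespace-aware query using XML::LibXML::XPathContext and its registerNs method. It is not taken from the script above: the prefix `ns`, the helper `count_nodes`, and the URI `http://example.com/my-namespace` are all placeholders chosen for illustration, and the document is a trimmed-down version of the one in the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Placeholder namespace URI, assumed for illustration only
# (it is not the URI used in the original script).
my $ns_uri = 'http://example.com/my-namespace';

my $xml = <<"END_XML";
<?xml version="1.0" encoding="iso-8859-1"?>
<RootElement>
  <MyContainer xmlns="$ns_uri">
    <MyField><Name>ID</Name><Value>12345</Value></MyField>
  </MyContainer>
</RootElement>
END_XML

my $doc = XML::LibXML->load_xml(string => $xml);

# Hypothetical helper: counts the nodes matched by an XPath expression,
# evaluated in a context where the prefix "ns" is bound to $ns_uri.
sub count_nodes {
    my ($doc, $xpath) = @_;
    my $xpc = XML::LibXML::XPathContext->new($doc);
    $xpc->registerNs(ns => $ns_uri);
    my @nodes = $xpc->findnodes($xpath);
    return scalar @nodes;
}

# An unprefixed name test in XPath 1.0 only matches elements in no namespace.
print "unprefixed: ", count_nodes($doc, 'RootElement/MyContainer'), "\n";

# RootElement is in no namespace; MyContainer and its children are in $ns_uri.
print "prefixed:   ", count_nodes($doc, 'RootElement/ns:MyContainer'), "\n";
print "fields:     ", count_nodes($doc, 'RootElement/ns:MyContainer/ns:MyField'), "\n";
```

In this sketch the unprefixed expression matches nothing, while the two prefixed expressions match one node each. The prefix used in the XPath expression does not have to appear in the document; only the namespace URI it is bound to matters. Child elements such as MyField inherit the default namespace declared on MyContainer, so they need the prefix as well.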
xml perl xpath libxml2
benrifkah