How does XPath handle XML namespaces?


If I use

/IntuitResponse/QueryResponse/Bill/Id 

to parse the XML document below, I get 0 nodes back.

    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <IntuitResponse xmlns="http://schema.intuit.com/finance/v3" time="2016-10-14T10:48:39.109-07:00">
      <QueryResponse startPosition="1" maxResults="79" totalCount="79">
        <Bill domain="QBO" sparse="false">
          <Id>=1</Id>
        </Bill>
      </QueryResponse>
    </IntuitResponse>

However, I am not specifying the namespace in the XPath (i.e. http://schema.intuit.com/finance/v3 is not a prefix of each step of the path). How can XPath know which Id I want if I don't tell it explicitly? I suppose that in this case (since there is only one namespace) XPath could get away with ignoring the xmlns entirely. But if there are multiple namespaces, things could get ugly.

xml xpath xml-namespaces
Nov 25 '16 at 0:43
1 answer

Defining namespaces in XPath (recommended)

XPath itself has no way to associate a namespace prefix with a namespace. Such features are provided by the hosting library.

It is recommended that you use those facilities to define namespace prefixes, which you can then use to qualify XML element and attribute names as necessary.
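To see why the unprefixed path in the question returns 0 nodes, here is a minimal sketch using Python's standard-library ElementTree, which supports only a limited subset of XPath. The XML is a trimmed copy of the document in the question, and the variable names are my own. An unprefixed step only matches elements in no namespace, while the elements here all live in the default namespace declared by xmlns:

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the response from the question.
XML = """<IntuitResponse xmlns="http://schema.intuit.com/finance/v3">
  <QueryResponse startPosition="1" maxResults="79" totalCount="79">
    <Bill domain="QBO" sparse="false">
      <Id>=1</Id>
    </Bill>
  </QueryResponse>
</IntuitResponse>"""

NS_URI = "http://schema.intuit.com/finance/v3"
root = ET.fromstring(XML)

# The default namespace is part of every element's expanded name.
print(root.tag)

# Unprefixed steps look for elements in *no* namespace, so nothing matches.
unqualified = root.findall("QueryResponse/Bill/Id")

# Qualifying every step with the namespace URI matches the real elements.
qualified = root.findall(
    f"{{{NS_URI}}}QueryResponse/{{{NS_URI}}}Bill/{{{NS_URI}}}Id"
)

print(len(unqualified), len(qualified))
```

So XPath is not ignoring xmlns; the namespace is part of each element's name, and the path has to say so.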




Here are some of the different mechanisms that XPath hosts provide for binding namespace prefixes to namespace URIs:

XSLT:

    <xsl:stylesheet version="1.0"
                    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                    xmlns:i="http://schema.intuit.com/finance/v3">
       ...

Perl ( LibXML ):

    my $xc = XML::LibXML::XPathContext->new($doc);
    $xc->registerNs('i', 'http://schema.intuit.com/finance/v3');
    my @nodes = $xc->findnodes('/i:IntuitResponse/i:QueryResponse');

Python ( lxml ):

    from io import StringIO
    from lxml import etree

    f = StringIO('<IntuitResponse>...</IntuitResponse>')
    doc = etree.parse(f)
    r = doc.xpath('/i:IntuitResponse/i:QueryResponse',
                  namespaces={'i': 'http://schema.intuit.com/finance/v3'})

Python ( ElementTree ):

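    namespaces = {'i': 'http://schema.intuit.com/finance/v3'}
    # ElementTree rejects absolute paths on an element, so search
    # relative to root (the IntuitResponse element) instead.
    root.findall('i:QueryResponse', namespaces)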

Java (SAX):

    NamespaceSupport support = new NamespaceSupport();
    support.pushContext();
    support.declarePrefix("i", "http://schema.intuit.com/finance/v3");

Java (XPath):

    xpath.setNamespaceContext(new NamespaceContext() {
        public String getNamespaceURI(String prefix) {
            switch (prefix) {
                case "i": return "http://schema.intuit.com/finance/v3";
                // ...
            }
        }
    });

XMLStarlet:

 -N i="http://schema.intuit.com/finance/v3" 

JavaScript:

See Implementing a User Defined Namespace Resolver:

    function nsResolver(prefix) {
      var ns = {
        'i' : 'http://schema.intuit.com/finance/v3'
      };
      return ns[prefix] || null;
    }
    document.evaluate( '/i:IntuitResponse/i:QueryResponse',
                       document, nsResolver, XPathResult.ANY_TYPE,
                       null );

PHP:

Adapted from @Tomalak's answer using DOMDocument:

    $result = new DOMDocument();
    $result->loadXML($xml);

    $xpath = new DOMXpath($result);
    $xpath->registerNamespace("i", "http://schema.intuit.com/finance/v3");

    $result = $xpath->query("/i:IntuitResponse/i:QueryResponse");

See also @IMSoP's canonical Q/A on PHP SimpleXML namespaces.

C#:

    XmlNamespaceManager nsmgr = new XmlNamespaceManager(doc.NameTable);
    nsmgr.AddNamespace("i", "http://schema.intuit.com/finance/v3");
    XmlNodeList nodes = el.SelectNodes(@"/i:IntuitResponse/i:QueryResponse", nsmgr);

VBA:

    xmlNS = "xmlns:i='http://schema.intuit.com/finance/v3'"
    doc.setProperty "SelectionNamespaces", xmlNS
    Set queryResponseElement = doc.SelectSingleNode("/i:IntuitResponse/i:QueryResponse")

VB.NET:

    xmlDoc = New XmlDocument()
    xmlDoc.Load("file.xml")
    ' XmlNameTable is abstract, so reuse the document's own name table.
    nsmgr = New XmlNamespaceManager(xmlDoc.NameTable)
    nsmgr.AddNamespace("i", "http://schema.intuit.com/finance/v3")
    nodes = xmlDoc.DocumentElement.SelectNodes("/i:IntuitResponse/i:QueryResponse", nsmgr)

Ruby (Nokogiri):

 puts doc.xpath('/i:IntuitResponse/i:QueryResponse', 'i' => "http://schema.intuit.com/finance/v3") 

Please note that Nokogiri supports removing namespaces,

 doc.remove_namespaces! 

but see the warnings below against defeating XML namespaces.




After you declare a namespace prefix, your XPath can be written to use it:

 /i:IntuitResponse/i:QueryResponse 
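As a quick check that the prefix itself is arbitrary and only the URI it is bound to matters, here is a small sketch with Python's standard-library ElementTree (a limited XPath subset, so the path is written relative to the root element rather than starting with /). The XML is a trimmed copy of the one in the question; the prefixes i and q are my own choices:

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the response from the question.
XML = """<IntuitResponse xmlns="http://schema.intuit.com/finance/v3">
  <QueryResponse startPosition="1" maxResults="79" totalCount="79">
    <Bill domain="QBO" sparse="false">
      <Id>=1</Id>
    </Bill>
  </QueryResponse>
</IntuitResponse>"""

NS_URI = "http://schema.intuit.com/finance/v3"
root = ET.fromstring(XML)  # root is the IntuitResponse element

# The document declares no prefix at all; we bind one ourselves.
by_i = root.findall("i:QueryResponse/i:Bill/i:Id", {"i": NS_URI})

# A different prefix bound to the same URI selects the same element.
by_q = root.findall("q:QueryResponse/q:Bill/q:Id", {"q": NS_URI})

ids_i = [e.text for e in by_i]
ids_q = [e.text for e in by_q]
print(ids_i == ids_q, len(ids_i))
```

The prefix used in the path does not have to match anything in the document, which here uses a default namespace with no prefix.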



Defeating namespaces in XPath (not recommended)

An alternative is to write predicates that check local-name() :

 /*[local-name()='IntuitResponse']/*[local-name()='QueryResponse']/@startPosition 

Or in XPath 2.0:

 /*:IntuitResponse/*:QueryResponse/@startPosition 

Bypassing namespaces this way works, but is not recommended because it

  • Under-specifies the full element/attribute name.
  • Fails to differentiate between element/attribute names in different namespaces (the very purpose of namespaces). Note that this concern could be addressed by adding an additional predicate to check the namespace URI explicitly [1]:

        /*[    namespace-uri()='http://schema.intuit.com/finance/v3'
           and local-name()='IntuitResponse']
        /*[    namespace-uri()='http://schema.intuit.com/finance/v3'
           and local-name()='QueryResponse']
        /@startPosition

    [1] Thanks to Daniel Haley for the namespace-uri() note.

  • Is excessively verbose.
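The second point can be illustrated with a small sketch in Python's standard-library ElementTree. It has no local-name() function, so I am assuming Python 3.8+ and using its {*} wildcard as a stand-in for a namespace-agnostic match. The second namespace (http://example.com/other) and the x:Id element are invented for illustration and are not part of the question's data:

```python
import xml.etree.ElementTree as ET

# Hypothetical document: x:Id and its namespace URI are made up to show
# two elements sharing a local name across different namespaces.
XML = """<IntuitResponse xmlns="http://schema.intuit.com/finance/v3"
                xmlns:x="http://example.com/other">
  <QueryResponse>
    <Bill>
      <Id>1</Id>
      <x:Id>A-1</x:Id>
    </Bill>
  </QueryResponse>
</IntuitResponse>"""

root = ET.fromstring(XML)
ns = {"i": "http://schema.intuit.com/finance/v3"}

# Namespace-agnostic match (Python 3.8+): picks up Id in ANY namespace.
any_ns = [e.text for e in root.findall(".//{*}Id")]

# Namespace-qualified match: only the Id in the Intuit namespace.
intuit_only = [e.text for e in root.findall(".//i:Id", ns)]

print(any_ns)
print(intuit_only)
```

The wildcard version silently returns both elements, which is exactly the ambiguity that namespaces exist to prevent.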

Nov 25 '16 at 0:58


