How to use XPath for nodes with a prefix but without a namespace?

I have an XML file that I need to parse. I cannot control the file format and cannot change it.

The file uses a prefix (call it a), but it does not define a namespace for this prefix anywhere. I cannot seem to use XPath to query for nodes with the a prefix.

Here is the content of the XML document:

<?xml version="1.0" encoding="UTF-8"?>

<a:root>
  <a:thing>stuff0</a:thing>
  <a:thing>stuff1</a:thing>
  <a:thing>stuff2</a:thing>
  <a:thing>stuff3</a:thing>
  <a:thing>stuff4</a:thing>
  <a:thing>stuff5</a:thing>
  <a:thing>stuff6</a:thing>
  <a:thing>stuff7</a:thing>
  <a:thing>stuff8</a:thing>
  <a:thing>stuff9</a:thing>
</a:root>

I use Nokogiri to query the document:

require 'nokogiri'
doc = Nokogiri::XML(File.open('text.xml'))
things = doc.xpath('//a:thing')

This fails with the following error:

Nokogiri::XML::XPath::SyntaxError: Undefined namespace prefix: //a:thing

From my research, I learned that I can specify a namespace for a prefix when calling the xpath method:

things = doc.xpath('//a:thing', a: 'nobody knows')

This returns an empty array.

What would be the best way to get the nodes I need?

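The problem is that the namespace is not properly defined in the XML document. As a result, Nokogiri sees the node name as "a:root" instead of treating "a" as a namespace prefix and "root" as the node name: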

xml = %Q{
    <?xml version="1.0" encoding="UTF-8"?>
    <a:root>
      <a:thing>stuff0</a:thing>
      <a:thing>stuff1</a:thing>
    </a:root>
}
doc = Nokogiri::XML(xml)
puts doc.at_xpath('*').node_name
#=> "a:root"
puts doc.at_xpath('*').namespace
#=> ""

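Solution 1 - Search by the literal node name

One solution is to search for nodes whose name is "a:thing". You cannot use //a:thing, because XPath will treat the "a" as a namespace prefix. You can get around this by using //*[name()="a:thing"]: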

xml = %Q{
    <?xml version="1.0" encoding="UTF-8"?>
    <a:root>
      <a:thing>stuff0</a:thing>
      <a:thing>stuff1</a:thing>
    </a:root>
}
doc = Nokogiri::XML(xml)
things = doc.xpath('//*[name()="a:thing"]')
puts things
#=> <a:thing>stuff0</a:thing>
#=> <a:thing>stuff1</a:thing>

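Solution 2 - Modify the XML document to define the namespace

The other solution is to modify the XML you receive so that it properly defines the namespace. The document will then behave as expected with namespaces: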

xml = %Q{
    <?xml version="1.0" encoding="UTF-8"?>
    <a:root>
      <a:thing>stuff0</a:thing>
      <a:thing>stuff1</a:thing>
    </a:root>
}
xml.gsub!('<a:root>', '<a:root xmlns:a="foo">')
doc = Nokogiri::XML(xml)
things = doc.xpath('//a:thing')
puts things
#=> <a:thing>stuff0</a:thing>
#=> <a:thing>stuff1</a:thing>