Can Nokogiri search for "?xml-stylesheet" tags?

Question

I need to parse an XML stylesheet:

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="/templates/xslt/inspections/disclaimer_en.xsl"?>

Using Nokogiri, I tried:

 doc.search("?xml-stylesheet").first['href']

but I get the error:

 `on_error': unexpected '?' after '' (Nokogiri::CSS::SyntaxError)

ruby xml nokogiri

John Aug 22 '10 at 16:21

2 answers

Daniel O'Hara · Answer 1 · 2010-08-22T18:05:15+0000

Nokogiri's CSS-style search cannot match tags that are XML processing instructions. You can reach them through the document's child nodes instead:

 doc.children[0]

Phrogz · Answer 2 · 2012-08-31T22:50:45+0000

This is not an XML element; it is an XML "processing instruction". That is why your query could not find it. To find it, use:

    # Find the first xml-stylesheet PI
    xss = doc.at_xpath('//processing-instruction("xml-stylesheet")')

    # Find every xml-stylesheet PI
    xsss = doc.xpath('//processing-instruction("xml-stylesheet")')

In action:

    require 'nokogiri'
    xml = <<ENDXML
    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="/templates/disclaimer_en.xsl"?>
    <root>Hi Mom!</root>
    ENDXML

    doc = Nokogiri.XML(xml)
    xss = doc.at_xpath('//processing-instruction("xml-stylesheet")')
    puts xss.name    #=> xml-stylesheet
    puts xss.content #=> type="text/xsl" href="/templates/disclaimer_en.xsl"

Since a processing instruction is not an element, it has no attributes; you cannot, for example, ask for xss['type'] or xss['href']. If you want those values, you need to parse the content as if it were an element. One way to do this:

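    class Nokogiri::XML::ProcessingInstruction
      # Re-parse the PI's name and content as an empty element.
      # Node#parse returns a NodeSet, so take its first node.
      def to_element
        document.parse("<#{name} #{content}/>").first
      end
    end

    p xss.to_element['href'] #=> "/templates/disclaimer_en.xsl"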

Please note that there is a bug in Nokogiri or libxml2 that causes the XML declaration to appear in the document as a processing instruction if at least one character (even a single space) precedes <?xml. That is why the example above searches specifically for processing instructions named xml-stylesheet.

Edit: the XPath expression processing-instruction()[name()="foo"] is equivalent to the expression processing-instruction("foo"). As described in the XPath 1.0 spec:

The processing-instruction() test may have an argument that is Literal; in this case, it is true for any processing instruction that has a name equal to the value of the Literal.

I have edited the answer above to use the shorter form.
