How can I get the value of the href attribute from the <?xml-stylesheet?> node?
We receive an XML document from a provider, and it must go through an XSL transformation using their stylesheet so that we can convert the resulting HTML to PDF. The actual stylesheet is specified in the href attribute of the <?xml-stylesheet?> declaration in the XML document. Is there a way I can get this URL using C#? I do not trust the provider not to change the URL and, obviously, I do not want to hardcode it.
The beginning of the XML file, with the full <?xml-stylesheet?> element, looks like this:
    <?xml version="1.0" encoding="utf-8"?>
    <?xml-stylesheet type="text/xsl" href="http://www.fakeurl.com/StyleSheet.xsl"?>

LINQ to XML code:
    XDocument xDoc = ...;

    var cssUrlQuery = from node in xDoc.Nodes()
                      where node.NodeType == XmlNodeType.ProcessingInstruction
                      select Regex.Match(((XProcessingInstruction)node).Data, "href=\"(?<url>.*?)\"").Groups["url"].Value;

Or LINQ to Objects over an XmlDocument:
    var cssUrls = (from XmlNode childNode in doc.ChildNodes
                   where childNode.NodeType == XmlNodeType.ProcessingInstruction && childNode.Name == "xml-stylesheet"
                   select (XmlProcessingInstruction)childNode into procNode
                   select Regex.Match(procNode.Data, "href=\"(?<url>.*?)\"").Groups["url"].Value).ToList();

xDoc.XPathSelectElement() will not work, because it returns an XElement, and an XElement cannot be cast to XProcessingInstruction.
You can also use XPath. Given that the XmlDocument is loaded with your source:
    XmlProcessingInstruction instruction = doc.SelectSingleNode("//processing-instruction(\"xml-stylesheet\")") as XmlProcessingInstruction;
    if (instruction != null)
    {
        Console.WriteLine(instruction.InnerText);
    }

Then just parse InnerText with Regex.
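The "parse with Regex" step could look roughly like the sketch below. The class name StylesheetHref and the method FromXml are made up for illustration, and the pattern assumes the pseudo-attribute is written as href="..." or href='...'; it is not a full parser for processing-instruction data.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Xml;

public static class StylesheetHref
{
    // Hypothetical helper: returns the href pseudo-attribute of the first
    // top-level xml-stylesheet processing instruction, or null if none is found.
    public static string FromXml(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // Processing instructions in the prolog are children of the document node.
        var pi = doc.SelectSingleNode("/processing-instruction('xml-stylesheet')")
                 as XmlProcessingInstruction;
        if (pi == null)
        {
            return null;
        }

        // Accept either quote style: href="..." or href='...'.
        Match m = Regex.Match(
            pi.Data,
            @"\bhref\s*=\s*(?:""(?<url>[^""]*)""|'(?<url>[^']*)')");
        return m.Success ? m.Groups["url"].Value : null;
    }
}
```

Calling StylesheetHref.FromXml with the contents of the provider's file should give back the stylesheet URL, or null when the document has no xml-stylesheet instruction or the instruction has no href.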
A processing instruction can have any content; formally, it does not have attributes. But if you know that it contains pseudo-attributes, as is the case for the xml-stylesheet processing instruction, you can of course use the value of the processing instruction to build the markup of a single element and parse that with the XML parser:
    XmlDocument doc = new XmlDocument();
    doc.Load(@"file.xml");
    XmlNode pi = doc.SelectSingleNode("processing-instruction('xml-stylesheet')");
    if (pi != null)
    {
        // Wrap the pseudo-attributes in a dummy element so the XML parser can read them.
        XmlElement piEl = (XmlElement)doc.ReadNode(XmlReader.Create(new StringReader("<pi " + pi.Value + "/>")));
        string href = piEl.GetAttribute("href");
        Console.WriteLine(href);
    }
    else
    {
        Console.WriteLine("No pi found.");
    }

To find the value using a proper XML parser, you can write something like this:
    using (var xr = XmlReader.Create(input))
    {
        while (xr.Read())
        {
            if (xr.NodeType == XmlNodeType.ProcessingInstruction && xr.Name == "xml-stylesheet")
            {
                string s = xr.Value;
                int i = s.IndexOf("href=\"") + 6; // position just after href="
                s = s.Substring(i, s.IndexOf('\"', i) - i);
                Console.WriteLine(s);
                break;
            }
        }
    }

With XDocument (here for the mso-infoPathSolution processing instruction):

    private string _GetTemplateUrl(XDocument formXmlData)
    {
        var infopathInstruction = (XProcessingInstruction)formXmlData.Nodes().First(node =>
            node.NodeType == XmlNodeType.ProcessingInstruction &&
            ((XProcessingInstruction)node).Target == "mso-infoPathSolution");
        // Wrap the pseudo-attributes in a dummy element and read href as a real attribute.
        var instructionValueAsDoc = XDocument.Parse("<n " + infopathInstruction.Data + " />");
        return instructionValueAsDoc.Root.Attribute("href").Value;
    }

With XmlDocument and XPath:

    XmlProcessingInstruction stylesheet = doc.SelectSingleNode("processing-instruction('xml-stylesheet')") as XmlProcessingInstruction;