Is there a way to find the innermost nodes recursively in XML using C# or VB?

I have an XML file, say

    <items>
      <item1>
        <piece>1300</piece>
        <itemc>665583</itemc>
      </item1>
      <item2>
        <piece>100</piece>
        <itemc>665584</itemc>
      </item2>
    </items>

I am trying to write a C# application that gets the full XPath of each innermost node, for example:

    items/item1/piece
    items/item1/itemc
    items/item2/piece
    items/item2/itemc

Is there a way to do this using C# or VB? Thanks in advance for any solution.

+2
7 answers

There you go:

    using System.Xml;

    class Program
    {
        static void Main()
        {
            XmlDocument doc = new XmlDocument();
            doc.Load(@"C:\Test.xml");

            foreach (XmlNode node in doc.DocumentElement.ChildNodes)
            {
                ProcessNode(node, doc.DocumentElement.Name);
            }
        }

        // Must be static so that it can be called from the static Main.
        static void ProcessNode(XmlNode node, string parentPath)
        {
            // A leaf is a node with no children, or with a single text child.
            if (!node.HasChildNodes
                || (node.ChildNodes.Count == 1 && node.FirstChild is XmlText))
            {
                System.Diagnostics.Debug.WriteLine(parentPath + "/" + node.Name);
            }
            else
            {
                foreach (XmlNode child in node.ChildNodes)
                {
                    ProcessNode(child, parentPath + "/" + node.Name);
                }
            }
        }
    }

The code above produces the desired output for the sample file. Add checks as needed, for example for comments or mixed content. The key point is that the text node (the text inside an element) is left out of the output.

+4
 //*[not(*)] 

is the XPath for finding all elements that have no child elements, so you can do something like

 doc.SelectNodes("//*[not(*)]") 

but I'm not quite sure about the .NET API, so check it.

Reference:

    //      --> descendant (not only children)
    *       --> any name
    []      --> predicate to evaluate
    not(*)  --> not having children
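
For what it's worth, `XmlNode.SelectNodes` does accept this expression. Below is a minimal sketch of how it could be combined with a walk up `ParentNode` to get the path format asked for in the question. The class and method names (`LeafPaths`, `FromXml`) are made up for illustration, and the sketch ignores namespaces and repeated sibling names.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

static class LeafPaths
{
    // Hypothetical helper: returns "items/item1/piece"-style paths for every
    // element that has no child elements.
    public static List<string> FromXml(string xmlText)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xmlText);

        var paths = new List<string>();
        foreach (XmlNode leaf in doc.SelectNodes("//*[not(*)]"))
        {
            // Walk up through ParentNode, collecting element names root-first.
            // The loop stops at the XmlDocument node, which is not an XmlElement.
            var names = new List<string>();
            for (XmlNode n = leaf; n is XmlElement; n = n.ParentNode)
            {
                names.Insert(0, n.Name);
            }
            paths.Add(string.Join("/", names));
        }
        return paths;
    }
}
```

Calling `LeafPaths.FromXml(...)` with the XML from the question should give the four paths listed there, in document order.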
+15

Just to expand on helios's answer a bit: you can refine the XPath with [text()] to select only those nodes that have a text() node:

    // XDocument
    foreach (XElement textNode in xdoc.XPathSelectElements("//*[not(*)][text()]"))
    {
        Console.WriteLine(textNode.Value);
    }

    // XmlDocument
    foreach (XmlText textNode in doc.SelectNodes("//*[not(*)]/text()"))
    {
        Console.WriteLine(textNode.Value);
    }
+2

Here is an XSLT solution that outputs an XPath expression for each of the innermost elements.

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

      <xsl:template match="/">
        <xsl:apply-templates />
      </xsl:template>

      <!--Match on all elements that do not contain child elements -->
      <xsl:template match="//*[not(*)]">
        <!--look up the node tree and write out:
            - a slash
            - the name of the element
            - and a predicate filter for the position of the element at each step -->
        <xsl:for-each select="ancestor-or-self::*">
          <xsl:text>/</xsl:text>
          <xsl:value-of select="local-name()"/>
          <!--add a predicate filter to specify the position, in case there
              is more than one element with that name at that step -->
          <xsl:text>[</xsl:text>
          <xsl:value-of select="count(preceding-sibling::*[name()=name(current())])+1" />
          <xsl:text>]</xsl:text>
        </xsl:for-each>
        <!--Create a new line after every element -->
        <xsl:text>&#xA;</xsl:text>
      </xsl:template>

      <!--override default template to prevent extra whitespace and carriage
          return from being copied into the output-->
      <xsl:template match="text()" />

    </xsl:stylesheet>

I added predicate filters to indicate the position of each element. That way, if there were more than one piece or itemc at the same level, the XPath would still identify the correct one.

So, instead of:

    items/item1/piece
    items/item1/itemc
    items/item2/piece
    items/item2/itemc

it produces:

    /items[1]/item1[1]/piece[1]
    /items[1]/item1[1]/itemc[1]
    /items[1]/item2[1]/piece[1]
    /items[1]/item2[1]/itemc[1]
+2

The following code finds all the leaf elements in the document and, for each one, prints an XPath expression that unambiguously navigates to the element from the document root. It includes a positional predicate at each node step to disambiguate between elements with the same name:

    using System;
    using System.Linq;
    using System.Xml.Linq;
    using System.Xml.XPath;

    class Program
    {
        static void Main(string[] arguments)
        {
            XDocument d = XDocument.Load("xmlfile1.xml");
            foreach (XElement e in d.XPathSelectElements("//*[not(*)]"))
            {
                // One "name[position]" step per ancestor, root first.
                Console.WriteLine("/" + string.Join("/",
                    e.XPathSelectElements("ancestor-or-self::*")
                        .Select(x => x.Name.LocalName + "[" +
                            (x.ElementsBeforeSelf()
                                .Where(y => y.Name.LocalName == x.Name.LocalName)
                                .Count() + 1) + "]")
                        .ToArray()));
            }
            Console.ReadKey();
        }
    }

For example, this input:

    <foo>
      <bar>
        <fizz/>
        <baz>
          <bat/>
        </baz>
        <fizz/>
      </bar>
      <buzz></buzz>
    </foo>

produces this output:

    /foo[1]/bar[1]/fizz[1]
    /foo[1]/bar[1]/baz[1]/bat[1]
    /foo[1]/bar[1]/fizz[2]
    /foo[1]/buzz[1]
+1

This is untested and probably needs some work to compile, but do you want something like this?

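    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Xml;

    class Program
    {
        static void Main()
        {
            XmlDocument xml = new XmlDocument();
            xml.Load("test.xml");

            var toReturn = new List<string>();
            // Start at the root element; ChildNodes[0] may be the XML declaration.
            GetPaths(string.Empty, xml.DocumentElement, toReturn);

            foreach (string path in toReturn)
            {
                Console.WriteLine(path);
            }
        }

        public static void GetPaths(string pathSoFar, XmlNode node, List<string> results)
        {
            string scopedPath = pathSoFar + node.Name;

            // Recurse into child elements only; text nodes are not part of the path.
            List<XmlElement> children = node.ChildNodes.OfType<XmlElement>().ToList();
            if (children.Count > 0)
            {
                foreach (XmlElement itemNode in children)
                {
                    GetPaths(scopedPath + "/", itemNode, results);
                }
            }
            else
            {
                results.Add(scopedPath);
            }
        }
    }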

For large XML documents, though, this may not be very efficient.
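
If memory is the concern, one possible alternative (not part of the answer above, just a sketch) is to stream the input with `XmlReader` instead of loading the whole document into an `XmlDocument`. The names `StreamingLeafPaths` and `Read` are invented for this example, and it assumes that only element names matter (no namespaces, no positional predicates).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class StreamingLeafPaths
{
    // Hypothetical helper: yields the path of each element that has no child
    // elements while reading forward through the input once.
    public static IEnumerable<string> Read(TextReader input)
    {
        var names = new List<string>();   // names of the currently open elements
        var hasChild = new List<bool>();  // whether each open element has had a child element

        using (XmlReader reader = XmlReader.Create(input))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    // The enclosing element, if any, now has a child element.
                    if (hasChild.Count > 0) hasChild[hasChild.Count - 1] = true;

                    names.Add(reader.Name);
                    hasChild.Add(false);

                    if (reader.IsEmptyElement)
                    {
                        // <x/> produces no EndElement node, so close it here.
                        yield return string.Join("/", names);
                        names.RemoveAt(names.Count - 1);
                        hasChild.RemoveAt(hasChild.Count - 1);
                    }
                }
                else if (reader.NodeType == XmlNodeType.EndElement)
                {
                    if (!hasChild[hasChild.Count - 1])
                    {
                        yield return string.Join("/", names);
                    }
                    names.RemoveAt(names.Count - 1);
                    hasChild.RemoveAt(hasChild.Count - 1);
                }
            }
        }
    }
}
```

It could be used as `foreach (string path in StreamingLeafPaths.Read(File.OpenText("test.xml"))) Console.WriteLine(path);`. Only the stack of open element names is kept in memory, at the cost of not being able to run arbitrary XPath queries.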

0

This may not be the fastest solution, but it allows arbitrary XPath expressions to be used as the selector, and I think it also expresses the intent of the code most clearly.

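    using System;
    using System.Collections;
    using System.Xml.Linq;
    using System.Xml.XPath;

    class Program
    {
        static void Main(string[] args)
        {
            // XPathEvaluate is an extension method on XNode, so load an XDocument.
            XDocument xml = XDocument.Load("test.xml");

            IEnumerable innerItems = (IEnumerable)xml.XPathEvaluate("//*[not(*)]");
            foreach (XElement innerItem in innerItems)
            {
                Console.WriteLine(GetPath(innerItem));
            }
        }

        // Builds the path recursively from the root element down to e.
        public static string GetPath(XElement e)
        {
            if (e.Parent == null)
            {
                return "/" + e.Name;
            }
            else
            {
                return GetPath(e.Parent) + "/" + e.Name;
            }
        }
    }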
0