C# HtmlAgilityPack: select table from a specific h2
I have this HTML:

    <h2>Results</h2>
    <div class="box">
      <table class="tFormat">
        <th>Head</th>
        <tr>1</tr>
      </table>
    </div>
    <h2>Grades</h2>
    <div class="box">
      <table class="tFormat">
        <th>Head</th>
        <tr>1</tr>
      </table>
    </div>

I was wondering how to get the table in the "Results" section.
I tried:
    var nodes = doc.DocumentNode.SelectNodes("//h2");
    foreach (var o in nodes)
    {
        if (o.InnerText.Equals("Results"))
        {
            foreach (var c in o.SelectNodes("//table"))
            {
                Console.WriteLine(c.InnerText);
            }
        }
    }

It works, but it also returns the table under the other h2.
2 answers
Note that the div is a sibling of the header, not a descendant of it, so it makes no sense to look for it inside the header.
This may work for you. It finds the next element after the header:
    if (o.InnerText.Equals("Results"))
    {
        var nextDiv = o.NextSibling;
        while (nextDiv != null && nextDiv.NodeType != HtmlNodeType.Element)
            nextDiv = nextDiv.NextSibling;
        // nextDiv should be correct here.
    }

You can also write a more specific XPath to find only this div:
    doc.DocumentNode.SelectNodes("//h2[text()='Results']/following-sibling::div[1]");
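
To go from that div to the table itself, the same XPath can be extended with `//table`. Here is a minimal sketch, assuming the HtmlAgilityPack NuGet package is referenced. The `HeadingTables` class, the `FindTableAfterHeading` helper and the sample markup (with `R1`/`G1` cells so the two tables can be told apart) are made up for illustration and are not from the question.

```csharp
using System;
using HtmlAgilityPack;

public static class HeadingTables
{
    // Hypothetical sample markup: same layout as in the question,
    // but each table has a distinct cell value.
    public const string SampleHtml = @"
<h2>Results</h2>
<div class='box'>
  <table class='tFormat'><tr><th>Head</th></tr><tr><td>R1</td></tr></table>
</div>
<h2>Grades</h2>
<div class='box'>
  <table class='tFormat'><tr><th>Head</th></tr><tr><td>G1</td></tr></table>
</div>";

    // Returns the first table inside the first <div> sibling that follows
    // the <h2> whose trimmed text equals `heading`, or null if nothing matches.
    // Assumption: `heading` contains no single quote, because it is
    // interpolated directly into the XPath string literal.
    public static HtmlNode FindTableAfterHeading(HtmlDocument doc, string heading)
    {
        string xpath =
            $"//h2[normalize-space()='{heading}']/following-sibling::div[1]//table";
        return doc.DocumentNode.SelectSingleNode(xpath);
    }

    public static void Main()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(SampleHtml);

        var table = FindTableAfterHeading(doc, "Results");
        if (table != null)
        {
            // Prints only the text of the table under "Results".
            Console.WriteLine(table.InnerText.Trim());
        }
    }
}
```

This also explains the behaviour in the question: an XPath that starts with `//` is evaluated from the document root even when it is called on the h2 node, so `o.SelectNodes("//table")` returns every table in the document. A path relative to the current node has to start with `.` (for example `.//table`), and that would find nothing here because the table is not inside the h2.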