The HTML Agility Pack site has examples of this kind of thing, and the library uses XPath for queries. For example (from the home page), finding all links is simple:
foreach (HtmlNode link in doc.DocumentElement.SelectNodes("//a[@href]")) {
EDIT
As of 6/19/2012, the code above, as well as the only code example shown on the HTML Agility Pack examples page, will not work. Only a little tweaking is required, as shown below.
HtmlDocument doc = new HtmlDocument();
doc.Load("file.htm");
foreach (HtmlNode link in doc.DocumentNode.SelectNodes("//a[@href]"))
{
    HtmlAttribute att = link.Attributes["href"];
    att.Value = Foo(att); // Foo is a placeholder for your own link-fixing method
}
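One thing to watch for with the loop above: as far as I know, SelectNodes returns null rather than an empty collection when nothing matches, so a page with no links would throw in the foreach. Here is a minimal sketch that guards against that and works on an HTML string instead of a file. The LinkRewriter class, the RewriteLinks method, and the fix delegate are names I made up for illustration; they are not part of the library.

```csharp
using System;
using System.IO;
using HtmlAgilityPack; // NuGet package "HtmlAgilityPack"

public static class LinkRewriter
{
    // Hypothetical helper: applies `fix` to every href value in `html`
    // and returns the resulting markup.
    public static string RewriteLinks(string html, Func<string, string> fix)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parse from a string rather than a file

        // SelectNodes is expected to return null when there is no match,
        // so check before iterating.
        HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//a[@href]");
        if (links != null)
        {
            foreach (HtmlNode link in links)
            {
                HtmlAttribute att = link.Attributes["href"];
                att.Value = fix(att.Value);
            }
        }

        // Serialize the modified document back to a string.
        using (var writer = new StringWriter())
        {
            doc.Save(writer);
            return writer.ToString();
        }
    }
}
```

You could call it as LinkRewriter.RewriteLinks(html, href => "https://example.com" + href) to prefix relative links; the example.com prefix is only a stand-in for whatever your own rewrite rule is.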