I have used HtmlAgilityPack in the past to parse HTML in .NET, but I don't like the fact that it only offers a DOM-based model.
With large documents and/or deeply nested markup, you can run into a stack overflow or an out-of-memory exception. Besides, in the general case a DOM-based parsing model uses significantly more memory than a streaming approach, because a process that consumes HTML typically needs only a few elements at any one time, while the DOM keeps the whole tree in memory.
Does anyone know of a decent HTML parser for .NET that lets you parse HTML the way the XmlReader class parses XML, i.e. in a forward-only manner?
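
To make the question concrete, here is a minimal sketch of the forward-only pattern I mean, written against XmlReader. The class name `ForwardOnlyDemo`, the helper `ReadLinkTargets`, and the sample markup are made up for illustration. It only works because the sample is well-formed XHTML; XmlReader throws on typical real-world HTML, which is exactly why I am looking for an HTML equivalent.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class ForwardOnlyDemo
{
    // Hypothetical helper: streams through the input once and yields the
    // href of each <a> element as it is encountered. No tree is built, so
    // only the current node is held by the reader at any time.
    public static IEnumerable<string> ReadLinkTargets(TextReader input)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            // Read() advances to the next node; there is no way to go back.
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "a")
                {
                    string href = reader.GetAttribute("href");
                    if (href != null)
                        yield return href;
                }
            }
        }
    }

    static void Main()
    {
        // Assumed sample input: well-formed XHTML only.
        const string xhtml =
            "<html><body><p><a href=\"/one\">1</a></p><a href=\"/two\">2</a></body></html>";

        foreach (string href in ReadLinkTargets(new StringReader(xhtml)))
            Console.WriteLine(href);
    }
}
```

What I would like is the same kind of `Read()` loop, but over a reader that tolerates unclosed tags, unquoted attributes, and the other things real HTML contains.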