Building a Tree of HtmlElement Objects

Question

I am using the MSIE WebBrowser control in a C# WinForms application and am looking for a way to create and maintain a tree of HtmlElement objects outside this control. I want to switch quickly between several complex pages without the overhead of re-parsing the HTML every time (and I don't want to maintain multiple controls that are shown and hidden as needed). I have found that a) I can create HtmlElement objects through the HtmlDocument, and b) as soon as I remove a "branch" of HtmlElement objects from the HtmlDocument, it "dies", even though I keep a strong reference to the root element. How can I do this?

PS: I would be willing to consider alternative browser controls (such as Gecko) if they allow me to do the above.

+4

dom html browser webbrowser-control

Tony the pony Mar 10 '09 at 19:44

4 answers

You can use the MSHTML library (mshtml.dll) for this. Basically, you would use a single about:blank page and then dynamically write content to it and remove content from it.

See this blog post on this subject.

You can also write your own interface wrapper that exposes only the functionality you need from mshtml, rather than referencing the whole library (almost 8 MB). This is quite easy to do using F12 (Go To Definition) in Visual Studio.

+2

Fraser Mar 12 '09 at 23:19

Do you really need to remove them? How about leaving your "branch" in the DOM as a child of a DIV whose style is "display: none"? That way they remain real, live DOM objects but are not visible.
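A minimal sketch of this idea in C#, using the WinForms HtmlDocument and HtmlElement classes. The class name HiddenBranchStore and the id "hidden-store" are made up for illustration, and the code assumes the document has finished loading so that doc.Body is not null:

```csharp
using System.Windows.Forms;

// Hypothetical helper: keeps unused branches alive inside the live DOM
// by parking them under a div styled with "display: none".
static class HiddenBranchStore
{
    // "hidden-store" is an invented id for this sketch.
    private const string HolderId = "hidden-store";

    // Returns the hidden holder div, creating it on first use.
    public static HtmlElement GetHolder(HtmlDocument doc)
    {
        HtmlElement holder = doc.GetElementById(HolderId);
        if (holder == null)
        {
            holder = doc.CreateElement("div");
            holder.Id = HolderId;
            holder.Style = "display: none";
            doc.Body.AppendChild(holder);
        }
        return holder;
    }

    // Moves a branch under the hidden holder; it stays attached to the document.
    public static void Park(HtmlDocument doc, HtmlElement branch)
    {
        GetHolder(doc).AppendChild(branch);
    }

    // Moves a parked branch back under a visible parent element.
    public static void Show(HtmlElement visibleParent, HtmlElement branch)
    {
        visibleParent.AppendChild(branch);
    }
}
```

Because the branch never leaves the document, the HtmlElement references you hold should stay usable, at the cost of keeping every page's markup inside one document.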

+1

jlew Mar 10 '09 at 20:28

I think you could also use the HTML Agility Pack. It lets you parse the HTML once, query the tree using XPath or iterators, and write the tree back out with the Save method when you are done. Depending on your structure, you may need to create an adapter around its classes, because it works only on whole HTML documents and you want to work on individual elements, but this should not be too complicated.
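As a rough sketch of the parse-once idea, assuming the HtmlAgilityPack NuGet package is referenced. The ParsedPage class and its method names are hypothetical; only HtmlDocument, LoadHtml, SelectSingleNode, InnerHtml and Save come from the library. The id is assumed to be a simple token without quote characters, since it is pasted into the XPath expression:

```csharp
using System.IO;
using HtmlAgilityPack;

// Hypothetical wrapper: parses a page once and keeps the tree in memory.
public sealed class ParsedPage
{
    private readonly HtmlDocument _doc = new HtmlDocument();

    public ParsedPage(string html)
    {
        _doc.LoadHtml(html); // the only parse of the full page
    }

    // Replaces the inner HTML of the element with the given id.
    // Returns false when no such element exists.
    public bool SetInnerHtml(string id, string newInnerHtml)
    {
        HtmlNode node = _doc.DocumentNode.SelectSingleNode("//*[@id='" + id + "']");
        if (node == null)
        {
            return false;
        }
        node.InnerHtml = newInnerHtml;
        return true;
    }

    // Serializes the current tree back to an HTML string via Save.
    public string ToHtml()
    {
        using (var writer = new StringWriter())
        {
            _doc.Save(writer);
            return writer.ToString();
        }
    }
}
```

Note that the browser control would still have to parse the string returned by ToHtml when it is displayed, so this mainly helps when the expensive part is building or editing the markup rather than rendering it.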

0

weismat Mar 19 '09 at 5:16

TFD · Accepted Answer · 2009-03-13T00:32:53+0000

This will do it:

    // On-screen WebBrowser control
    webBrowserControl.Navigate("about:blank");
    webBrowserControl.Document.Write("<div id=\"div1\">This will change</div>");
    var elementToReplace = webBrowserControl.Document.GetElementById("div1");
    var nodeToReplace = elementToReplace.DomElement as mshtml.IHTMLDOMNode;

    // In-memory WebBrowser control to load the fragment into.
    // It needs this base object as it is a COM control.
    var webBrowserFragement = new WebBrowser();
    webBrowserFragement.Navigate("about:blank");
    webBrowserFragement.Document.Write("<div id=\"div1\">Hello World!</div>");
    var elementReplacement = webBrowserFragement.Document.GetElementById("div1");
    var nodeReplacement = elementReplacement.DomElement as mshtml.IHTMLDOMNode;

    // The magic happens here!
    nodeToReplace.replaceNode(nodeReplacement);

I doubt that this will improve performance, though, since text rendering is fast and the memory consumed will be much the same whether you have one large page with hidden divs or several divs held in memory in other objects.
