HtmlDocument.Write strips quotes

For some reason, when I write to an HtmlDocument, it strips some (not all) of the quotes from the string that I give it.

Take a look:

 HtmlDocument htmlDoc = Webbrowser1.Document.OpenNew(true);
 htmlDoc.Write("<HTML><BODY><DIV ID=\"TEST\"></DIV></BODY></HTML>");
 string temp = htmlDoc.GetElementsByTagName("HTML")[0].InnerHtml;

The result of temp is the following:

 <HEAD></HEAD> <BODY> <DIV id=TEST></DIV></BODY> 

It works exactly as it should, except that it strips out the quotes. Does anyone have a solution for preventing or fixing this?

2 answers

There is no guarantee that innerHTML will return content identical to the string you passed in. The browser generates innerHTML from its own tree representation of the HTML, so it builds the resulting string however it sees fit.

Thus, depending on your needs, you can either use HTML parsing code that accepts attribute values without quotes around them, or try to convince the browser to use its latest engine, which is more likely to give you innerHTML in the form you expect.

That is, in your case it looks like at least IE9 renders your HTML in "IE9: Quirks" mode (which returns innerHTML in a form that doesn't suit you). If you make the HTML valid or force "IE9: Standards" mode, you will get a string with quotes from a call like

 document.getElementsByTagName("html")[0].innerHTML 

IE9: Standards -

 "<head></head><body><div id="TEST"></div></body>"

IE9: Quirks -

 "<HEAD></HEAD> <BODY> <DIV id=TEST></DIV></BODY>" 

You can try it yourself by creating a sample HTML file and opening it from disk. Press F12 to open the developer tools, then switch the document mode from the menu bar.
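If you go with the first option (living with the Quirks-mode output), one rough way to cope is to put the quotes back yourself after reading InnerHtml. The sketch below is only an illustration: the class name QuirksHtml and the method QuoteAttributes are made up for this example, and the regular expressions are deliberately naive. They assume no attribute value contains ">" and no quoted value contains something that looks like " name=value". For anything beyond simple markup a real HTML parser is the safer choice.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of WinForms or the .NET class library.
static class QuirksHtml
{
    // An attribute whose value has no surrounding quotes, e.g. " id=TEST".
    private static readonly Regex UnquotedAttr =
        new Regex(@"(\s[-\w:]+)=([^\s""'>]+)");

    // One tag at a time, so text between tags is left untouched.
    private static readonly Regex Tag = new Regex(@"<[^<>]+>");

    // Wraps every unquoted attribute value inside a tag in double quotes.
    public static string QuoteAttributes(string html)
    {
        return Tag.Replace(html, m => UnquotedAttr.Replace(m.Value, "$1=\"$2\""));
    }
}

static class Program
{
    static void Main()
    {
        // The string the question got back from InnerHtml.
        string temp = "<HEAD></HEAD> <BODY> <DIV id=TEST></DIV></BODY>";

        // prints: <HEAD></HEAD> <BODY> <DIV id="TEST"></DIV></BODY>
        Console.WriteLine(QuirksHtml.QuoteAttributes(temp));
    }
}
```

Values that are already quoted are left alone, because the second group cannot start with a quote character.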


C# has a handy feature, although I'm not sure what it is called. Sorry, I don't know the VB equivalent.

Put @ in front of the string literal so that backslash escapes are not needed. Inside such a string, a double quote is written as two double quotes.

 htmlDoc.Write(@"<HTML><BODY><DIV ID=""TEST""></DIV></BODY></HTML>");
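To check what the @ form actually changes, here is a small sketch comparing it with the backslash-escaped literal from the question. The class name Literals is just for illustration. Both literals compile to the same string, so as far as I can tell the choice of literal style only affects how the source code reads, not what HtmlDocument.Write receives.

```csharp
using System;

// Hypothetical holder class for the two literal styles.
static class Literals
{
    // Regular literal: quotes escaped with a backslash.
    public const string Regular = "<HTML><BODY><DIV ID=\"TEST\"></DIV></BODY></HTML>";

    // @ literal: quotes escaped by doubling them.
    public const string Verbatim = @"<HTML><BODY><DIV ID=""TEST""></DIV></BODY></HTML>";
}

static class Program
{
    static void Main()
    {
        // prints: True
        Console.WriteLine(Literals.Regular == Literals.Verbatim);

        // prints: <HTML><BODY><DIV ID="TEST"></DIV></BODY></HTML>
        Console.WriteLine(Literals.Verbatim);
    }
}
```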

Also, this is not important, but your HTML will not validate as XHTML, where all tag and attribute names must be lowercase. For example, <HTML> should be <html>.
