Most of the answers I read regarding this topic point to either the System.Windows.Forms.WebBrowser class or the COM interface mshtml.HTMLDocument from the Microsoft HTML Object Library.
The WebBrowser class has not gotten me anywhere. The following code fails to retrieve the HTML as rendered by my web browser:
    using System.Windows.Forms;
    using mshtml; // COM reference: Microsoft HTML Object Library

    static class Program
    {
        [STAThread]
        public static void Main()
        {
            WebBrowser wb = new WebBrowser();
            wb.Navigate("https://www.google.com/#q=where+am+i");

            // Dump every element of the document once the page reports completion.
            wb.DocumentCompleted += delegate(object sender, WebBrowserDocumentCompletedEventArgs e)
            {
                mshtml.IHTMLDocument2 doc = (mshtml.IHTMLDocument2)wb.Document.DomDocument;
                foreach (IHTMLElement element in doc.all)
                {
                    System.Diagnostics.Debug.WriteLine(element.outerHTML);
                }
            };

            Form f = new Form();
            f.Controls.Add(wb);
            Application.Run(f);
        }
    }
The above is just an example. I'm not interested in a workaround for finding the name of the city where I am located. I just need to understand how to programmatically retrieve dynamically generated data.
(Call new System.Net.WebClient().DownloadString("https://www.google.com/#q=where+am+i"), save the resulting text somewhere, search it for the name of the city where you are currently located, and let me know if you can find it.)
But when I access https://www.google.com/#q=where+am+i from my web browser (e.g. Firefox), I see the name of my city written on the web page. In Firefox, if I right-click the city name and select "Inspect Element (Q)", I can clearly see the city name in the HTML code, which looks very different from the raw HTML returned by WebClient.
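One possible contributing factor, offered as an assumption rather than a confirmed diagnosis: everything after the `#` in that URL is a fragment, and HTTP clients do not send the fragment to the server. If so, WebClient effectively requests only the home page, and the search results are produced later by script running in the browser. A minimal sketch that shows which part of the URL is actually requested (the class and method names `FragmentDemo` and `RequestedPart` are made up for illustration):

```csharp
using System;

public static class FragmentDemo
{
    // Returns the part of the URL that is sent in an HTTP request:
    // scheme, host, path and query, without the "#..." fragment.
    public static string RequestedPart(string url)
    {
        return new Uri(url).GetLeftPart(UriPartial.Query);
    }

    public static void Main()
    {
        var uri = new Uri("https://www.google.com/#q=where+am+i");

        Console.WriteLine(uri.PathAndQuery);               // path and query the server sees
        Console.WriteLine(uri.Fragment);                   // client-side only part
        Console.WriteLine(RequestedPart(uri.AbsoluteUri)); // URL without the fragment
    }
}
```

This does not retrieve the rendered page by itself; it only illustrates why the raw download may never contain the query-specific content.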
After I got tired of playing around with System.Net.WebClient, I decided to give mshtml.HTMLDocument a shot, only to end up with the same useless raw HTML:
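    using mshtml; // COM reference: Microsoft HTML Object Library

    static class Program
    {
        public static void Main()
        {
            // Feed the raw downloaded HTML into an mshtml document and dump every element.
            mshtml.IHTMLDocument2 doc = (mshtml.IHTMLDocument2)new mshtml.HTMLDocument();
            doc.write(new System.Net.WebClient().DownloadString("https://www.google.com/#q=where+am+i"));
            foreach (IHTMLElement e in doc.all)
            {
                System.Diagnostics.Debug.WriteLine(e.outerHTML);
            }
        }
    }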
I suppose there must be an elegant way to get this kind of information. Right now, all I can think of is to add a WebBrowser control to a form, navigate to the URL in question, send the keys "CTRL+A", copy everything displayed on the page to the clipboard, and then try to parse it. That is a terrible solution.
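A somewhat less clumsy variant of that idea, sketched under assumptions: keep the WebBrowser control, but instead of sending keystrokes, wait a moment after DocumentCompleted so that page scripts can run, then read the live DOM through `wb.Document`. The 3-second delay and the class name `RenderedHtmlSketch` are arbitrary choices for illustration, and there is no guarantee that the control's Internet Explorer engine renders this particular page the way Firefox does.

```csharp
using System;
using System.Diagnostics;
using System.Windows.Forms;

public static class RenderedHtmlSketch
{
    [STAThread]
    public static void Main()
    {
        var wb = new WebBrowser { Dock = DockStyle.Fill, ScriptErrorsSuppressed = true };

        // Assumed delay: gives page scripts time to modify the DOM after the load event.
        var timer = new Timer { Interval = 3000 };
        timer.Tick += (s, e) =>
        {
            timer.Stop();
            if (wb.Document != null && wb.Document.Body != null)
            {
                // OuterHtml reflects the current DOM, not the originally downloaded text.
                Debug.WriteLine(wb.Document.Body.OuterHtml);
            }
        };

        // Subscribe before navigating so the event cannot be missed.
        wb.DocumentCompleted += (s, e) =>
        {
            // DocumentCompleted also fires for frames; react only to the top-level page.
            if (e.Url == wb.Url)
            {
                timer.Stop();
                timer.Start();
            }
        };

        wb.Navigate("https://www.google.com/#q=where+am+i");

        var f = new Form();
        f.Controls.Add(wb);
        Application.Run(f);
    }
}
```

A fixed delay is fragile: too short and the content is not there yet, too long and time is wasted. Polling the DOM until an expected element appears would be more robust, but which element to wait for depends on the page.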
javascript html c# webbrowser-control
J Smith, Jan 05 '14 at 5:23