Get ReadyState Using a WebBrowser Control Without DoEvents

This has been answered many times here and on other sites, and those answers work, but I would like ideas for other ways to:

get ReadyState = Complete after a navigation or a form post, without using DoEvents, because of all its drawbacks.

I would also note that using the DocumentCompleted event would not help here, since I will not be navigating to just one page, but to one page after another, like this:

    wb.Navigate("www.microsoft.com");
    // don't use a DoEvents loop here
    wb.Document.Body.SetAttribute(textbox1, "login");
    // don't use a DoEvents loop here
    if (wb.DocumentText.Contains("text"))
    {
        // do something
    }

This is how it works today, using DoEvents. I would like to know if anyone has a proper way to wait for the browser's asynchronous methods to complete, and only then continue with the rest of the logic. Just for the sake of it.

Thanks in advance.

+2
c# webbrowser-control readystate
Jan 15 '14 at 14:40
3 answers

Below is basic WinForms application code illustrating how to wait for the DocumentCompleted event asynchronously, using async/await. It navigates to several pages, one after another. Everything takes place on the main UI thread.

Instead of calling this.webBrowser.Navigate(url), the code could simulate a click on a form button to trigger a POST-style navigation.

The webBrowser.IsBusy async loop is optional. Its purpose is to account (non-deterministically) for the page's dynamic AJAX code, which may run after the window.onload event.
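The core idea described above, turning a one-shot event into something that can be awaited, can be sketched in isolation without WinForms. The `FakeLoader` class, its `Completed` event, and `WaitForCompletedAsync` are hypothetical names invented for this illustration; they only stand in for the WebBrowser control and its DocumentCompleted event.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a component that raises a completion event
// (for WebBrowser, this role is played by DocumentCompleted).
class FakeLoader
{
    public event EventHandler Completed;

    // Starts "loading" and raises Completed after the given delay.
    public void Start(int delayMs)
    {
        Task.Delay(delayMs).ContinueWith(_ => Completed?.Invoke(this, EventArgs.Empty));
    }
}

static class EventAwaiter
{
    // Subscribes to the event, starts the operation, and completes when the
    // event fires or the token is cancelled, whichever comes first.
    public static async Task WaitForCompletedAsync(
        FakeLoader loader, Action start, CancellationToken ct)
    {
        var tcs = new TaskCompletionSource<bool>();
        EventHandler handler = (s, e) => tcs.TrySetResult(true);

        loader.Completed += handler;
        try
        {
            using (ct.Register(() => tcs.TrySetCanceled()))
            {
                start();        // subscribe first, then start, so the event is not missed
                await tcs.Task; // throws an OperationCanceledException if cancelled
            }
        }
        finally
        {
            loader.Completed -= handler;
        }
    }
}

static class Program
{
    static async Task Main()
    {
        var loader = new FakeLoader();
        using (var cts = new CancellationTokenSource(5000))
        {
            await EventAwaiter.WaitForCompletedAsync(loader, () => loader.Start(100), cts.Token);
            Console.WriteLine("completed");
        }
    }
}
```

The full WinForms listing that follows applies the same pattern to DocumentCompleted and to the DOM onload event.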

    using System;
    using System.Diagnostics;
    using System.Threading;
    using System.Threading.Tasks;
    using System.Windows.Forms;

    namespace WebBrowserApp
    {
        public partial class MainForm : Form
        {
            WebBrowser webBrowser;

            public MainForm()
            {
                InitializeComponent();

                // create a WebBrowser
                this.webBrowser = new WebBrowser();
                this.webBrowser.Dock = DockStyle.Fill;
                this.Controls.Add(this.webBrowser);

                this.Load += MainForm_Load;
            }

            // Form Load event handler
            async void MainForm_Load(object sender, EventArgs e)
            {
                // cancel the whole operation in 30 sec
                var cts = new CancellationTokenSource(30000);

                var urls = new String[] {
                    "http://www.example.com",
                    "http://www.gnu.org",
                    "http://www.debian.org" };

                await NavigateInLoopAsync(urls, cts.Token);
            }

            // navigate to each URL in a loop
            async Task NavigateInLoopAsync(string[] urls, CancellationToken ct)
            {
                foreach (var url in urls)
                {
                    ct.ThrowIfCancellationRequested();
                    var html = await NavigateAsync(ct, () => this.webBrowser.Navigate(url));
                    Debug.Print("url: {0}, html: \n{1}", url, html);
                }
            }

            // asynchronous navigation
            async Task<string> NavigateAsync(CancellationToken ct, Action startNavigation)
            {
                var onloadTcs = new TaskCompletionSource<bool>();
                EventHandler onloadEventHandler = null;

                WebBrowserDocumentCompletedEventHandler documentCompletedHandler = delegate
                {
                    // DocumentCompleted may be called several times for the same page,
                    // if the page has frames
                    if (onloadEventHandler != null)
                        return;

                    // so, observe DOM onload event to make sure the document is fully loaded
                    onloadEventHandler = (s, e) => onloadTcs.TrySetResult(true);
                    this.webBrowser.Document.Window.AttachEventHandler("onload", onloadEventHandler);
                };

                this.webBrowser.DocumentCompleted += documentCompletedHandler;
                try
                {
                    using (ct.Register(() => onloadTcs.TrySetCanceled(), useSynchronizationContext: true))
                    {
                        startNavigation();
                        // wait for DOM onload event, throw if cancelled
                        await onloadTcs.Task;
                    }
                }
                finally
                {
                    this.webBrowser.DocumentCompleted -= documentCompletedHandler;
                    if (onloadEventHandler != null)
                        this.webBrowser.Document.Window.DetachEventHandler("onload", onloadEventHandler);
                }

                // the page has fully loaded by now

                // optional: let the page run its dynamic AJAX code,
                // we might add another timeout for this loop
                do { await Task.Delay(500, ct); }
                while (this.webBrowser.IsBusy);

                // return the page HTML content
                return this.webBrowser.Document.GetElementsByTagName("html")[0].OuterHtml;
            }
        }
    }
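The comment in the listing notes that another timeout might be added to the IsBusy loop. One possible way to do that is sketched below, as an assumption rather than part of the original answer. `WaitWhileBusyAsync` is a hypothetical helper name, and the busy check is passed in as a delegate so the sketch runs without a WebBrowser.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

static class Polling
{
    // Polls isBusy every `interval` until it returns false.
    // Returns true if the busy condition cleared, false if `timeout` elapsed first.
    // Cancellation of `ct` surfaces as an OperationCanceledException from Task.Delay.
    public static async Task<bool> WaitWhileBusyAsync(
        Func<bool> isBusy, TimeSpan interval, TimeSpan timeout, CancellationToken ct)
    {
        var stopwatch = Stopwatch.StartNew();
        do
        {
            await Task.Delay(interval, ct);
            if (!isBusy())
                return true;
        }
        while (stopwatch.Elapsed < timeout);

        return false;
    }
}

static class Program
{
    static async Task Main()
    {
        int polls = 0;
        // Hypothetical busy flag: reports "busy" for the first two polls only.
        bool idle = await Polling.WaitWhileBusyAsync(
            () => ++polls < 3,
            TimeSpan.FromMilliseconds(50),
            TimeSpan.FromSeconds(2),
            CancellationToken.None);
        Console.WriteLine(idle);
    }
}
```

Inside NavigateAsync, the loop could then be replaced with something like `await Polling.WaitWhileBusyAsync(() => this.webBrowser.IsBusy, TimeSpan.FromMilliseconds(500), TimeSpan.FromSeconds(10), ct)`. The 10-second limit is an arbitrary choice, and this assumes the call is made on the UI thread so that each poll resumes there.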

If you want to do something similar from a console application, here is an example of that.

+2
Jan 16 '14 at 3:45

The solution is simple:

    // MAKE SURE ReadyState = Complete
    while (WebBrowser1.ReadyState.ToString() != "Complete")
    {
        Application.DoEvents();
    }

    // Move on to your subsequent code...

It's quick and dirty. I'm a VBA guy, and this logic has worked there forever. I spent days searching and found nothing like it for C#, but I just figured it out myself.

The following is my complete function; the goal is to extract a segment of information from a web page:

    private int maxReloadAttempt = 3;
    private int currentAttempt = 1;

    private string GetCarrier(string webAddress)
    {
        WebBrowser WebBrowser_4MobileCarrier = new WebBrowser();
        string innerHtml;
        string strStartSearchFor = "subtitle block pull-left\">";
        string strEndSearchFor = "<";

        try
        {
            WebBrowser_4MobileCarrier.ScriptErrorsSuppressed = true;
            WebBrowser_4MobileCarrier.Navigate(webAddress);

            // MAKE SURE ReadyState = Complete
            while (WebBrowser_4MobileCarrier.ReadyState.ToString() != "Complete")
            {
                Application.DoEvents();
            }

            // LOAD HTML
            innerHtml = WebBrowser_4MobileCarrier.Document.Body.InnerHtml;

            // ATTEMPT (x3) TO EXTRACT CARRIER STRING
            while (currentAttempt <= maxReloadAttempt)
            {
                if (innerHtml.IndexOf(strStartSearchFor) >= 0)
                {
                    currentAttempt = 1; // Reset attempt counter
                    return Sub_String(innerHtml, strStartSearchFor, strEndSearchFor, "0"); // Method: "Sub_String" is my custom function
                }
                else
                {
                    currentAttempt += 1; // Increment attempt counter
                    return GetCarrier(webAddress); // Recursive method call; return the result of the retry
                } // End if
            } // End while
        } // End Try
        catch //(Exception ex)
        {
        }

        currentAttempt = 1; // Reset attempt counter so that later calls start fresh
        return "Unavailable";
    }
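The custom Sub_String helper is not shown in this answer. As a rough sketch of what such an extraction step might look like, the code below pulls out the text between two markers and retries in a loop instead of recursing. `ExtractBetween`, `ExtractWithRetry`, and the `loadHtml` delegate are hypothetical names for illustration only, not the author's implementation.

```csharp
using System;

static class CarrierParsing
{
    // Returns the text between the first occurrence of `start` and the next
    // occurrence of `end` after it, or null when either marker is missing.
    public static string ExtractBetween(string source, string start, string end)
    {
        if (string.IsNullOrEmpty(source)) return null;

        int from = source.IndexOf(start, StringComparison.Ordinal);
        if (from < 0) return null;
        from += start.Length;

        int to = source.IndexOf(end, from, StringComparison.Ordinal);
        if (to < 0) return null;

        return source.Substring(from, to - from);
    }

    // Calls loadHtml up to maxAttempts times and returns the first successful
    // extraction, or `fallback` when every attempt fails.
    public static string ExtractWithRetry(
        Func<string> loadHtml, string start, string end, int maxAttempts, string fallback)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            string value = ExtractBetween(loadHtml(), start, end);
            if (value != null) return value;
        }
        return fallback;
    }
}

static class Program
{
    static void Main()
    {
        // Made-up sample markup using the same start marker as the function above.
        string html = "<div class=\"subtitle block pull-left\">ExampleCarrier</div>";
        Console.WriteLine(
            CarrierParsing.ExtractBetween(html, "subtitle block pull-left\">", "<"));
    }
}
```

Keeping the attempt counter local to the loop avoids the shared currentAttempt field, so repeated calls cannot interfere with each other.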
+1
Jul 23 '14 at 13:38

Here is a "quick and dirty" solution. It is not 100% foolproof, but it does not block the UI thread, and it should be good enough for prototyping WebBrowser control automation procedures:

    private async void testButton_Click(object sender, EventArgs e)
    {
        await Task.Factory.StartNew(
            () =>
            {
                stepTheWeb(() => wb.Navigate("www.yahoo.com"));
                stepTheWeb(() => wb.Navigate("www.microsoft.com"));
                stepTheWeb(() => wb.Navigate("asp.net"));
                stepTheWeb(() => wb.Document.InvokeScript("eval", new[] { "$('p').css('background-color','yellow')" }));
                bool testFlag = false;
                stepTheWeb(() => testFlag = wb.DocumentText.Contains("Get Started"));
                if (testFlag) { /* TODO */ }
                // ...
            }
        );
    }

    private void stepTheWeb(Action task)
    {
        this.Invoke(new Action(task));

        WebBrowserReadyState rs = WebBrowserReadyState.Interactive;
        while (rs != WebBrowserReadyState.Complete)
        {
            this.Invoke(new Action(() => rs = wb.ReadyState));
            System.Threading.Thread.Sleep(300);
        }
    }

Here is a slightly more general version of the testButton_Click method:

    private async void testButton_Click(object sender, EventArgs e)
    {
        var actions = new List<Action>()
        {
            () => wb.Navigate("www.yahoo.com"),
            () => wb.Navigate("www.microsoft.com"),
            () => wb.Navigate("asp.net"),
            () => wb.Document.InvokeScript("eval", new[] { "$('p').css('background-color','yellow')" }),
            () =>
            {
                bool testFlag = false;
                testFlag = wb.DocumentText.Contains("Get Started");
                if (testFlag) { /* TODO */ }
            }
            //...
        };

        await Task.Factory.StartNew(() => actions.ForEach((x) => stepTheWeb(x)));
    }

[Update]

I have adapted my "quick and dirty" sample by borrowing and slightly refactoring @Noseratio's NavigateAsync method from this thread. The new version automates and executes asynchronously, in the UI thread context, not only navigation operations but also JavaScript/AJAX calls: any "lambda" that implements a single automation step.

Code reviews and comments are very welcome, especially from @Noseratio. Together we will make this world a better place ;)

    public enum ActionTypeEnumeration
    {
        Navigation = 1,
        Javascript = 2,
        UIThreadDependent = 3,
        UNDEFINED = 99
    }

    public class ActionDescriptor
    {
        public Action Action { get; set; }
        public ActionTypeEnumeration ActionType { get; set; }
    }

    /// <summary>
    /// Executes a set of WebBrowser control Automation actions
    /// </summary>
    /// <remarks>
    /// Test form should have the following controls:
    ///    webBrowser1 - WebBrowser,
    ///    testbutton - Button,
    ///    testCheckBox - CheckBox,
    ///    totalHtmlLengthTextBox - TextBox
    /// </remarks>
    private async void testButton_Click(object sender, EventArgs e)
    {
        try
        {
            var cts = new CancellationTokenSource(60000);

            var actions = new List<ActionDescriptor>()
            {
                new ActionDescriptor() { Action = () => wb.Navigate("www.yahoo.com"), ActionType = ActionTypeEnumeration.Navigation },
                new ActionDescriptor() { Action = () => wb.Navigate("www.microsoft.com"), ActionType = ActionTypeEnumeration.Navigation },
                new ActionDescriptor() { Action = () => wb.Navigate("asp.net"), ActionType = ActionTypeEnumeration.Navigation },
                new ActionDescriptor() { Action = () => wb.Document.InvokeScript("eval", new[] { "$('p').css('background-color','yellow')" }), ActionType = ActionTypeEnumeration.Javascript },
                new ActionDescriptor() { Action = () => { testCheckBox.Checked = wb.DocumentText.Contains("Get Started"); }, ActionType = ActionTypeEnumeration.UIThreadDependent }
                //...
            };

            foreach (var action in actions)
            {
                string html = await ExecuteWebBrowserAutomationAction(cts.Token, action.Action, action.ActionType);

                // count HTML web page stats - just for fun
                int totalLength = 0;
                Int32.TryParse(totalHtmlLengthTextBox.Text, out totalLength);
                totalLength += !string.IsNullOrWhiteSpace(html) ? html.Length : 0;
                totalHtmlLengthTextBox.Text = totalLength.ToString();
            }
        }
        catch (Exception ex)
        {
            MessageBox.Show(ex.Message, "Error");
        }
    }

    // asynchronous WebBrowser control Automation
    async Task<string> ExecuteWebBrowserAutomationAction(
        CancellationToken ct,
        Action runWebBrowserAutomationAction,
        ActionTypeEnumeration actionType = ActionTypeEnumeration.UNDEFINED)
    {
        var onloadTcs = new TaskCompletionSource<bool>();
        EventHandler onloadEventHandler = null;

        WebBrowserDocumentCompletedEventHandler documentCompletedHandler = delegate
        {
            // DocumentCompleted may be called several times for the same page,
            // if the page has frames
            if (onloadEventHandler != null)
                return;

            // so, observe DOM onload event to make sure the document is fully loaded
            onloadEventHandler = (s, e) => onloadTcs.TrySetResult(true);
            this.wb.Document.Window.AttachEventHandler("onload", onloadEventHandler);
        };

        this.wb.DocumentCompleted += documentCompletedHandler;
        try
        {
            using (ct.Register(() => onloadTcs.TrySetCanceled(), useSynchronizationContext: true))
            {
                runWebBrowserAutomationAction();

                if (actionType == ActionTypeEnumeration.Navigation)
                {
                    // wait for DOM onload event, throw if cancelled
                    await onloadTcs.Task;
                }
            }
        }
        finally
        {
            this.wb.DocumentCompleted -= documentCompletedHandler;
            if (onloadEventHandler != null)
                this.wb.Document.Window.DetachEventHandler("onload", onloadEventHandler);
        }

        // the page has fully loaded by now

        // optional: let the page run its dynamic AJAX code,
        // we might add another timeout for this loop
        do { await Task.Delay(500, ct); }
        while (this.wb.IsBusy);

        // return the page HTML content
        return this.wb.Document.GetElementsByTagName("html")[0].OuterHtml;
    }
0
Jan 16 '14 at 0:07


