How do I properly use WebBrowser with multithreading?

Scope of the task:

I am writing an application to save HTML retrieved from Bing and Google. I know there are classes for executing web requests using a stream, such as in this example, but since Google and Bing use JavaScript and Ajax to render the results into the HTML, I cannot simply read the response stream to get the result I need.

The solution to this is to use the WebBrowser class and navigate to the URL that I want, so that the browser itself processes all of the JavaScript and Ajax scripts.

MultiThreading:

To make it more efficient, I have a single Windows Forms application that launches one thread per service (one for Bing and one for Google).

Problem:

Since I need a WebBrowser, I created one instance per thread (currently two). According to Microsoft, there is a known bug that prevents the DocumentCompleted event from firing if the WebBrowser is not visible and has not been added to a visible form as well (for more information, follow this link).

The real problem:

The main problem is that the DocumentCompleted event in the browser never fires. Never.

I wrote a proper DocumentCompleted event handler, but it never gets called. To wait for the browser event, I use an AutoResetEvent with a long timeout (5 minutes); if the event I need has not fired after 5 minutes, the WebBrowser thread gives up.

At the moment, the browser is created and added to a Windows Form, both of them are visible, and the event still does not fire.

The code:

    // Creating Browser Instance
    browser = new WebBrowser();

    // Setting up Custom Handler to "Document Completed" Event
    browser.DocumentCompleted += DocumentCompletedEvent;

    // Setting Up Random Form
    genericForm = new Form();
    genericForm.Width = 200;
    genericForm.Height = 200;
    genericForm.Controls.Add(browser);
    browser.Visible = true;

Regarding navigation, I have the following (a method of my browser class):

    public void NavigateTo(string url)
    {
        CompletedNavigation = false;
        if (browser.ReadyState == WebBrowserReadyState.Loading)
            return;

        genericForm.Show(); // Shows the form so that it is visible at the time the browser navigates
        browser.Navigate(url);
    }

And to call the navigation I have this:

    // Loading URL
    browser.NavigateTo(URL);

    // Waiting for Our Event To Fire
    if (_event.WaitOne(_timeout))
    {
        // Success
    }
    else
    {
        // Error / Timeout From the AutoResetEvent
    }

TL;DR:

My WebBrowser is created on a separate STA thread and added to a form. Both are visible and displayed when the browser navigation starts, but the browser's DocumentCompleted event never fires, so the AutoResetEvent always times out and I get no response from the browser.

Thanks in Advance and sorry for the long post.

+3
multithreading c# browser
May 23 '13 at 13:45
1 answer

Although this seems odd, here is my attempt.

    var tasks = new Task<string>[]
    {
        new MyDownloader().Download("http://www.stackoverflow.com"),
        new MyDownloader().Download("http://www.google.com")
    };

    Task.WaitAll(tasks);

    Console.WriteLine(tasks[0].Result);
    Console.WriteLine(tasks[1].Result);



    public class MyDownloader
    {
        WebBrowser _wb;
        TaskCompletionSource<string> _tcs;
        ApplicationContext _ctx;

        public Task<string> Download(string url)
        {
            _tcs = new TaskCompletionSource<string>();

            var t = new Thread(() =>
            {
                _wb = new WebBrowser();
                _wb.ScriptErrorsSuppressed = true;
                _wb.DocumentCompleted += _wb_DocumentCompleted;
                _wb.Navigate(url);

                _ctx = new ApplicationContext();
                Application.Run(_ctx);
            });
            t.SetApartmentState(ApartmentState.STA);
            t.Start();

            return _tcs.Task;
        }

        void _wb_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            //_tcs.TrySetResult(_wb.DocumentText);
            _tcs.TrySetResult(_wb.DocumentTitle);
            _ctx.ExitThread();
        }
    }
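
The key point in the code above is that each WebBrowser lives on its own STA thread that runs a message loop (Application.Run). WebBrowser raises DocumentCompleted through that message loop, so if the thread that created the browser blocks in WaitOne instead of pumping messages, the event cannot be delivered. Below is a minimal sketch of the same idea that returns the HTML and applies the 5-minute timeout from the question on the calling side. The names StaBrowser and GetHtmlAsync are hypothetical and only for illustration. It also assumes the first DocumentCompleted is good enough; pages with frames can raise the event more than once, so a real version may need a stricter check.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using System.Windows.Forms;

// Hypothetical helper for illustration; these names are not from the code above.
public static class StaBrowser
{
    public static Task<string> GetHtmlAsync(string url)
    {
        var tcs = new TaskCompletionSource<string>();

        var thread = new Thread(() =>
        {
            try
            {
                using (var wb = new WebBrowser())
                {
                    wb.ScriptErrorsSuppressed = true;
                    wb.DocumentCompleted += (sender, e) =>
                    {
                        // Assumption: the first completed document is the one we want.
                        tcs.TrySetResult(wb.DocumentText);
                        // Ends the message loop started by Application.Run below.
                        Application.ExitThread();
                    };
                    wb.Navigate(url);

                    // Pump messages on this thread so DocumentCompleted can be raised.
                    Application.Run();
                }
            }
            catch (Exception ex)
            {
                tcs.TrySetException(ex);
            }
        });

        thread.SetApartmentState(ApartmentState.STA); // WebBrowser requires an STA thread
        thread.IsBackground = true;                   // do not keep the process alive after a timeout
        thread.Start();

        return tcs.Task;
    }
}

public static class Program
{
    public static void Main()
    {
        Task<string> task = StaBrowser.GetHtmlAsync("http://www.google.com");

        // The wait happens on the caller's thread, not on the browser's thread.
        if (task.Wait(TimeSpan.FromMinutes(5)))
            Console.WriteLine(task.Result.Length);
        else
            Console.WriteLine("Timed out waiting for DocumentCompleted.");
    }
}
```

Because the wait is done on the calling thread, the browser thread stays free to process its messages. No visible form is used here, just as in the code above.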
+2
May 23 '13 at 14:11