Get HTML source code from CefSharp web browser

I am using a CefSharp.Wpf.ChromiumWebBrowser (version 47.0.3.0) to load a web page. After the page has loaded, I want to get its source code.

I called:

wb.GetBrowser().MainFrame.GetSourceAsync() 

However, it does not seem to return the entire source code (I believe this is because the page has child frames).

If I call:

 wb.GetBrowser().MainFrame.ViewSource() 

I see the complete source code (including the child frames).

I would like to get the same result as ViewSource(). Can someone point me in the right direction?

Update - added sample code

Note: the address the web browser points to will only work until 03/10/2016. After that, it may display different data, which is not what I am looking at.

In the frmSelection.xaml file

 <cefSharp:ChromiumWebBrowser Name="wb" Grid.Column="1" Grid.Row="0" /> 

In the frmSelection.xaml.cs file

    public partial class frmSelection : UserControl
    {
        private System.Windows.Threading.DispatcherTimer wbTimer = new System.Windows.Threading.DispatcherTimer();

        public frmSelection()
        {
            InitializeComponent();

            // This timer will start when a web page has been loaded.
            // It will wait 4 seconds and then call wbTimer_Tick which
            // will then see if data can be extracted from the web page.
            wbTimer.Interval = new TimeSpan(0, 0, 4);
            wbTimer.Tick += new EventHandler(wbTimer_Tick);

            wb.Address = "http://www.racingpost.com/horses2/cards/card.sd?race_id=644222&r_date=2016-03-10#raceTabs=sc_";
            wb.FrameLoadEnd += new EventHandler<CefSharp.FrameLoadEndEventArgs>(wb_FrameLoadEnd);
        }

        void wb_FrameLoadEnd(object sender, CefSharp.FrameLoadEndEventArgs e)
        {
            if (wbTimer.IsEnabled)
                wbTimer.Stop();

            wbTimer.Start();
        }

        void wbTimer_Tick(object sender, EventArgs e)
        {
            wbTimer.Stop();
            string html = GetHTMLFromWebBrowser();
        }

        private string GetHTMLFromWebBrowser()
        {
            // call the ViewSource method which will open up notepad and display the html.
            // this is just so I can compare it to the html returned in GetSourceAsync()
            // This is displaying all the html code (including child frames)
            wb.GetBrowser().MainFrame.ViewSource();

            // Get the html source code from the main Frame.
            // This is displaying only code in the main frame and not any child frames of it.
            Task<String> taskHtml = wb.GetBrowser().MainFrame.GetSourceAsync();

            string response = taskHtml.Result;
            return response;
        }
    }
+7
c# wpf cefsharp
2 answers

I don't quite follow the DispatcherTimer approach. I would do it like this:

    public frmSelection()
    {
        InitializeComponent();

        wb.FrameLoadEnd += WebBrowserFrameLoadEnded;
        wb.Address = "http://www.racingpost.com/horses2/cards/card.sd?race_id=644222&r_date=2016-03-10#raceTabs=sc_";
    }

    private void WebBrowserFrameLoadEnded(object sender, FrameLoadEndEventArgs e)
    {
        if (e.Frame.IsMain)
        {
            wb.ViewSource();
            wb.GetSourceAsync().ContinueWith(taskHtml =>
            {
                var html = taskHtml.Result;
            });
        }
    }

I diffed the output of ViewSource against the text in the html variable, and they are identical, so I cannot reproduce your problem here.

That said, I noticed that the main frame finishes loading quite late, so you need to wait quite a while before Notepad appears with the source.
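
The same handler can also be written with async/await instead of ContinueWith. This is only a minimal sketch, assuming the same wb control as above. CefSharp raises FrameLoadEnd on the CEF UI thread rather than the WPF UI thread, hence the Dispatcher call. ShowHtml is a hypothetical placeholder for whatever you do with the result:

```csharp
using System;
using System.Windows.Controls;
using CefSharp;

public partial class frmSelection : UserControl
{
    public frmSelection()
    {
        InitializeComponent();

        wb.FrameLoadEnd += WebBrowserFrameLoadEnded;
        wb.Address = "http://www.racingpost.com/horses2/cards/card.sd?race_id=644222&r_date=2016-03-10#raceTabs=sc_";
    }

    private async void WebBrowserFrameLoadEnded(object sender, FrameLoadEndEventArgs e)
    {
        // Child frames raise this event too; only react to the main frame.
        if (!e.Frame.IsMain)
            return;

        // Await the source instead of blocking on Task.Result.
        string html = await wb.GetSourceAsync();

        // Hop back to the WPF UI thread before touching any controls.
        Dispatcher.Invoke(() => ShowHtml(html));
    }

    // Hypothetical helper: replace with your own processing.
    private void ShowHtml(string html)
    {
    }
}
```

Awaiting the task avoids the blocking taskHtml.Result call in the question, which can freeze the UI if it runs on the dispatcher thread.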

+11

I had the same problem when trying to click an element located in a child frame rather than in the main frame. Using the example in your answer, I wrote the following extension method:

    public static IFrame GetFrame(this ChromiumWebBrowser browser, string FrameName)
    {
        IFrame frame = null;

        var identifiers = browser.GetBrowser().GetFrameIdentifiers();

        foreach (var i in identifiers)
        {
            frame = browser.GetBrowser().GetFrame(i);
            if (frame.Name == FrameName)
                return frame;
        }

        return null;
    }

If your form has a using directive for the namespace that contains this method, you can do something like:

    var frame = browser.GetFrame("nameofframe");
    if (frame != null)
    {
        string HTML = await frame.GetSourceAsync();
    }

Of course, you need to make sure that the page loading is completed before using it, but I plan to use it a lot. Hope this helps!
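
If you want the source of every frame rather than one frame by name, the same calls can be looped over all frame identifiers. This is a rough sketch, not tested against every CefSharp version. The class name FrameSourceExtensions and the method name GetAllFrameSourcesAsync are made up for illustration:

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using CefSharp;
using CefSharp.Wpf;

public static class FrameSourceExtensions
{
    // Hypothetical helper: returns the HTML source of each frame, keyed by frame identifier.
    public static async Task<Dictionary<long, string>> GetAllFrameSourcesAsync(this ChromiumWebBrowser browser)
    {
        var sources = new Dictionary<long, string>();
        var cefBrowser = browser.GetBrowser();

        foreach (long id in cefBrowser.GetFrameIdentifiers())
        {
            IFrame frame = cefBrowser.GetFrame(id);

            // A frame can disappear between listing and lookup; skip it if so.
            if (frame == null)
                continue;

            sources[id] = await frame.GetSourceAsync();
        }

        return sources;
    }
}
```

The results are keyed by identifier rather than by name, because several frames on a page may share a name or have none. As with GetFrame, call it only once the page has finished loading.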

Jim

+1