C# JavaScript-enabled headless browser for a crawler

Can anyone suggest a headless browser for .NET that supports cookies and automatically executes JavaScript?

c# webclient headless-browser
2 answers

Selenium + HtmlUnitDriver / GhostDriver is exactly what you are looking for. Oversimplified, Selenium is a library for driving a variety of browsers for automation purposes: testing, scraping, task automation.

There are various WebDriver classes with which you can control an actual browser. HtmlUnitDriver is headless. GhostDriver is a WebDriver for PhantomJS, so you can write C# while PhantomJS does the heavy lifting.

The code snippet below is from the Selenium docs and uses Firefox, but the code for GhostDriver (PhantomJS) or HtmlUnitDriver is almost identical.

using System; // needed for TimeSpan
using OpenQA.Selenium;
using OpenQA.Selenium.Firefox;
using OpenQA.Selenium.Support.UI;

class GoogleSuggest
{
    static void Main(string[] args)
    {
        // driver initialization varies across different drivers
        // but they all support parameter-less constructors
        IWebDriver driver = new FirefoxDriver();
        driver.Navigate().GoToUrl("http://www.google.com/");

        // type a query into the search box and submit the form
        IWebElement query = driver.FindElement(By.Name("q"));
        query.SendKeys("Cheese");
        query.Submit();

        // wait up to 10 seconds for the results page title to appear
        WebDriverWait wait = new WebDriverWait(driver, TimeSpan.FromSeconds(10));
        wait.Until((d) => { return d.Title.ToLower().StartsWith("cheese"); });

        System.Console.WriteLine("Page title is: " + driver.Title);
        driver.Quit();
    }
}

If you run this on a Windows machine, you can use the actual Firefox or Chrome driver; it opens a real browser window that operates as programmed in your C# code. HtmlUnitDriver is the most lightweight and the fastest.

I have successfully run Selenium for C# (FirefoxDriver) on Linux using Mono. I believe HtmlUnitDriver will work just as well as the others, so if you need speed, I suggest Mono (you can still develop, test and compile with Visual Studio on Windows without problems) plus Selenium's HtmlUnitDriver running on a Linux host without a desktop.
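
Since the question asks specifically about cookies and JavaScript, here is a minimal sketch of how both could be inspected through the generic IWebDriver interface. The class name CrawlSketch and the helper Visit are made up for illustration, and FirefoxDriver is used only because it appears in the snippet above. The sketch assumes the driver you choose also implements IJavaScriptExecutor, which should be checked for your driver.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Firefox;

class CrawlSketch
{
    // Hypothetical helper: loads a page, then reports what the page's
    // own JavaScript sees and which cookies the session has collected.
    static void Visit(IWebDriver driver, string url)
    {
        driver.Navigate().GoToUrl(url);

        // Assumption: the driver implements IJavaScriptExecutor.
        // The cast throws InvalidCastException if it does not.
        var js = (IJavaScriptExecutor)driver;
        object title = js.ExecuteScript("return document.title;");
        Console.WriteLine("Title seen by JavaScript: " + title);

        // Cookies visible for the current page are exposed by the driver.
        foreach (Cookie cookie in driver.Manage().Cookies.AllCookies)
        {
            Console.WriteLine(cookie.Name + " = " + cookie.Value);
        }
    }

    static void Main(string[] args)
    {
        // Swap in whichever headless driver you settle on.
        IWebDriver driver = new FirefoxDriver();
        try
        {
            Visit(driver, "http://www.google.com/");
        }
        finally
        {
            // Always shut the browser down, even if the visit fails.
            driver.Quit();
        }
    }
}
```

Because Visit depends only on IWebDriver, the crawler logic should not need to change when you move from a windowed browser to a headless one.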


I am not aware of a headless browser written for .NET, but there is always PhantomJS, which is written in C/C++ and works fairly well for unit testing JavaScript with QUnit.

There is also a related question here that might help you: Headless Browser for C# (.NET)?
