We are mainly tasked with emulating a browser to retrieve web pages, in order to automate tests on different web pages. Ideally this will be used in console applications that run in the background and generate reports.
We tried .NET with the WatiN library, but it is built on a marshalled IE, so it lacked many features that we had to hack in with calls to unmanaged code. At the end of the day IE is neither thread-safe nor process-safe, many of the features we needed could only be implemented by changing registry values, and the whole thing was terribly inflexible.
Other requirements:
- Proxy support.
- JavaScript support - we should be able to parse the actual DOM after any JavaScript has executed (and hopefully be able to improve performance for any AJAX calls)
- Ability to save the entire page content, including images from the cache of the loaded page, to a separate location
- Ability to clear cookies/cache, read the cookies/cache, etc.
- Ability to set headers and alter POST data for any browser call
- Process- and/or thread-safe would be ideal
- And for the love of drogs, an API that is not completely cryptic
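
To make the wish list above more concrete, here is a rough sketch of the kind of API shape we are hoping for. Every name in it (`BrowserConfig`, `HeadlessBrowser`, and all the methods) is hypothetical and made up for illustration; it is not taken from WatiN, QtWebKit, or any other real library.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class BrowserConfig:
    """Per-instance settings (hypothetical names, illustration only)."""
    proxy: Optional[str] = None  # e.g. "http://127.0.0.1:8080"
    headers: Dict[str, str] = field(default_factory=dict)  # extra request headers


class HeadlessBrowser(ABC):
    """Wish-list interface. Each instance should hold its own state so that
    several instances can run in separate threads or processes."""

    def __init__(self, config: Optional[BrowserConfig] = None) -> None:
        self.config = config or BrowserConfig()

    @abstractmethod
    def load(self, url: str, post_data: Optional[bytes] = None) -> None:
        """Fetch the page (optionally with POST data) and run its JavaScript."""

    @abstractmethod
    def dom_html(self) -> str:
        """Return the DOM, serialized after scripts have executed."""

    @abstractmethod
    def save_page(self, directory: str) -> None:
        """Write the page and its cached images to `directory`."""

    @abstractmethod
    def cookies(self) -> Dict[str, str]:
        """Return the cookies currently held by this instance."""

    @abstractmethod
    def clear_cookies_and_cache(self) -> None:
        """Drop all cookies and cached resources for this instance."""
```

Something along these lines, where proxy and headers are plain per-instance settings rather than registry values, is what we mean by "not completely cryptic".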
Acceptable languages are C++, C#, Python, or anything else that can make a simple little background application, is somewhat bearable, and does not have a completely "unconventional" syntax such as Ruby.
From some googling, WebKit looks like a candidate... would Qt's QtWebKit module cover these requirements?