JavaScript (and HTML rendering) without a GUI for automation?

Are there any libraries or frameworks that provide the functionality of a browser but do not need to physically display anything on screen?

I want to automate navigation on web pages (Mechanize does this, for example), but I need a full browser, including JavaScript. So I would like some kind of virtual browser that I can use to “click on links” programmatically, that has DOM elements and runs JS scripts, and that lets me manipulate those elements.

A Python solution is preferable, but I can manage with other languages.

+3
7 answers

PhantomJS and PyPhantomJS are what I use for such tasks.

What is it? It is a headless browser based on WebKit that is completely controllable through JavaScript. It has a C++ implementation (PhantomJS) and a Python implementation (PyPhantomJS). I prefer the Python one, though, since it has a plug-in system that lets you add functionality to the core without changing any code, unlike the C++ one. :)
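To make "controlled using JavaScript" concrete, here is a minimal sketch that drives the PhantomJS binary from Python. It assumes a `phantomjs` executable on the PATH and uses PhantomJS's own scripting calls (`require('webpage').create()`, `page.open`, `page.evaluate`, `phantom.exit`). The Python names `PHANTOM_SCRIPT`, `build_command` and `fetch_links` are made up for this illustration, and this is not the PyPhantomJS plug-in API.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# JavaScript executed by PhantomJS: open the URL given on the command line,
# let the page's own scripts run, then print the href of every link.
PHANTOM_SCRIPT = """
var page = require('webpage').create();
var url = require('system').args[1];
page.open(url, function (status) {
    if (status !== 'success') {
        console.log('FAILED');
        phantom.exit(1);
    } else {
        var links = page.evaluate(function () {
            return Array.prototype.map.call(
                document.querySelectorAll('a'),
                function (a) { return a.href; });
        });
        console.log(links.join('\\n'));
        phantom.exit(0);
    }
});
"""


def build_command(script_path, url, binary="phantomjs"):
    """Return the argv list that runs the script against the URL."""
    return [binary, str(script_path), url]


def fetch_links(url, binary="phantomjs"):
    """Return the links PhantomJS found on the page.

    Returns None if the binary is not installed (assumption: it is
    looked up on PATH) and raises RuntimeError if the page fails to load.
    """
    if shutil.which(binary) is None:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        script_path = Path(tmp) / "links.js"
        script_path.write_text(PHANTOM_SCRIPT)
        result = subprocess.run(
            build_command(script_path, url, binary),
            capture_output=True, text=True, timeout=60,
        )
    if result.returncode != 0:
        raise RuntimeError("PhantomJS could not load " + url)
    return [line for line in result.stdout.splitlines() if line]
```

Calling `fetch_links("http://example.com/")` would return the list of link targets as seen after the page's scripts have run. "Clicking" a link then amounts to opening one of those URLs, or to triggering the element's click handler inside `page.evaluate`.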

+3

There is currently an absolute ton of free software technology for this: take your pick at http://wiki.python.org/moin/WebBrowserProgramming, but if you have specific questions, join pyjamas-dev on Google Groups and I will be happy to provide you with more information. Short answer: you can run pywebkitgtk "headless", or you can use xulrunner (via python-hulahop), again using pygtk, without ever calling "browserwidget.show()"; the same goes for pykhtml. You can also use Python COM to connect to MSHTML.DLL.

These are all “cheat methods”: they use Python bindings to a web browser's engine without actually starting the graphical part. If you really wanted to do some serious programming, you could create a WebKit "port" that is not tied to any GUI toolkit: as an experienced WebKit programmer, I would estimate about 2 weeks of full-time work to produce such a headless version of WebKit.

L.

+2

Looks like http://watin.sourceforge.net/ might be a good option.

If you don't need pure Python, you could use IronPython, since WatiN is a C# project.

+1

Take a look at this little doozy on Ajaxian:

http://ajaxian.com/archives/server-side-rendering-with-yui-on-node-js

It also mentions Aptana Jaxer, which I think runs on a headless Firefox, so it is basically the Mozilla browser in all its glory.

+1

There is Kapow. It's pure Java and costs money:

http://kapowtech.com/

And there is Lixto: it's based on Eclipse and uses Mozilla Gecko as its rendering engine (unless they have already switched to WebKit, as they said they would many years ago). It's very nice and also costs money:

http://www.lixto.com/?page_id=50

Both are graphical tools in which you define the site navigation and what you want to extract by pointing and clicking. But you can also write XPath and regular expressions, and even JavaScript that runs in the context of the sites.
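For readers unfamiliar with the XPath-plus-regular-expression style of extraction, here is a rough sketch using only the Python standard library, not Kapow or Lixto. The page fragment, the `extract_items` helper and the price pattern are invented for illustration. `xml.etree.ElementTree` understands only a subset of XPath and needs well-formed markup, and nothing here executes JavaScript.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, well-formed fragment standing in for a downloaded page.
PAGE = """
<html><body>
  <div class="item"><a href="/p/1">Widget</a><span class="price">EUR 12.50</span></div>
  <div class="item"><a href="/p/2">Gadget</a><span class="price">EUR 7.00</span></div>
</body></html>
"""

# The regular expression pulls the numeric part out of the price text.
PRICE_RE = re.compile(r"(\d+\.\d{2})")


def extract_items(markup):
    """Select nodes with XPath-style paths, then clean values with a regex."""
    root = ET.fromstring(markup.strip())
    items = []
    for div in root.findall(".//div[@class='item']"):
        link = div.find("a")
        price_text = div.find("span[@class='price']").text
        match = PRICE_RE.search(price_text)
        items.append({
            "name": link.text,
            "href": link.get("href"),
            "price": float(match.group(1)) if match else None,
        })
    return items
```

`extract_items(PAGE)` yields one dictionary per `div` of class `item`. The path expressions decide where to look and the regular expression tidies up what was found, which is roughly the division of labour the graphical tools offer.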

I used them both in lectures on web data extraction and web data usage at the Vienna University of Technology (Lixto was written by the professor who gave the lecture).

+1

HtmlUnit in Java is very good. I think it is the only Java implementation of a headless browser that provides JavaScript support.

MaxQ, which I read about here, also looks like it might be interesting: "written in Java, generating Jython scripts"

+1

Try HtmlUnit!!!

0

Source: https://habr.com/ru/post/922955/
