How to start PhantomJS as a server and call it remotely?

This is probably a very simple question. I would like to launch the PhantomJS headless browser as a server rather than as a command-line tool.

Once it is running, I would like to call it remotely over HTTP. All I need is to send a URL and get the rendered HTML back. I need this to generate HTML for an AJAX application so that it becomes searchable.

Is it possible?

+7
javascript seo phantomjs
2 answers

You can run PhantomJS perfectly well as a web server, because it has a web server module. The examples folder contains, for instance, the server.js example. It runs standalone, without any dependencies (no Node.js required).

    // `port`, `someUrl` and `result` are placeholders to fill in (see the TODOs)
    var page = require('webpage').create(),
        server = require('webserver').create();

    var service = server.listen(port, function (request, response) {
        console.log('Request received at ' + new Date());
        // TODO: parse `request` and determine where to go
        page.open(someUrl, function (status) {
            if (status !== 'success') {
                console.log('Unable to open ' + someUrl);
                // Always answer, otherwise the client waits forever
                response.statusCode = 500;
                response.write('Unable to open page');
                response.close();
            } else {
                response.statusCode = 200;
                response.headers = {
                    'Cache': 'no-cache',
                    'Content-Type': 'text/plain;charset=utf-8'
                };
                // TODO: do something on the page and generate `result`
                response.write(result);
                response.close();
            }
        });
    });
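The first TODO in the listener above (working out which page to open) could be handled by reading a query parameter from the request. The following is only a sketch under assumptions: the function name `extractTargetUrl` and the `url` parameter are made up for illustration, and it assumes `request.url` holds the path plus query string, as in `/?url=http%3A%2F%2Fexample.com%2F`. It sticks to plain ES5 so that it should run inside PhantomJS as well as Node.

```javascript
// Hypothetical helper (not part of PhantomJS): pull the target URL out of a
// request path such as  /?url=http%3A%2F%2Fexample.com%2Fpage
// Returns null when no usable `url` parameter is present.
function extractTargetUrl(requestUrl) {
    var queryStart = requestUrl.indexOf('?');
    if (queryStart === -1) {
        return null;
    }
    var pairs = requestUrl.slice(queryStart + 1).split('&');
    for (var i = 0; i < pairs.length; i++) {
        var eq = pairs[i].indexOf('=');
        if (eq === -1 || pairs[i].slice(0, eq) !== 'url') {
            continue;
        }
        var value;
        try {
            value = decodeURIComponent(pairs[i].slice(eq + 1));
        } catch (e) {
            return null; // malformed percent-encoding
        }
        // Accept only http(s), so the server cannot be asked to open file:// URLs
        return /^https?:\/\//i.test(value) ? value : null;
    }
    return null;
}
```

Inside the listener you would then set `someUrl = extractTargetUrl(request.url)` and answer with a 400 status when it returns null. For the second TODO, PhantomJS exposes the rendered markup as `page.content`, which is probably what you want to write back as `result`.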

If you want to run PhantomJS through Node.js, that is also easy to do with phantomjs-node, a PhantomJS bridge for Node.

    var http = require('http');
    var phantom = require('phantom');

    phantom.create(function (ph) {
        ph.createPage(function (page) {
            http.createServer(function (req, res) {
                // TODO: parse `req` and determine where to go
                page.open(someURL, function (status) {
                    res.writeHead(200, {'Content-Type': 'text/plain'});
                    // TODO: do something on the page and generate `result`
                    res.end(result);
                });
            }).listen(8080);
        });
    });

Notes

You can use this as is, as long as you never have multiple requests at the same time. If you do, you need to either serialize the requests (because there is only one page object) or create a new page object for each request and close() it when you are done.
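One way to serialize the requests is a small callback queue that starts the next job only after the previous one has finished. This is a minimal sketch, not something shipped with PhantomJS or phantomjs-node: `createSerialQueue` and its `push` method are hypothetical names, and each task is assumed to call `done` exactly once.

```javascript
// Hypothetical helper: run asynchronous tasks one at a time, in arrival order.
// Each task receives a `done` callback and must call it exactly once.
function createSerialQueue() {
    var pending = [];
    var busy = false;

    function next() {
        if (busy || pending.length === 0) {
            return;
        }
        busy = true;
        var task = pending.shift();
        task(function done() {
            busy = false;
            next(); // start the next waiting task, if any
        });
    }

    return {
        push: function (task) {
            pending.push(task);
            next();
        }
    };
}

// Intended use inside a request listener (names taken from the examples above):
//
//   queue.push(function (done) {
//       page.open(someUrl, function (status) {
//           // ... write the response ...
//           response.close();
//           done(); // release the shared page for the next request
//       });
//   });
```

The queue keeps the single shared page object safe at the cost of throughput. Creating a fresh page per request avoids the waiting but uses more memory, so which option fits depends on your load.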

+22

The easiest way is to write a Python script (or something similarly simple) that starts the server, and to use Python's web interfaces to communicate with it, with some sort of web form to query the website and get the page source. Any automation can be done with cron jobs or, if you are on Windows, with the Task Scheduler, which can run the Python script automatically.

+1