Saving a webpage as an image

Question

As a hobby project, I am exploring ways to save a web page (HTML) as an image, mostly programmatically, using C / C++ / JavaScript / Java. So far I have come across the following methods:

Get the IHTMLElement of the page body, use it to query for IHTMLElementRender, and then call its DrawToDC method (link: http://www.codeproject.com/KB/IP/htmlimagecapture.aspx). The problem is that this does not work for all pages (mainly pages with embedded frames).
Another way I can think of is to use a web browser component and, once the page has fully loaded, capture it with BitBlt (ref: http://msdn.microsoft.com/en-us/library/dd183370%28VS.85%29.aspx). The problem is that the requested page may be longer than my screen, so it will not fit entirely inside the web browser component.
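
One common way around the "page is longer than the screen" problem is to capture the page in viewport-sized tiles and stitch them together afterwards. The following is only a sketch of the tile arithmetic, not of the capture itself: `computeTiles`, `pageHeight` and `viewportHeight` are hypothetical names introduced for illustration, and the actual drawing call (BitBlt or anything else) is assumed to happen elsewhere.

```javascript
// Hypothetical helper (not from any library): split a tall page into
// viewport-sized tiles. Returns, top to bottom, the vertical offset and
// height of each tile that would have to be captured and stitched.
function computeTiles(pageHeight, viewportHeight) {
  if (!(pageHeight > 0) || !(viewportHeight > 0)) {
    throw new RangeError('pageHeight and viewportHeight must be positive');
  }
  var tiles = [];
  for (var y = 0; y < pageHeight; y += viewportHeight) {
    tiles.push({
      scrollY: y,
      // The last tile may be shorter than the viewport.
      height: Math.min(viewportHeight, pageHeight - y)
    });
  }
  return tiles;
}

// Example: a 1500px-tall page seen through a 600px-tall viewport.
var tiles = computeTiles(1500, 600);
console.log(tiles.length + ' tiles needed');
```

For a 1500px page and a 600px viewport this gives three tiles at offsets 0, 600 and 1200, the last one 300px tall. A real browser control may refuse to scroll past the end of the page, in which case the final capture would have to be cropped rather than scrolled to exactly.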

Any direction / suggestion for solving the above problems or an alternative approach is welcome.

+7

java c++ javascript html image

Favonius Nov 09 '10 at 4:55

4 answers

If you are using Python, there are pywebshot and webkit2png. However, both of them have some dependencies.

Edit: Unfortunately, Python is not on the list of preferred languages. In any case, I will leave this answer because you said "mostly" and not "exclusively".

+1

kijin Nov 09 '10 at 5:25

Another (somewhat roundabout) option would be to run a server such as Tomcat and use Java to invoke a command-line tool that takes the screenshot. Googling for “command line screenshot windows” turns up some reasonable options. Other than by running a server, I do not know of a way to run local executables from JavaScript. This method would make it cross-browser, though, which is a plus (just make an Ajax call to the script whenever you want a screenshot).

Unfortunately, I don't actually know how to deploy WAR files. Tomcat may be more trouble than it is worth; I mentioned it because Java was a preferred language. It would be fairly simple to run XAMPP and use this PHP snippet instead, and you would not really need to learn PHP:

 <?php exec("/path/to/exec args"); ?>

EDIT

You know, I'm not sure this really answers your question. It is one way to do it, but it approaches the problem from the JavaScript end rather than from the scripting end. If you want to do this with scripts, you can always use Selenium. It supports capturing screenshots of the entire page and can be controlled through Java.

+1

theazureshadow Nov 09 '10 at 5:39

If you use JavaScript for this, I suggest going with PhantomJS.

Example from http://fcargoet.evolix.net/

    var page = new WebPage(),
        address = 'http://dev.sencha.com/deploy/ext-4.0.7-gpl/examples/feed-viewer/feed-viewer.html';

    page.viewportSize = {
        width  : 800,
        height : 600
    };

    // define the components we want to capture
    var components = [{
        output : 'feed-viewer-left.png',
        // ExtJS has a nice component query engine
        selector : 'feedpanel'
    },{
        output : 'feed-viewer-preview-btn.png',
        selector : 'feeddetail > feedgrid > toolbar > cycle'
    },{
        output : 'feed-viewer-collapsed.png',
        // executed before the rendering
        before : function(){
            var panel = Ext.ComponentQuery.query('feedpanel')[0];
            panel.animCollapse = false; // cancel animation, no need to wait before capture
            panel.collapse();
        },
        selector : 'viewport'
    }];

    page.open(address, function (status) {
        if (status !== 'success') {
            console.log('Unable to load the address!');
        } else {
            /*
             * give some time to ExtJS to
             * - render the application
             * - load asynchronous data
             */
            window.setTimeout(function () {
                components.forEach(function(component){
                    // execute the before function
                    component.before && page.evaluate(component.before);
                    // get the rectangular area to capture
                    /*
                     * page.evaluate() is sandboxed
                     * so that 'component' is not defined.
                     *
                     * It should be possible to pass variables in phantomjs 1.5
                     * but for now, workaround!
                     */
                    eval('function workaround(){ window.componentSelector = "' + component.selector + '";}')
                    page.evaluate(workaround);
                    var rect = page.evaluate(function(){
                        // find the component
                        var comp = Ext.ComponentQuery.query(window.componentSelector)[0];
                        // get its bounding box
                        var box = comp.el.getBox();
                        // box is {x, y, width, height}
                        // we want {top, left, width, height}
                        box.top = box.y;
                        box.left = box.x;
                        return box;
                    });
                    page.clipRect = rect;
                    page.render(component.output);
                });
                // job done, exit
                phantom.exit();
            }, 2000);
        }
    });
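
As far as I know, later PhantomJS releases (1.6 and up) let you pass JSON-serializable arguments to `page.evaluate()` after the function, so the `eval` workaround in the script should no longer be needed. A minimal sketch under that assumption: `getClipRect` is a hypothetical helper name, `page` is assumed to be a PhantomJS WebPage, and `Ext` is assumed to exist inside the loaded page.

```javascript
// Hypothetical helper: ask the loaded page for a component's clip rectangle.
// The selector is handed to page.evaluate() as an extra argument (assumes
// PhantomJS 1.6+) instead of being smuggled in through eval and a global.
function getClipRect(page, selector) {
  return page.evaluate(function (sel) {
    // Runs inside the page's sandbox; only `sel` crosses the boundary.
    var comp = Ext.ComponentQuery.query(sel)[0];
    var box = comp.el.getBox(); // {x, y, width, height}
    // clipRect expects {top, left, width, height}
    return { top: box.y, left: box.x, width: box.width, height: box.height };
  }, selector);
}

// Intended use inside the forEach loop of the script above:
//   page.clipRect = getClipRect(page, component.selector);
//   page.render(component.output);
```

Returning a fresh object rather than adding fields to `box` keeps the value that crosses the sandbox boundary limited to the four properties `clipRect` needs.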

0

Nemanja boric Feb 02 '13 at 22:11

Favonius · Accepted Answer · 2011-01-03T16:06:08+0000

~~Well, I finally managed to crack it by looking at these two articles:~~

http://www.codeproject.com/KB/GDI-plus/WebPageSnapshot.aspx [C# code - IE]
http://www.codeproject.com/KB/graphics/IECapture.aspx [C++ and GDI - IE]

I cannot share the code, but these two articles should point you to the best solution.

Also see:

https://addons.mozilla.org/en-US/firefox/addon/3408/ [Firefox + JavaScript]

The above is still OK, but it is not guaranteed to always work. Check out the link below: How to render scrollable canvas areas using IViewObject::Draw?
