Saving a web page and its external resources as a self-contained static copy

We have a requirement to archive web pages as accurately as possible, so that we can return and view any previously captured version of a page. We would like the page to render as it really was: with the correct CSS, JavaScript, images, and so on.

Are there any open-source libraries (in any language) that will fetch a page, download all of the external assets it references, and rewrite the links so that they point to the locally cached copies?

Or is this a case where we will have to roll our own?

Thanks.

Edit: I understand that this will never be 100% faithful without rendering the DOM, since dynamically generated links and the like will be missed. For the moment, we can live without that.
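For illustration, here is a minimal roll-your-own sketch in Python, assuming the third-party requests and beautifulsoup4 packages; the URL and output directory are placeholders. It fetches one page, downloads the images, scripts, and stylesheets it references, and rewrites those references to point at the local copies. A real archiver would also need to handle CSS url(...) references, srcset attributes, filename collisions, and so on:

```python
import os
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

PAGE_URL = "https://example.com/"  # placeholder target page
OUT_DIR = "snapshot"               # placeholder output directory

os.makedirs(os.path.join(OUT_DIR, "assets"), exist_ok=True)

html = requests.get(PAGE_URL, timeout=30).text
soup = BeautifulSoup(html, "html.parser")

# Tag/attribute pairs that commonly reference external assets.
ASSET_ATTRS = [("img", "src"), ("script", "src"), ("link", "href")]

for tag_name, attr in ASSET_ATTRS:
    for tag in soup.find_all(tag_name):
        ref = tag.get(attr)
        if not ref:
            continue
        # Resolve relative references against the page URL.
        asset_url = urljoin(PAGE_URL, ref)
        # Naive local name; collisions are possible in a real crawl.
        filename = os.path.basename(urlparse(asset_url).path) or "index"
        try:
            data = requests.get(asset_url, timeout=30).content
        except requests.RequestException:
            continue  # skip assets that fail to download
        with open(os.path.join(OUT_DIR, "assets", filename), "wb") as f:
            f.write(data)
        # Rewrite the reference so the saved page uses the local copy.
        tag[attr] = "assets/" + filename

with open(os.path.join(OUT_DIR, "index.html"), "w", encoding="utf-8") as f:
    f.write(str(soup))
```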

Answer:

I suggest HTTrack: http://www.httrack.com/

Since the software is free, open source, and offers both a visual interface and a command line, I believe you could integrate it into, or customize it for, your workflow fairly smoothly (see the command-line sketch after the description below).

From its description:

"HTTrack allows you to download a World Wide Web site from the Internet to a local directory, building recursively all directories, getting HTML, images, and other files from the server to your computer.

HTTrack arranges the original site's relative link-structure. Simply open a page of the 'mirrored' website in your browser, and you can browse the site from link to link, as if you were viewing it online."

Downloads:

WebHTTrack for Linux/Unix/BSD: packages for Debian, Ubuntu, Gentoo, RPM-based distributions (Mandriva, RedHat), OSX (MacPorts), Fedora, and FreeBSD i386.

WinHTTrack for Windows 2000/XP/Vista/Seven
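For the command-line route, a minimal invocation might look like this (the URL and output directory are placeholders; -O sets the output path, the "+" pattern is a filter that keeps the mirror on the example.com domain, and -v prints verbose progress):

```
# Mirror a site into ./mirror, following only links that match the filter
httrack "https://www.example.com/" -O "./mirror" "+*.example.com/*" -v
```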


Edited 04/01/2017
