Render an HTML page and save it from the command line

I would like to load a web page and save it from the command line, with the same result as "Save Page As" for a complete page in Firefox or Chrome.

I tried wget and httrack, and both downloaded the HTML files correctly. But when the HTML is malformed, a browser repairs it while rendering, so saving from the browser gives the corrected HTML. wget and httrack do not do this.

Is there a tool that renders the page and saves it locally, along with all its images, Flash content, and other resources?

+5
5 answers

When I want to save pages for offline use, I use a Firefox plugin called "Scrapbook". That, of course, does not meet the command-line requirement. But with a tool like "htmlunit" or something similar, you could drive the Firefox browser to open the page you want to save.
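
A headless browser is one way to get the browser-corrected markup without a GUI. The sketch below is an assumption on my part, not what this answer used: it relies on a Chromium-based browser and its `--headless --dump-dom` flags rather than Firefox or htmlunit. The function name `dump_rendered_dom` and the list of binary names are hypothetical. It writes only the parsed DOM, not images or other assets.

```shell
#!/usr/bin/env bash
# Sketch: print the DOM as the browser parsed it, using headless Chromium.
# Assumption: one of the binaries below is installed; names vary by distro.

dump_rendered_dom() {
    local url="$1" out="$2" candidate browser=""
    # Pick the first Chromium-family binary found on this machine.
    for candidate in chromium chromium-browser google-chrome; do
        if command -v "$candidate" >/dev/null 2>&1; then
            browser="$candidate"
            break
        fi
    done
    if [ -z "$browser" ]; then
        echo "no Chromium-based browser found" >&2
        return 1
    fi
    # --dump-dom serializes the parsed (and therefore repaired) document to stdout.
    "$browser" --headless --dump-dom "$url" > "$out"
}
```

It could be called as `dump_rendered_dom http://stackoverflow.com page.html`.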

+2

You can use curl or wget in combination with HTML Tidy, e.g.

    # Fetch the page, following redirects
    curl -L http://stackoverflow.com > page.html
    # Repair the markup and write it out as XHTML
    tidy -asxhtml page.html > page_clean.html

Tidy should be able to convert most invalid HTML markup into valid XHTML.
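
To also pick up images and stylesheets, the download step could be swapped for wget's page-requisites mode before running Tidy. This is a sketch under assumptions: the function name `fetch_and_tidy` and the default `saved_page` directory are made up for illustration, and Tidy is run in place on every downloaded `.html` file.

```shell
#!/usr/bin/env bash
# Sketch: download one page with its assets, then repair the HTML with Tidy.

fetch_and_tidy() {
    local url="$1" dir="${2:-saved_page}"
    # -p: also fetch the images/CSS/JS the page needs; -k: rewrite links for
    # local viewing; -E: add .html extensions; -P: download into $dir.
    # wget can exit non-zero when a single asset is missing, so its status
    # is not treated as fatal here.
    wget -p -k -E -P "$dir" "$url"
    # tidy -m rewrites each file in place; -q suppresses the summary.
    # Tidy exits with 1 on mere warnings, so only 2 (errors) is reported.
    find "$dir" -name '*.html' | while IFS= read -r file; do
        tidy -q -m "$file"
        [ "$?" -lt 2 ] || echo "tidy reported errors in $file" >&2
    done
}
```

For example, `fetch_and_tidy http://stackoverflow.com` would leave the repaired page and its assets under `saved_page/`.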

+1

You could write a script that opens the page in Firefox and drives it with xdotool.

+1

Today I needed something similar (and went the xdotool route). You can find my version (a reusable bash script) at: https://github.com/abiyani/automate-save-page-as
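
The xdotool approach amounts to roughly the following. This is a minimal sketch and not the linked script: the function name `save_page_as` and the `LOAD_WAIT` and `DIALOG_WAIT` variables are hypothetical, and it assumes an X session, a Firefox window whose class matches `firefox`, and Ctrl+S opening the save dialog.

```shell
#!/usr/bin/env bash
# Sketch: drive Firefox's "Save Page As" dialog with xdotool (needs an X session).
# Assumptions: the window class matches "firefox", Ctrl+S opens the save dialog,
# and the fixed sleeps are long enough; LOAD_WAIT and DIALOG_WAIT are made-up knobs.

save_page_as() {
    local url="$1" dest="$2"
    firefox "$url" &                       # open the page in a normal window
    sleep "${LOAD_WAIT:-8}"                # crude wait for the page to load
    # Find the visible Firefox window and give it keyboard focus.
    xdotool search --sync --onlyvisible --class firefox windowactivate
    xdotool key ctrl+s                     # open the "Save Page As" dialog
    sleep "${DIALOG_WAIT:-2}"              # wait for the dialog to appear
    xdotool type --delay 20 "$dest"        # type the destination path
    xdotool key Return                     # confirm the save
}
```

A call such as `save_page_as http://stackoverflow.com /tmp/page.html` would then save through the browser itself, so the result is whatever Firefox writes for a complete page. Fixed sleeps are fragile; a real script would need to handle slow loads and existing-file prompts.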

+1

There is some sophisticated software that does just that: https://launchpad.net/shotfactory

0