Using Headless Firefox to Save a Complete Web Page from the Command Line on Linux

I am currently using shell_exec with Xvfb and Firefox to capture screenshots. However, I also need to save the entire page (the equivalent of Save Page As → Web Page, complete) to a directory using shell_exec. I have looked through the options documented on the Mozilla Developer forums, but I could not figure out how to do this.

Apparently I need this code, but where and how do I implement it so that it can be invoked through shell_exec?

    var file = Components.classes["@mozilla.org/file/local;1"]
        .createInstance(Components.interfaces.nsILocalFile);
    file.initWithPath("C:\\filename.html");
    var wbp = Components.classes['@mozilla.org/embedding/browser/nsWebBrowserPersist;1']
        .createInstance(Components.interfaces.nsIWebBrowserPersist);
    wbp.saveDocument(content.document, file, null, null, null, null);


    void saveDocument(
        in nsIDOMDocument aDocument,
        in nsISupports aFile,
        in nsISupports aDataPath,
        in string aOutputContentType,
        in unsigned long aEncodingFlags,
        in unsigned long aWrapColumn
    );


There is a manual solution on Stack Overflow, but it does not address shell_exec: How to save a web page locally, including photos, etc.

1 answer

There are a few options that I know of, but none of them exactly matches your question.

  • Run firefox http://yoursite.com from the shell, then send keystrokes to Firefox using xte or a similar tool. (This is not headless.)
  • Download the page with wget, which can work recursively. Alternatively, parse the HTML yourself if it is a fairly simple page. If you need to submit a form, use curl instead of wget.
  • Use the Greasemonkey add-on and write a script that runs on http://some-fake-page.com/?download=http://yoursite.com , then open Firefox with this fake page URL.
  • Write your own Firefox add-on to do the above.
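
As a rough illustration of the wget option, here is a minimal sketch. The function name save_page and the example paths are made up for illustration; the flags are standard GNU wget options. Note that wget does not execute JavaScript, so the result may differ from what Firefox's "Web Page, complete" would save.

```shell
#!/bin/sh
# Hypothetical helper: save one page plus the images, CSS, and scripts it needs.
# $1 = page URL, $2 = target directory (both chosen by the caller).
save_page() {
    # --page-requisites : also fetch images, stylesheets, and scripts
    # --convert-links   : rewrite links so the saved copy works offline
    # --adjust-extension: add .html/.css suffixes where they are missing
    # --span-hosts      : allow requisites hosted on other domains
    # --directory-prefix: directory to save everything under
    wget --page-requisites --convert-links --adjust-extension \
         --span-hosts --directory-prefix="$2" "$1"
}

# Example call (assumed URL and path):
#   save_page "http://yoursite.com" "/tmp/saved"
```

From PHP, the same command line could be passed to shell_exec, with the URL and directory escaped through escapeshellarg first.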

There may be other options, but I do not know of any.
