How can I take a snapshot of the DOM structure of a web page?

I need to compare the DOM structure of a web page at different points in time. What are the ways to extract and snapshot it?

I need a server-side way to handle the DOM.

I basically need to track structural changes on a web page, for example the deletion of a div tag or the insertion of a p tag. Changes to the data (innerHTML) inside those tags should not be considered a difference.

3 answers

Follow these steps on the server side:

  • Fetch a snapshot of the web page via HTTP GET.
  • Save successive snapshots of the page under different names for later comparison.
  • Compare the files with an HTML-aware diff tool (see the HtmlDiff tool list on the ESW wiki).

As a proof of concept in the Linux shell, you can perform this comparison as follows:

wget --output-document=snapshot1.html http://example.com/
wget --output-document=snapshot2.html http://example.com/
diff snapshot1.html snapshot2.html

You can, of course, wrap these commands in a server program or script.
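Note that a plain text diff also reports changes to the text inside tags, which the question wants to ignore. One way around that is to reduce each snapshot to its tag skeleton before diffing. The following is a minimal sketch of that idea using only the Python standard library; the names `SkeletonParser`, `skeleton` and `structural_diff` are made up for this illustration, and the depth tracking assumes reasonably well-formed markup (unclosed tags will skew the indentation).

```python
import difflib
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not change the depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class SkeletonParser(HTMLParser):
    """Records one indented line per opening tag and ignores text and attributes."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.lines = []

    def handle_starttag(self, tag, attrs):
        self.lines.append("  " * self.depth + tag)
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax such as <br/>: record the tag, keep the depth.
        self.lines.append("  " * self.depth + tag)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth > 0:
            self.depth -= 1


def skeleton(html):
    """Return the tag skeleton of an HTML string as a list of indented tag names."""
    parser = SkeletonParser()
    parser.feed(html)
    parser.close()
    return parser.lines


def structural_diff(old_html, new_html):
    """Unified diff of two tag skeletons; an empty list means no structural change."""
    return list(difflib.unified_diff(skeleton(old_html), skeleton(new_html),
                                     "old", "new", lineterm=""))


if __name__ == "__main__":
    before = "<div><p>Hello</p><div>x</div></div>"
    after = "<div><p>Changed text</p></div>"
    print("\n".join(structural_diff(before, after)))
```

In practice the two strings would be the contents of snapshot1.html and snapshot2.html. Attributes are dropped on purpose here; if class or id changes should count as structural, they would have to be added to the recorded lines.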

For PHP, I would suggest you take a look at daisydiff-php. It provides a PHP class for diffing HTML markup. Example:

 <?php
 require_once('HTMLDiff.php');
 // Load the two snapshots to compare.
 $file1 = file_get_contents('snapshot1.html');
 $file2 = file_get_contents('snapshot2.html');
 // Class and method names are kept from the original example;
 // check them against the daisydiff-php source before relying on them.
 $differ = new HTMLDiffer();
 $differ->htmlDiffer($file1, $file2);
 ?>

Note that file_get_contents can also retrieve data from a URL, provided allow_url_fopen is enabled.

Note that DaisyDiff itself is also a good tool for visualizing structural changes.

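 // Fetch the page and parse it into a DOM tree.
 $html_page = file_get_contents("http://awesomesite.com");
 $html_dom = new DOMDocument();
 libxml_use_internal_errors(true); // suppress warnings on malformed real-world HTML
 $html_dom->loadHTML($html_page);

This uses the PHP DOM extension, which is simple and actually a little fun to use.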
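Once a page has been parsed, a cheap structural summary is the number of elements per tag name, which directly answers questions like "was a div removed or a p added?". Below is a small sketch of that idea in Python using only the standard library; `TagCounter`, `tag_counts` and `tag_delta` are hypothetical names chosen for this example. It does not detect elements that merely moved, since the counts stay the same in that case.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts opening tags by name; text content and attributes are ignored."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing tags such as <br/>, because the
        # default handle_startendtag delegates to handle_starttag.
        self.counts[tag] += 1


def tag_counts(html):
    """Return a Counter mapping tag name to number of occurrences."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


def tag_delta(old_html, new_html):
    """Return {tag: change in count} for every tag whose count differs."""
    old, new = tag_counts(old_html), tag_counts(new_html)
    return {tag: new[tag] - old[tag]
            for tag in sorted(set(old) | set(new))
            if new[tag] != old[tag]}


if __name__ == "__main__":
    before = "<div><div></div><span>a</span></div>"
    after = "<div><p>b</p><span>c</span></div>"
    print(tag_delta(before, after))
```

A negative value means elements of that tag were removed between the two snapshots, a positive value means they were added.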

EDIT: after looking into it further, the best answer lies here.


If you use Firefox, Firebug lets you inspect the DOM structure of any web page.
