Export a specific item in a DOMDocument to a string

I am importing some arbitrary HTML into a DOMDocument using the loadHTML() function, for example:

    $html = '<p><a href="test.php">Test</a></p>';
    $doc = new DOMDocument;
    $doc->loadHTML($html);

Then I want to change some attribute / node values using the DOMDocument methods, which I can do without any problems.

Once I have made these changes, I would like to export the HTML as a string (using ->saveHTML()), without the <html><body>... wrapper that DOMDocument automatically adds to the HTML.

I understand why these tags are added (to produce a valid document), but how can I return just my edited HTML (essentially everything between the <body> tags)?

I have read this post, and while it offers some solutions, I would prefer to do it “correctly”, that is, without using string replacement on the <body> tags. HTML validity is not a problem, as the input goes through an HTML cleaner before I start working with it.

Any ideas? Thanks.

EDIT

I am aware of the $node parameter added to saveHTML() in PHP 5.3.6; unfortunately, I am stuck with 5.2.

+8
html php domdocument
3 answers

Perhaps the source code of this project will help. It uses a regular expression to strip out the unwanted strings:

http://beerpla.net/projects/smartdomdocument-a-smarter-php-domdocument-class/

    $content = preg_replace(
        array("/^\<\!DOCTYPE.*?<html><body>/si", "!</body></html>$!si"),
        "",
        $this->saveHTML()
    );
    return $content;

saveHTMLExact() - DOMDocument has an extremely poorly designed "feature": if the HTML you are loading does not contain <html> and <body> tags, it adds them automatically (yup, there are no flags to turn this off).

Thus, when you call $doc->saveHTML(), your newly saved content now has <html><body> and a DOCTYPE in it. Not very convenient when trying to work with code fragments (XML has a similar problem).

SmartDOMDocument contains a new saveHTMLExact() function that does exactly what you want: it saves HTML without adding the extra garbage that DOMDocument does.

In addition, other questions have asked similar things:

How to save an HTML DOMDocument without an HTML wrapper?

+4

Try using DOMDocument->saveXML()?

    <?php
    $html = '<p><a href="test.php">Test</a></p>';
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    $domnodelist = $doc->getElementsByTagName('p');
    $domnode = $domnodelist->item(0);
    echo $doc->saveXML($domnode);
    ?>

It outputs <p><a href="test.php">Test</a></p>

+2

Thank you, but I will not necessarily know the type of the first tag in the body; it needs to be generic:

    $domnodelist = $doc->getElementsByTagName('*');
    $domnode = $domnodelist->item(0);
    echo $doc->saveXML($domnode);
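
One caveat with the snippet above: getElementsByTagName('*') returns every element in document order, so item(0) is most likely the <html> element that DOMDocument added, not the first tag inside the body. A minimal sketch of a more generic variant, assuming PHP 5.2 and the saveXML($node) call from the previous answer, is to look up the <body> element and serialize each of its children in turn. The function name bodyInnerMarkup is hypothetical and is not part of DOMDocument.

```php
<?php
// Hypothetical helper (not a DOMDocument method): returns the markup of
// everything inside <body>, without the wrapper added by loadHTML().
function bodyInnerMarkup(DOMDocument $doc)
{
    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return ''; // no <body> element was created
    }

    $out = '';
    foreach ($body->childNodes as $child) {
        // saveXML() accepts a node argument, as used in the answer above.
        $out .= $doc->saveXML($child);
    }
    return $out;
}

$doc = new DOMDocument();
$doc->loadHTML('<p><a href="test.php">Test</a></p><p>Second</p>');
echo bodyInnerMarkup($doc);
```

Because saveXML() serializes as XML, empty elements come out self-closed (for example <br/>). On PHP 5.3.6 or later, the same loop could call saveHTML($child) instead.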
-1