You do not need a separate HTML cleaner. The DOMDocument class can parse imperfect HTML on its own. It will, however, emit warnings when the markup is invalid, so suppress them when loading:
$doc = new DOMDocument(); @$doc->loadHTML($content);
The warnings are then suppressed, and you can work with the parsed HTML however you like.
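As a rough sketch of the same idea without the @ operator: libxml_use_internal_errors() tells libxml to collect parse errors internally rather than raise PHP warnings. The sample markup in $content is made up for illustration and is not from the question.

```php
<?php
// Hypothetical sample input; stands in for $content from the answer.
$content = '<div><p>Unclosed paragraph<a href="/one">One</a><a href="/two">Two</a></div>';

// Collect libxml parse errors internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($content);

// Discard the collected errors and restore the earlier setting.
libxml_clear_errors();
libxml_use_internal_errors($previous);

// Walk every <a> element and collect its href attribute.
$hrefs = [];
foreach ($doc->getElementsByTagName('a') as $link) {
    $hrefs[] = $link->getAttribute('href');
}
print_r($hrefs);
```

Unlike @, this only silences libxml's own messages, so unrelated errors on that line would still surface.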
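If you are extracting or cleaning up links, I would recommend SimpleXMLElement::xpath(), which is much easier than walking a DOMDocument by hand. Note that SimpleXMLElement only parses well-formed XML, so the input has to be XHTML-style markup. An example: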
$xml = new SimpleXMLElement($content); $result = $xml->xpath('//a/@href'); print_r($result);
XPath expressions can get much more complex, letting you select by class name, id, and other attributes. This is far more concise than traversing a DOMDocument manually, although DOMXPath offers the same queries on a DOMDocument.
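To illustrate the kind of expression meant here, the sketch below selects links by id and by class name. It assumes the input is ordinary (possibly malformed) HTML, so it parses with DOMDocument first and converts with simplexml_import_dom(); the markup, ids, and class names are invented for the example.

```php
<?php
// Hypothetical markup for illustration only.
$content = '<ul id="menu"><li><a class="nav external" href="https://example.com/">Ext</a></li>'
         . '<li><a class="nav" href="/about">About</a></li></ul>'
         . '<p><a href="/footer">Footer</a></p>';

$doc = new DOMDocument();
@$doc->loadHTML($content);

// Bridge the parsed DOM tree into SimpleXML so xpath() can be used on HTML input.
$xml = simplexml_import_dom($doc);

// Links anywhere under the element with id="menu".
$menuLinks = $xml->xpath('//*[@id="menu"]//a/@href');

// Links whose class attribute contains the whole word "external".
$external = $xml->xpath(
    '//a[contains(concat(" ", normalize-space(@class), " "), " external ")]/@href'
);

// Each match is a SimpleXMLElement; convert to plain strings.
$menuHrefs = array_map('strval', $menuLinks);
$externalHrefs = array_map('strval', $external);
print_r($menuHrefs);
print_r($externalHrefs);
```

The concat/normalize-space idiom is used because XPath 1.0 has no class-aware selector; a plain contains(@class, "nav") would also match a class such as "navbar".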