Disable warnings when loading malformed HTML using DomDocument (PHP)

I need to parse some HTML files, but they are not well-formed, so PHP prints warnings. I want to avoid this debugging/warning output programmatically. Please advise. Thanks!

The code:

    // create a DOM document and load the HTML data
    $xmlDoc = new DomDocument;
    // this dumps out the warnings
    $xmlDoc->loadHTML($fetchResult);

Using the @ operator:

    @$xmlDoc->loadHTML($fetchResult)

suppresses the warnings, but how can I capture them programmatically?

+62
html php warnings domdocument
Jul 19 '09 at 0:09
3 answers

You can install a temporary error handler with set_error_handler:

    class ErrorTrap {
        protected $callback;
        protected $errors = array();

        function __construct($callback) {
            $this->callback = $callback;
        }

        // Run the wrapped callback while collecting errors instead of printing them.
        function call() {
            $result = null;
            set_error_handler(array($this, 'onError'));
            try {
                $result = call_user_func_array($this->callback, func_get_args());
            } finally {
                // Always put the previous handler back, even if the callback throws.
                restore_error_handler();
            }
            return $result;
        }

        function onError($errno, $errstr, $errfile, $errline) {
            $this->errors[] = array($errno, $errstr, $errfile, $errline);
        }

        function ok() {
            return count($this->errors) === 0;
        }

        function errors() {
            return $this->errors;
        }
    }

Usage:

    // create a DOM document and load the HTML data
    $xmlDoc = new DomDocument();
    $caller = new ErrorTrap(array($xmlDoc, 'loadHTML'));
    // this doesn't dump out any warnings
    $caller->call($fetchResult);
    if (!$caller->ok()) {
        var_dump($caller->errors());
    }
+15
Jul 19 '09 at 0:39

Call

 libxml_use_internal_errors(true); 

before calling $xmlDoc->loadHTML().

This tells libxml2 not to send errors and warnings through PHP. Then, to check for errors and handle them yourself, you can call libxml_get_last_error() and/or libxml_get_errors() when you are ready.
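As a minimal sketch of that approach: the snippet below enables internal error handling, parses a deliberately malformed string, and walks the collected errors. The `$html` sample is hypothetical and stands in for `$fetchResult`; which problems libxml2 reports for it depends on the installed libxml2 version, so no output is shown.

```php
<?php
// Hypothetical malformed input, used only for illustration.
$html = '<html><body><p>Unclosed paragraph<div></span></body></html>';

// Collect libxml errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$xmlDoc = new DOMDocument();
$xmlDoc->loadHTML($html);

// Each entry is a LibXMLError object (level, code, message, line, column, file).
$errors = libxml_get_errors();
foreach ($errors as $error) {
    printf("line %d, col %d: %s\n", $error->line, $error->column, trim($error->message));
}

// Empty the internal error buffer so it does not grow across calls.
libxml_clear_errors();
```

libxml_get_last_error() can be used instead when only the most recent problem matters; it returns a single LibXMLError object, or false if the buffer is empty.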

+178
May 17 '10 at 6:54 a.m.

To hide the warnings, you have to give special instructions to libxml, which is used internally to perform the parsing:

    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

libxml_use_internal_errors(true) indicates that you will handle errors and warnings yourself, and you do not want them to spoil the output of your script.

This is not the same as the @ operator. Warnings are collected behind the scenes, after which you can retrieve them with libxml_get_errors() if you want to log them or return a list of problems to the caller.

Regardless of whether you use the collected warnings, you should always clear the queue by calling libxml_clear_errors().

State preservation

If you have other code that uses libxml, it may be worthwhile to make sure that your code does not change the global error handling state; for this you can use the return value of libxml_use_internal_errors() to save the previous state.

    // modify state
    $libxml_previous_state = libxml_use_internal_errors(true);
    // parse
    $dom->loadHTML($html);
    // handle errors
    libxml_clear_errors();
    // restore
    libxml_use_internal_errors($libxml_previous_state);
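One way to package the save/parse/restore sequence is a small helper. The sketch below assumes PHP 7.1 or later; the name `loadHtmlQuietly` and its return shape are hypothetical and not part of any library. It uses try/finally so the previous libxml state is restored even if loadHTML() throws.

```php
<?php
/**
 * Hypothetical helper: parse HTML without emitting warnings and
 * return the document together with the libxml errors collected.
 */
function loadHtmlQuietly(string $html): array
{
    // Switch to internal error handling and remember the previous setting.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $errors = [];
    try {
        $dom->loadHTML($html);
        $errors = libxml_get_errors();
    } finally {
        // Clear the queue and restore the global state, whatever happened.
        libxml_clear_errors();
        libxml_use_internal_errors($previous);
    }
    return [$dom, $errors];
}

// Usage with a hypothetical malformed fragment.
[$dom, $errors] = loadHtmlQuietly('<p>Broken <b>markup</p></i>');
echo count($errors), " problem(s) reported\n";
```

Returning the errors rather than logging them inside the helper leaves the decision about what to do with them to the caller.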
+70
Jul 09 '13 at 22:59