How to remove empty HTML tags (including tags that contain only spaces and/or their HTML entities)

I need a regular expression to use with preg_replace.

“Another question” did not answer this one, because not all of the tags I want to remove are strictly empty.

I need to remove not only empty tags from the HTML structure, but also tags that contain only line breaks, spaces, and/or their HTML entities.

Possible codes:

&nbsp; &thinsp; &ensp; &emsp; &#8201; &#8194; &#8195;

BEFORE deleting matching tags:

    <div>
        <h1>This is a html structure.</h1>
        <p>This is not empty.</p>
        <p></p>
        <p><br /></p>
        <p> <br /> &thinsp;</p>
        <p>&nbsp;</p>
        <p> &nbsp; </p>
    </div>

AFTER removing matching tags:

    <div>
        <h1>This is a html structure.</h1>
        <p>This is not empty.</p>
    </div>
3 answers

You can use the following:

 <([^>\s]+)[^>]*>(?:\s*(?:<br \/>|&nbsp;|&thinsp;|&ensp;|&emsp;|&#8201;|&#8194;|&#8195;)\s*)*<\/\1> 

Replace each match with '' (the empty string).

See DEMO

Note: this also works for empty HTML tags that have attributes.
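As a rough sketch of how the pattern above could be applied with preg_replace: the pattern is copied unchanged and only wrapped in `~` delimiters. The helper name `removeEmptyTags` and the loop are assumptions added here, not part of the answer. The loop repeats the replacement so that a parent left empty by an earlier pass is removed as well.

```php
<?php
// Pattern from the answer above, unchanged, wrapped in ~ delimiters.
const EMPTY_TAG_PATTERN =
    '~<([^>\s]+)[^>]*>(?:\s*(?:<br \/>|&nbsp;|&thinsp;|&ensp;|&emsp;|&#8201;|&#8194;|&#8195;)\s*)*<\/\1>~';

// Hypothetical helper (not from the answer): repeats the replacement until
// no more matches are found, so <div><p></p></div> disappears completely.
function removeEmptyTags(string $html): string
{
    do {
        $result = preg_replace(EMPTY_TAG_PATTERN, '', $html, -1, $count);
        if ($result === null) {
            break; // PCRE error: keep the last good string
        }
        $html = $result;
    } while ($count > 0);

    return $html;
}

// The question's sample, written without whitespace between the tags.
$before = '<div><h1>This is a html structure.</h1><p>This is not empty.</p>'
        . '<p></p><p><br /></p><p> <br /> &thinsp;</p><p>&nbsp;</p><p> &nbsp; </p></div>';

echo removeEmptyTags($before), PHP_EOL;
// prints: <div><h1>This is a html structure.</h1><p>This is not empty.</p></div>
```

Only the exact spelling `<br />` is treated as a break here, as in the original pattern. Whitespace between the removed tags is left in place, so indented input keeps its blank lines.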


Use tidy. I use the following function:

    function cleaning($string, $tidyConfig = null)
    {
        // Default tidy options, used when the caller passes no configuration.
        $config = array(
            'indent'            => true,
            'show-body-only'    => false,
            'clean'             => true,
            'output-xhtml'      => true,
            'preserve-entities' => true,
        );
        if ($tidyConfig == null) {
            $tidyConfig = $config;
        }

        $out  = array();
        $tidy = new tidy();
        $out['full'] = $tidy->repairString($string, $tidyConfig, 'UTF8');

        // Keep only what lies between <body> and </body>.
        $out['body'] = preg_replace("/.*<body[^>]*>|<\/body>.*/si", "", $out['full']);
        // Keep only the contents of the <style> element, wrapped in a fresh tag.
        $out['style'] = '<style type="text/css">'
            . preg_replace("/.*<style[^>]*>|<\/style>.*/si", "", $out['full'])
            . '</style>';

        return $out;
    }

I'm not so good with regex, but try this:

 \<.*\>\s*\&.*sp;\s*\<\/.*\>|\<.*\>\s*\<\s*br\s*\/\>\s*\&.*sp;\s*\<\/.*\>|\<.*\>\s*\&.*sp;\s*\<\s*br\s*\/\>\<\/.*\> 

Basically, it matches:

  • Tags containing an HTML space entity, OR
  • Tags containing a break before an HTML space entity, OR
  • Tags containing a break after an HTML space entity
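
A quick way to see which of these cases the pattern above actually covers is to run it against single-element fragments. This is only a sketch: the helper name `matchesSpaceOnlyTag` and the sample fragments are made up for illustration, and the pattern is copied unchanged inside `~` delimiters.

```php
<?php
// Pattern from the answer above, unchanged, split over three lines
// (one line per alternative) and wrapped in ~ delimiters.
const SPACE_TAG_PATTERN =
    '~\<.*\>\s*\&.*sp;\s*\<\/.*\>'
    . '|\<.*\>\s*\<\s*br\s*\/\>\s*\&.*sp;\s*\<\/.*\>'
    . '|\<.*\>\s*\&.*sp;\s*\<\s*br\s*\/\>\<\/.*\>~';

// Hypothetical helper: true when the fragment matches the pattern.
function matchesSpaceOnlyTag(string $fragment): bool
{
    return preg_match(SPACE_TAG_PATTERN, $fragment) === 1;
}

// Hypothetical sample fragments, one element each.
$fragments = [
    '<p>&nbsp;</p>',             // space entity only
    '<p><br /> &thinsp;</p>',    // break before a space entity
    '<p>&ensp;<br /></p>',       // break after a space entity
    '<p>This is not empty.</p>', // real content
    '<p></p>',                   // truly empty
];

foreach ($fragments as $fragment) {
    printf("%-28s %s\n", $fragment, matchesSpaceOnlyTag($fragment) ? 'match' : 'no match');
}
```

The first three fragments match and the last two do not: every alternative requires an entity ending in `sp;`, so `<p></p>` and `<p><br /></p>` are left alone. Because `.*` is greedy, a match can also span several tags that share a line (for example `<p>Keep me</p><p>&nbsp;</p>` matches as a whole), so the pattern is probably safest when applied to one element at a time.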