Regex to remove empty html tags containing only empty child elements

Question

I need to parse an HTML string and remove all elements containing only empty child elements.

Example:

<P ALIGN="left"><FONT FACE="Arial" SIZE="12" COLOR="#000000" LETTERSPACING="0" KERNING="1"><B></B></FONT></P>

contains no content and must be replaced by </br>.

I wrote a regular expression like this:

 <\w+\b[^>]*>(<\w+\b[^>]*>\s*</\w*\s*>)*</\w*\s*>

but the problem is that it catches only two of the three levels. In the example above, the outermost <P> element is not matched.
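The behaviour described here can be reproduced in a console. The snippet below is only an illustrative sketch: the variable names are made up for this example, and the pattern is the one from the question with the slashes escaped so it is a valid JavaScript regex literal.

```javascript
// The pattern from the question, written as a JavaScript regex literal.
const askerPattern = /<\w+\b[^>]*>(<\w+\b[^>]*>\s*<\/\w*\s*>)*<\/\w*\s*>/;

const sample =
  '<P ALIGN="left"><FONT FACE="Arial" SIZE="12" COLOR="#000000" ' +
  'LETTERSPACING="0" KERNING="1"><B></B></FONT></P>';

const askerMatch = sample.match(askerPattern);

// The match starts at <FONT ...> and ends at </FONT>. The outer <P> is skipped
// because the group only allows "open tag, optional whitespace, close tag",
// i.e. one level of nesting inside the first tag.
console.log(askerMatch[0]);
```

Starting at `<P ...>`, the group would have to match `<FONT ...>` followed directly by a closing tag, but `<B>` comes next, so the engine moves on and finds its first match one level further in.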

Can you help me fix this regex?

+1

javascript jquery html regex

Cristian holdunu Nov 13 '13 at 10:26

2 answers

Use jQuery and walk through all the children. For each child, check whether .html() is empty. If it is, delete the current item (or its parent, if you prefer) with .remove().

Do this with .each():

 var appended = $('.yourparent').append('YOUR HTML STRING');
 appended.children().each(function () {
   // `this` is a raw DOM node; wrap it in $() to use jQuery methods
   if ($(this).html() === '') {
     $(this).remove(); // or $(this).parent().remove()
   }
 });

This appends the items first, then removes any child whose HTML is empty.
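The same "remove it if it is empty, then check again" idea can also be sketched directly on the string, without a DOM. This is not part of the answer above: `stripEmptyElements` is a hypothetical helper, it removes empty elements rather than inserting a line break, and it assumes reasonably well-formed markup with no `>` inside attribute values.

```javascript
// Hypothetical helper: repeatedly deletes innermost empty elements until
// the string stops changing, so nesting of any depth collapses level by level.
function stripEmptyElements(html) {
  // One element whose content is only whitespace, e.g. <B></B> or <i> </i>.
  // \1 requires the closing tag name to repeat the opening one; the i flag
  // lets <P> pair with </p>.
  const emptyElement = /<(\w+)\b[^>]*>\s*<\/\1\s*>/gi;
  let previous;
  do {
    previous = html;
    html = html.replace(emptyElement, '');
  } while (html !== previous);
  return html;
}

console.log(stripEmptyElements('<p>Hello <b></b></p>')); // <p>Hello </p>
```

Each pass only removes elements that are already empty, so an element like `<P><FONT><B></B></FONT></P>` takes three passes: first `<B>`, then `<FONT>`, then `<P>`.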

+2

Philipp Möhler Nov 13 '13 at 10:32

Bohemian · Accepted Answer · 2013-11-13T12:55:35+0000

This regex works:

 /(<(?!\/)[^>]+>)+(<\/[^>]+>)+/

See a live demo with your example.
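As a rough sketch of how this pattern might be applied, the snippet below runs it through `String.prototype.replace`. The `g` flag, the variable names, the surrounding `<p>` elements and the `<br>` replacement string (standing in for the `</br>` in the question) are assumptions added for illustration.

```javascript
// Regex from the accepted answer, with the g flag added so every run is replaced.
const emptyTagRun = /(<(?!\/)[^>]+>)+(<\/[^>]+>)+/g;

// Hypothetical input: the question's empty block between two non-empty paragraphs.
const page =
  '<p>Hello</p>' +
  '<P ALIGN="left"><FONT FACE="Arial"><B></B></FONT></P>' +
  '<p>World</p>';

const cleaned = page.replace(emptyTagRun, '<br>');
console.log(cleaned); // <p>Hello</p><br><p>World</p>
```

The pattern matches a run of opening tags followed directly by a run of closing tags; it does not check that the tags pair up. Empty siblings such as `<p><b></b><i></i></p>` are matched in two pieces (`<p><b></b>` and `<i></i></p>`), so that input would produce two breaks rather than one.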
