Remove unnecessary paragraph tags from a string
If I have a string like:
<p> </p> <p></p> <p class="a"><br /></p> <p class="b"> </p> <p>blah blah blah this is some real content</p> <p> </p> <p></p> <p class="a"><br /></p>
How can I turn it into just:
<p>blah blah blah this is some real content</p>
The regular expression should match &nbsp; entities and spaces.
$result = preg_replace('#<p[^>]*>(\s|&nbsp;?)*</p>#', '', $input);
This does not catch literal non-breaking space characters (as opposed to the &nbsp; entity) in the input, but those are very rarely seen.
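A minimal sketch of a broader variant, assuming you also want to drop paragraphs that contain only <br /> tags or literal U+00A0 characters, which the pattern above leaves in place. The function name remove_empty_paragraphs is hypothetical, and the extra alternatives are my additions, not part of the answer's regex:

```php
<?php
// Hypothetical helper (not from the original answer): removes <p> elements
// whose content is only whitespace, &nbsp; entities, literal U+00A0
// characters, or <br> tags, plus any whitespace that follows them.
function remove_empty_paragraphs(string $html): string
{
    // \b keeps the pattern from matching tags such as <pre>;
    // the u flag is needed for \x{00A0}, the i flag accepts <P> and <BR>.
    $pattern = '#<p\b[^>]*>(?:\s|&nbsp;?|\x{00A0}|<br\s*/?>)*</p>\s*#iu';
    $result = preg_replace($pattern, '', $html);

    // preg_replace() returns null on failure (for example, invalid UTF-8
    // input with the u flag), so fall back to the untouched string.
    return $result ?? $html;
}

$input = '<p> </p> <p></p> <p class="a"><br /></p> <p class="b">&nbsp;</p> '
       . '<p>blah blah blah this is some real content</p> <p> </p> <p></p> '
       . '<p class="a"><br /></p>';

// prints: <p>blah blah blah this is some real content</p>
echo trim(remove_empty_paragraphs($input)), "\n";
```

The trailing \s* only removes whitespace after a removed paragraph, so the trim() call takes care of the single space left after the real content.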
Since you are dealing with HTML, if this is user input, I would suggest using HTML Purifier, which will also deal with XSS vulnerabilities. The configuration setting you want for removing empty tags is %AutoFormat.RemoveEmpty.
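A rough sketch of how that setting might be wired up. The Composer autoload path and the sample $input are assumptions, and the extra AutoFormat.RemoveEmpty.RemoveNbsp directive is my addition so that paragraphs holding only non-breaking spaces count as empty too. It needs the HTML Purifier library installed, so check the result against your own markup:

```php
<?php
// Assumption: HTML Purifier was installed with Composer (ezyang/htmlpurifier);
// adjust the require path to match your own setup.
require_once 'vendor/autoload.php';

// Sample input, assumed for illustration only.
$input = '<p>&nbsp;</p><p></p><p>blah blah blah this is some real content</p>';

$config = HTMLPurifier_Config::createDefault();
// Remove elements that have no meaningful content.
$config->set('AutoFormat.RemoveEmpty', true);
// Treat elements containing only non-breaking spaces as empty as well.
$config->set('AutoFormat.RemoveEmpty.RemoveNbsp', true);

$purifier = new HTMLPurifier($config);
$clean = $purifier->purify($input);

echo $clean, "\n";
```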
As the original responder stated, regex isn't the best solution here; what you want is some kind of HTML stripper.
There is a function for this on this site: http://nadeausoftware.com/articles/2007/09/php_tip_how_strip_html_tags_web_page
To help you on your way: you just need to manipulate the string a bit so that newlines and the like come back in your desired format.