How can I strip tags more safely than with the strip_tags function?

I am having problems using the PHP strip_tags function when a string contains less-than and greater-than characters. For example:

If I do this:

strip_tags("<span>some text <5ml and then >10ml some text </span>"); 

I will get:

 some text 10ml some text 

But obviously I want to get:

 some text <5ml and then >10ml some text 

Yes, I know I could use &lt; and &gt;, but I can't convert these characters to HTML entities, since the data is already saved, as you can see in my example.

What I'm looking for is a smart way to parse HTML to get rid of real HTML tags only.

Since TinyMCE was used to generate this data, I know which actual HTML tags can occur, so a strip_tags($string, $black_list) would be more useful than strip_tags($string, $allowable_tags).
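The blacklist idea described above could be sketched roughly as follows. This is only an illustration, not a built-in PHP feature: the helper name strip_listed_tags and the tag list are hypothetical, the list is assumed to be non-empty, and the regular expression assumes attribute values never contain < or >.

```php
<?php
// Hypothetical helper: removes only the tags named in $blackList and leaves
// every other "<" or ">" in the string untouched.
function strip_listed_tags(string $html, array $blackList): string
{
    // Escape each tag name for use inside the pattern, then join with "|".
    $names = implode('|', array_map(function ($tag) {
        return preg_quote($tag, '#');
    }, $blackList));

    // Opening or closing tag of a listed name, optional attributes,
    // optional self-closing slash. Assumes no "<" or ">" inside attributes.
    $pattern = '#</?(?:' . $names . ')(?:\s[^<>]*)?/?>#i';

    return preg_replace($pattern, '', $html);
}

$input = '<span>some text <5ml and then >10ml some text </span>';
echo strip_listed_tags($input, ['span', 'p', 'strong', 'em', 'br']);
// prints: some text <5ml and then >10ml some text
```

Because only known tag names are matched, "<5ml" is never mistaken for a tag. The trade-off is that any tag missing from the list stays in the output.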

Any thoughts?

+7
3 answers

As a hacky workaround, you can escape the angle brackets that are not part of HTML tags with:

 $html = preg_replace_callback("# <(?![/a-z]) | (?<=\s)>(?![a-z]) #xi", function ($m) { return htmlentities($m[0]); }, $html); 

Apply strip_tags() afterwards. Note that this only works for your specific example and similar cases. It is a regular expression with a few heuristics, not artificial intelligence, for distinguishing HTML tags from unescaped angle brackets that mean something else.
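Put together, the workaround might look like the sketch below. The wrapper name strip_tags_keep_brackets is hypothetical, and the final decoding step is an assumption based on the question asking for raw < and > in the result. Note that htmlspecialchars_decode() also decodes entities such as &amp; that were already in the data, so skip that step if the entities should be kept.

```php
<?php
// Hypothetical wrapper around the heuristic: escape stray brackets,
// strip the real tags, then turn the escaped brackets back into characters.
function strip_tags_keep_brackets(string $html): string
{
    // "<" not followed by "/" or a letter, or ">" that follows whitespace
    // and is not followed by a letter, is treated as plain text.
    $escaped = preg_replace_callback(
        '# <(?![/a-z]) | (?<=\s)>(?![a-z]) #xi',
        function (array $m): string {
            return htmlspecialchars($m[0]);
        },
        $html
    );

    // Only real tags are still written with raw angle brackets here.
    $stripped = strip_tags($escaped);

    // Assumption: the caller wants literal "<" and ">" back.
    return htmlspecialchars_decode($stripped);
}

echo strip_tags_keep_brackets('<span>some text <5ml and then >10ml some text </span>');
// prints: some text <5ml and then >10ml some text
```

The same limits apply as for the one-liner: input such as "a<b" or "x >y" does not fit the heuristic and may still be handled incorrectly.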

+6

If you want to keep greater-than and less-than signs, you need to escape them:

&gt; is >

&lt; is <

See this: http://www.w3schools.com/html/html_entities.asp

+4

Use htmlspecialchars() instead of strip_tags().

http://php.net/manual/en/function.htmlspecialchars.php
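For comparison, here is a small sketch of what htmlspecialchars() does with the string from the question. It escapes every angle bracket, including those of the real tags, so the tags are shown as text rather than removed:

```php
<?php
$input = '<span>some text <5ml and then >10ml some text </span>';

// Every "<" and ">" becomes an entity; nothing is stripped.
$escaped = htmlspecialchars($input);

echo $escaped, PHP_EOL;
// prints: &lt;span&gt;some text &lt;5ml and then &gt;10ml some text &lt;/span&gt;
```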

+2
