Escape all HTML except

I am trying to display comments on a page and have some problems.

There are two different types of comments I'm trying to process:

(1) XSS attempts, such as <script type="text/javascript">alert('hi')</script>. These are fairly easy to handle: I escape the input before it goes into the database, then run stripslashes and htmlentities on it when displaying it.

(2) Comments containing line breaks (<br>). When the data is stored in the database, I run nl2br, so the data looks like hi<br>hello<br><br>etc. However, when I display the comment, the <br> tags show up as literal text instead of being rendered as line breaks, which is what I want.

Any idea what to do? I should note that turning off htmlentities fixes the second type, but then the first type is output as raw HTML and pops up an alert dialog.

Thanks Phil

+5
2 answers

One approach: replace <br> with a placeholder, for example \n. Then run htmlentities to escape the remaining HTML. Finally, replace \n with <br> to restore the line breaks.
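The three steps above can be sketched roughly as follows. The function name render_comment is hypothetical, and the sketch assumes the stored comment is UTF-8 and that the only markup worth keeping is the <br> written by nl2br (which emits <br /> followed by the original newline by default).

```php
<?php
// Hypothetical helper, not from the original answer: a minimal sketch of
// "placeholder, escape, restore" for comments stored after nl2br().
function render_comment(string $stored): string
{
    // 1. Turn every <br>, <br/> or <br /> (plus the newline nl2br leaves
    //    behind it) into a single "\n" placeholder.
    $text = preg_replace('/<br\s*\/?>\r?\n?/i', "\n", $stored) ?? $stored;

    // 2. Escape all remaining markup so it is displayed as text, not run.
    $escaped = htmlentities($text, ENT_QUOTES, 'UTF-8');

    // 3. Put the line breaks back as real <br> tags.
    return str_replace("\n", '<br>', $escaped);
}

echo render_comment("hi<br>hello<br><br>etc");
```

A side effect worth knowing about: any raw newline already present in the stored text also becomes a <br>, which is usually what a comment display wants.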

+2

If you want to remove unwanted tags, you can try strip_tags. It accepts an allowable_tags argument, so you can list the tags that should not be removed. Sample from the manual:

// Allow <p> and <a>
// Add <br> to the list if you do not want it stripped
echo strip_tags($text, '<p><a>');

So, after you have converted every \n to a <br> line break, you need not worry about the breaks being stripped. This may not be exactly what you want, but I hope it gives you an idea.
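Applied to the comment case, that idea might look like the sketch below. The function name strip_comment_tags is hypothetical, and it assumes nl2br is run at display time rather than before storage. Note that strip_tags deletes the tags but keeps the text between them, and it does not remove attributes from tags on the allowed list, so it is a weaker guarantee than escaping with htmlentities.

```php
<?php
// Hypothetical helper: keep only <br> tags, drop every other tag.
function strip_comment_tags(string $comment): string
{
    // Convert newlines to <br> (second argument false = HTML style, not <br />).
    $withBreaks = nl2br($comment, false);

    // Remove all tags except <br>; the text inside removed tags is kept.
    return strip_tags($withBreaks, '<br>');
}

echo strip_comment_tags("hi\nhello <script>alert('hi')</script>");
```

Here the script tags are removed, so the browser shows alert('hi') as plain text rather than executing it.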

+11