Best practice: cleaning up user-submitted HTML

I am coding a WYSIWYG editor using designMode = "on" on an iframe. The editor works fine, and I store the resulting HTML as is in the database.

Before outputting the HTML, I need to "clean" it with PHP on the server side to avoid cross-site scripting and other scary things. Is there any best practice on how to do this? Which tags can be dangerous?

UPDATE: Typo fixed; it is "what you see is what you get". Nothing new :)

+6
javascript html php xss wysiwyg
4 answers

Best practice is to allow only the specific things that you know are not dangerous, and to strip or escape everything else. See OWASP AntiSamy, a project for the automatic detection and removal of malicious code on the web, for a discussion of this issue (the library is for Java, but the principles apply to any language).
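To make the allowlist idea concrete, here is a deliberately tiny PHP sketch. It is not how AntiSamy works internally; it is a simplified stand-in that escapes everything first and then restores only a fixed set of attribute-free tags. The function name `sanitize_basic_html` and the tag list are assumptions chosen for illustration.

```php
<?php
// Hypothetical helper: escape all markup, then re-enable a small allowlist
// of bare tags (no attributes). Anything not matched stays escaped.
function sanitize_basic_html(string $dirty): string
{
    // Escape every special character. The last argument (double_encode =
    // false) leaves entities the editor already produced, such as &nbsp;,
    // untouched instead of turning them into &amp;nbsp;.
    $escaped = htmlspecialchars($dirty, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8', false);

    // Restore only opening/closing tags from the allowlist, and only when
    // they carry no attributes at all (so no onclick, style, href, ...).
    $allowed = 'p|br|b|i|em|strong|ul|ol|li';
    $restored = preg_replace('#&lt;(/?)(' . $allowed . ')&gt;#i', '<$1$2>', $escaped);

    // preg_replace returns null on an internal error; fail closed.
    return $restored ?? '';
}

$dirty = '<p>Hello <b>world</b><script>alert(1)</script></p>';
echo sanitize_basic_html($dirty), "\n";
```

With this sketch the `<p>` and `<b>` tags survive, while the `<script>` element is rendered as harmless visible text. Because attributes are never restored, links and images are not supported; a parser-based sanitizer is the better fit once those are needed.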

+5

If you really intend to allow HTML, you should use a whitelist.

The best approach is probably to disallow HTML and use a simplified markup format instead; you can pre-render the HTML and store it in the database if performance is a concern. Avoiding problems like this is one of the main reasons for using Markdown, Textile, reStructuredText, etc.

NOTE: I linked to GitHub-Flavored Markdown (GFM), not standard Markdown (SM). GFM fixes some common problems that end users run into with SM.

+3

I recently looked at the same issue, but with Perl as the server-side language.

In doing so, I came across HTML Purifier, which may be what you want. But since it is written in PHP, not Perl, I have not actually tested it.

In addition, my research led me to conclude that this is a very complex business, and I would consider, if possible, using a simplified markup language such as Markdown, as Hank Gay suggested.

+1

If you are familiar with ASP.NET, just call Server.HtmlEncode() to convert special characters like < and > to "&lt;" and "&gt;".

In PHP you can use the htmlspecialchars() function.

Once special characters are encoded, cross-site scripting can be prevented.
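A minimal sketch of that encoding step in PHP follows. The wrapper name `e` and the sample input are assumptions for illustration; `htmlspecialchars()` and its flags are standard PHP.

```php
<?php
// Hypothetical convenience wrapper around htmlspecialchars().
// ENT_QUOTES also encodes single and double quotes, which matters when the
// value ends up inside an HTML attribute; ENT_SUBSTITUTE replaces invalid
// byte sequences instead of returning an empty string.
function e(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

$comment = '<script>alert("xss")</script>';
echo '<p>' . e($comment) . '</p>', "\n";
// prints: <p>&lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;</p>
```

Note that encoding everything this way also turns the formatting tags produced by a WYSIWYG editor into visible text, so it suits plain-text fields better than rich content.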

0