What is the best way to sanitize rich HTML with Rails?

I'm looking for tips on how to clean up submitted HTML in a web application so that it can be re-rendered later without stray styles or unclosed tags destroying the application's layout.

In my application, users enter rich HTML with the YUI Rich Text Editor, which by default runs several regular expressions to clean up the input, and I also call [filter_MSWord][1] to catch any junk pasted in from Office.

On the back end, I run ruby-tidy to sanitize the HTML before displaying it as comments, but sometimes badly formed pasted HTML still breaks the layout of my application. How can I protect against this?

FWIW, here are the sanitizer settings that I use:

    module HTMLSanitizer

      def tidy_html(input)
        cleaned_html = Tidy.open(:show_warnings=>false) do |tidy|
          # don't output body and html tags
          tidy.options.show_body_only = true
          # output xhtml
          tidy.options.output_html = true
          # don't write newlines all over the place
          tidy.options.wrap = 0
          # use utf8 to play nice with rails
          tidy.options.char_encoding = 'utf8'
          xml = tidy.clean(input)
          xml
        end
      end

    end

What else are my options here?

2 answers

I personally use the Sanitize gem.

    require 'sanitize'

    # Notice the incorrect HTML. It still outputs "wow!"
    op = Sanitize.clean("<html><body>wow!</body></hhhh>")
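To make the idea concrete: Sanitize works from an allowlist, so anything not explicitly permitted is dropped. The stdlib-only sketch below illustrates that general approach applied to the two problems in the question (stray style attributes and unclosed tags). It is not how the Sanitize gem is implemented, and the names `ALLOWED_TAGS`, `VOID_TAGS` and `allowlist_html` are hypothetical. A regex-based filter like this is only an illustration; for real user input, a parser-based library is the safer choice.

```ruby
require 'strscan'

# Hypothetical allowlist for illustration; adjust for your application.
ALLOWED_TAGS = %w[p b i em strong ul ol li br].freeze
VOID_TAGS    = %w[br].freeze

# Keeps only allowlisted tags (dropping every attribute), escapes stray "<",
# and closes any tags left open so they cannot leak into the page layout.
def allowlist_html(input)
  scanner = StringScanner.new(input)
  out     = +''
  stack   = [] # allowlisted tags that are currently open

  until scanner.eos?
    if scanner.scan(%r{<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>})
      closing = scanner[1] == '/'
      name    = scanner[2].downcase
      next unless ALLOWED_TAGS.include?(name) # drop unknown tags entirely

      if closing
        # Ignore stray closing tags that were never opened.
        next unless stack.include?(name)
        # Close anything opened after the matching tag, then the tag itself.
        out << "</#{stack.pop}>" until stack.last == name
        out << "</#{stack.pop}>"
      elsif VOID_TAGS.include?(name)
        out << "<#{name}>"
      else
        stack << name
        out << "<#{name}>" # attributes such as style= are not copied
      end
    elsif scanner.scan(/[^<]+/)
      out << scanner.matched # plain text (existing entities pass through)
    else
      scanner.getch # a lone "<" that does not start a tag
      out << '&lt;'
    end
  end

  # Close whatever is still open, innermost first.
  out << "</#{stack.pop}>" until stack.empty?
  out
end

puts allowlist_html('<p style="color:red">Hi <b>there')
# prints <p>Hi <b>there</b></p>
```

Because every tag that is emitted is also tracked on a stack, the output is always balanced, which is what keeps a forgotten `</b>` or `</p>` in a comment from affecting the rest of the page.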

I am using the sanitize helper available from ActionView:

Module ActionView::Helpers::SanitizeHelper
