Sanitizing user-submitted HTML in .NET

My C# site allows users to submit HTML that is then displayed on the site. I would like to restrict which tags and attributes are allowed, but I cannot figure out how to do this in .NET.

I tried using the Html Agility Pack, but I don't see how to modify the HTML with it. I can see how to walk the document and find particular data, but producing the modified output puzzles me.

Does anyone have a good example of sanitizing HTML in .NET? The Agility Pack may be the answer, but its documentation is lacking.

+5
6 answers

With HtmlAgilityPack, you can remove a node like this:

node.ParentNode.RemoveChild(node);
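
To show where that call could fit, here is a minimal sketch of an allow-list pass with HtmlAgilityPack that also returns the modified markup, which is the part the question asks about. The class name `HapSanitizer`, the method `Sanitize`, and the contents of the `Allowed` table are made up for illustration; it assumes the HtmlAgilityPack package is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

static class HapSanitizer
{
    // Hypothetical allow-list: tag name -> attribute names permitted on that tag.
    static readonly Dictionary<string, string[]> Allowed =
        new Dictionary<string, string[]>(StringComparer.OrdinalIgnoreCase)
        {
            { "b",   new string[0] },
            { "img", new[] { "src", "alt" } },
        };

    public static string Sanitize(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Snapshot the elements first, because the tree is modified while we walk it.
        var elements = doc.DocumentNode.Descendants()
            .Where(n => n.NodeType == HtmlNodeType.Element)
            .ToList();

        foreach (var node in elements)
        {
            string[] allowedAttributes;
            if (!Allowed.TryGetValue(node.Name, out allowedAttributes))
            {
                // Drops the element together with everything inside it.
                node.ParentNode.RemoveChild(node);
                continue;
            }

            // Copy the attribute list before removing entries from it.
            foreach (var attr in node.Attributes.ToList())
            {
                if (!allowedAttributes.Contains(attr.Name, StringComparer.OrdinalIgnoreCase))
                    attr.Remove();
            }
        }

        // The modified document can be read back as a string.
        return doc.DocumentNode.OuterHtml;
    }
}
```

This sketch does not inspect attribute values, so an allowed `src` could still carry a dangerous URL; it also leaves text and comment nodes untouched.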
+2
+4

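You need to parse the HTML.

One way is to use LINQ to XML to build a new tree that keeps only the allowed tags and attributes.

For example: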

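// Required namespaces:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

//Maps allowed tags to allowed attributes for the tags.
static readonly Dictionary<string, string[]> AllowedTags = new Dictionary<string, string[]>(StringComparer.OrdinalIgnoreCase) {
    { "b",    new string[0] },
    { "img",  new string[] { "src", "alt" } },
    //...
};
// Returns a copy of dirtyElement that keeps only allowed child elements and attributes.
// The element passed in must itself be in AllowedTags, otherwise the indexer throws.
// Text nodes are not copied by this version.
static XElement CleanElement(XElement dirtyElement) {
    return new XElement(dirtyElement.Name,
        dirtyElement.Elements()
            .Where(e => AllowedTags.ContainsKey(e.Name.LocalName))
            .Select<XElement, XElement>(CleanElement)
            .Cast<object>() // elements and attributes are combined into one content sequence
            .Concat(
                dirtyElement.Attributes()
                    .Where(a => AllowedTags[dirtyElement.Name.LocalName].Contains(a.Name.LocalName, StringComparer.OrdinalIgnoreCase))
            )
    );
}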

Note that you also need to check attribute values, for example to reject javascript: urls.
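
As a rough illustration of the URL concern, the sketch below accepts an attribute value only if it parses as an absolute URL with an allow-listed scheme. The names `UrlCheck`, `IsSafeUrl` and `AllowedSchemes` are hypothetical, and the choice of `http` and `https` is an assumption; it uses only `System.Uri` from the base class library.

```csharp
using System;
using System.Collections.Generic;

static class UrlCheck
{
    // Hypothetical allow-list of URL schemes; adjust it to what the site needs.
    static readonly HashSet<string> AllowedSchemes =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "http", "https" };

    // Returns true only for absolute URLs whose scheme is on the allow-list.
    // Relative URLs are rejected here to keep the check simple.
    public static bool IsSafeUrl(string value)
    {
        if (string.IsNullOrWhiteSpace(value)) return false;

        Uri uri;
        return Uri.TryCreate(value.Trim(), UriKind.Absolute, out uri)
            && AllowedSchemes.Contains(uri.Scheme);
    }
}
```

Allowing known-good schemes tends to be more robust than trying to recognize every spelling of `javascript:`. A check like this could be applied to `src` or `href` values before the attribute is kept.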

+3

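There is a project on SourceForge, SGMLReader, that converts HTML to XML and exposes it as an XmlReader, which you can load into an XmlDocument. From there, you can process the HTML.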

0

Have you looked at MarkdownSharp, which is open source?

0

Take a look at this entry on Refactor My Code: http://refactormycode.com/codes/333-sanitize-html

I believe Stack Overflow combines this with the tag-balancing code at http://refactormycode.com/codes/360-balance-html-tags to sanitize posts and prepare them for display. And, of course, they use MarkdownSharp to support Markdown in posts.

0