Regular expression to convert Markdown to HTML

How would you write a regular expression to convert Markdown into HTML? For example, you would type in the following:

This would be *italicized* text and this would be **bold** text 

Then it needs to be converted to:

 This would be <em>italicized</em> text and this would be <strong>bold</strong> text 

This is very similar to the Markdown edit control used by Stack Overflow.

Clarification

For what it's worth, I'm using C#. Also, these are the only tags I want to allow. The text to be converted will be less than 300 characters long.

+6
html c# regex markdown
5 answers

The best approach is to find a port of the Markdown library for whatever language you are using (you did not specify one in your question).


Now that you have clarified that you only want STRONG and EM to be processed, and that you are using C#, I recommend that you take a look at Markdown.NET to see how these tags are implemented. As you can see, there are actually two expressions. Here is the code:

    private string DoItalicsAndBold(string text)
    {
        // <strong> must go first:
        text = Regex.Replace(text, @"(\*\*|__) (?=\S) (.+?[*_]*) (?<=\S) \1",
                             new MatchEvaluator(BoldEvaluator),
                             RegexOptions.IgnorePatternWhitespace | RegexOptions.Singleline);

        // Then <em>:
        text = Regex.Replace(text, @"(\*|_) (?=\S) (.+?) (?<=\S) \1",
                             new MatchEvaluator(ItalicsEvaluator),
                             RegexOptions.IgnorePatternWhitespace | RegexOptions.Singleline);
        return text;
    }

    private string ItalicsEvaluator(Match match)
    {
        return string.Format("<em>{0}</em>", match.Groups[2].Value);
    }

    private string BoldEvaluator(Match match)
    {
        return string.Format("<strong>{0}</strong>", match.Groups[2].Value);
    }
+7

A single regex will not do. Each kind of markup needs its own HTML translation. It is better to look at how existing converters are implemented to get an idea of how this works.

http://en.wikipedia.org/wiki/Markdown#See_also

+5

I don't know about C# specifically, but in Perl it would be:

    s/\*\*(.*?)\*\*/<strong>$1<\/strong>/g;
    s/\*(.*?)\*/<em>$1<\/em>/g;
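For anyone who wants the same idea in C#, here is a minimal sketch of the two substitutions using `Regex.Replace`. The class name `SimpleMarkdown` and the method `ToHtml` are made up for illustration; they are not part of any library. It uses `.+?` so that each span must contain at least one character, handles only the asterisk forms, and does no HTML escaping of the input, so treat it as a starting point rather than a complete converter.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: handles only *italic* and **bold** with asterisks.
public static class SimpleMarkdown
{
    // Lazy match between a pair of double asterisks.
    private static readonly Regex Bold =
        new Regex(@"\*\*(.+?)\*\*", RegexOptions.Singleline);

    // Lazy match between a pair of single asterisks.
    private static readonly Regex Italic =
        new Regex(@"\*(.+?)\*", RegexOptions.Singleline);

    public static string ToHtml(string text)
    {
        // Bold goes first; otherwise the single-asterisk pattern
        // would consume the asterisks of a **bold** span.
        text = Bold.Replace(text, "<strong>$1</strong>");
        text = Italic.Replace(text, "<em>$1</em>");
        return text;
    }
}
```

Calling `SimpleMarkdown.ToHtml("This would be *italicized* text and this would be **bold** text")` returns the HTML shown in the question: `This would be <em>italicized</em> text and this would be <strong>bold</strong> text`.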

+1

Check out this implementation from Stack Overflow:

MarkdownSharp - Markdown for C#

+1

I came across the following post, which recommends not doing this. In my case I am looking to keep things simple, but I thought I would post this per jop's recommendation in case someone else wants to do it.

0
