Regex to match XML comments

I need to remove comments of the form:

<!--  Foo

      Bar  -->

I would like to use a regular expression that matches anything (including line breaks) between the start and end delimiters.

What would be a good regex for this task?

+3
5 answers

A simple way:

Regex xmlCommentsRegex = new Regex("<!--.*?-->", RegexOptions.Singleline | RegexOptions.Compiled);

And a better way:

Regex xmlCommentsRegex = new Regex("<!--(?:[^-]|-(?!->))*-->", RegexOptions.Singleline | RegexOptions.Compiled);
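To compare the two patterns above, here is a minimal sketch that runs both against a multi-line comment like the one in the question. The class name, method names, and sample input are made up for illustration; only the two patterns come from the answer. Note that the second pattern contains no `.`, so it matches across line breaks even without RegexOptions.Singleline.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CommentRegexDemo
{
    // Lazy version: needs Singleline so that '.' also matches '\n'.
    static readonly Regex Lazy = new Regex("<!--.*?-->", RegexOptions.Singleline);

    // Explicit version: any non-hyphen, or a hyphen not followed by "->".
    static readonly Regex Explicit = new Regex("<!--(?:[^-]|-(?!->))*-->");

    public static string StripLazy(string xml) => Lazy.Replace(xml, "");
    public static string StripExplicit(string xml) => Explicit.Replace(xml, "");

    public static void Main()
    {
        // Hypothetical sample input with a multi-line and a single-line comment.
        string input = "<a><!--  Foo\n\n      Bar  --><b/><!-- x --></a>";
        Console.WriteLine(StripLazy(input));
        Console.WriteLine(StripExplicit(input));
    }
}
```

Both calls should leave the same markup with the comments removed. The explicit version tends to be preferred because it never has to backtrack over a lazy quantifier.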
+5

None. XML cannot be described by a regular grammar, which is the formalism regular expressions are built on.

Suppose this text is exported to XML. Your example (<!-- Foo Bar -->), if it sits inside a CDATA section, will be stripped even though it is not really a comment there.
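The CDATA problem can be reproduced with a few lines. This is only a sketch: the class name and the sample document are invented, and the pattern is the one from the accepted-style answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CdataPitfallDemo
{
    // Same comment pattern as in the regex answer.
    static readonly Regex Comments = new Regex("<!--(?:[^-]|-(?!->))*-->");

    public static string Strip(string xml) => Comments.Replace(xml, "");

    public static void Main()
    {
        // Inside CDATA, "<!-- ... -->" is literal character data, not a comment.
        string xml = "<doc><![CDATA[<!-- not a comment -->]]></doc>";

        // The regex cannot tell the difference and deletes the text anyway.
        Console.WriteLine(Strip(xml));
    }
}
```

The regex has no notion of where a CDATA section starts or ends, so the character data is silently lost.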

+6

"" - XSLT , .

+4

Parsing XML with regular expressions is considered bad practice. Use an XML parsing library instead.
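As one possible way to do this in .NET, the sketch below uses LINQ to XML (System.Xml.Linq) to drop comment nodes. The class name, method name, and sample input are illustrative assumptions, not something from the question.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class ParserDemo
{
    public static string StripComments(string xml)
    {
        // Parse into a tree; comments become XComment nodes, CDATA becomes XCData nodes.
        XDocument doc = XDocument.Parse(xml, LoadOptions.PreserveWhitespace);

        // Remove every comment node, wherever it sits in the tree.
        doc.DescendantNodes().OfType<XComment>().Remove();

        // Serialize without re-indenting the document.
        return doc.ToString(SaveOptions.DisableFormatting);
    }

    public static void Main()
    {
        string xml = "<root><!-- Foo\n Bar --><a><![CDATA[<!-- keep -->]]></a></root>";
        Console.WriteLine(StripComments(xml));
    }
}
```

Because the parser knows what a comment is, the text inside the CDATA section is left alone. One caveat: XDocument.ToString() does not emit the XML declaration, so use Save() if the declaration has to be kept.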

0

Here is some sample code that reads an XML file and returns its contents as a string with the comments removed.

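// requires: using System.IO; using System.Text.RegularExpressions;
static string ReadXmlWithoutComments()   // method name is illustrative
{
  // Verbatim string, so that "\f" is not read as a form-feed escape.
  var text = File.ReadAllText(@"c:\file.xml");

  // "<!--", then non-hyphens or hyphens not followed by "->", then "-->".
  const string strRegex = @"<!--(?:[^-]|-(?!->))*-->";
  // The pattern has no ".", "^" or "$", so Multiline does not change what it matches.
  const RegexOptions myRegexOptions = RegexOptions.Multiline;
  Regex myRegex = new Regex(strRegex, myRegexOptions);

  // Replace every comment with the empty string.
  return myRegex.Replace(text, string.Empty);
}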

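Unfortunately, RegexOptions.Multiline alone won't do the trick for a pattern that relies on . (which is a little counterintuitive): it only changes how ^ and $ match, and it is RegexOptions.Singleline that lets . match line breaks.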
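The difference between the two options is easy to check. In this small sketch the class name, method name, and input string are made up for the demonstration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OptionsDemo
{
    // True if the lazy "dot" pattern finds a comment under the given options.
    public static bool Matches(string input, RegexOptions options) =>
        Regex.IsMatch(input, "<!--.*?-->", options);

    public static void Main()
    {
        string input = "<!-- Foo\nBar -->";

        // Multiline only affects ^ and $; '.' still stops at '\n'.
        Console.WriteLine(Matches(input, RegexOptions.Multiline));

        // Singleline makes '.' match '\n' as well.
        Console.WriteLine(Matches(input, RegexOptions.Singleline));
    }
}
```

Only the Singleline call finds the comment, because the comment spans a line break.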

0