I dynamically generate a regular expression by walking an XML structure and building the expression as I visit its node types. I use this regex as part of a layout type that I defined. I then parse a text file in which each line begins with an identifier. That identifier points me to a specific layout, and I try to match the data in that line against the layout's regular expression.
Sounds fine and dandy, right? The only problem is that matching the strings is very slow. I set the expressions as compiled to try to speed things up a bit, but to no avail. What is puzzling is that these expressions are not that complex. I'm not a RegEx guru, but I know a decent amount about them, enough to get by.
Here is the code that generates the expressions ...
    StringBuilder sb = new StringBuilder();
    sb.Append(@"^([0-9]+)[ \t]{1,2}([0-9]+)");
    foreach (ColumnDef c in columns)
    {
        sb.Append(@"[ \t]{1,2}");
        switch (c.Variable.PrimType)
        {
            case PrimitiveType.BIT:
                sb.Append("(0|1)");
                break;
            case PrimitiveType.DATE:
                sb.Append(@"([0-9]{2}/[0-9]{2}/[0-9]{4})");
                break;
            case PrimitiveType.FLOAT:
                sb.Append(@"([-+]?[0-9]*\.?[0-9]+)");
                break;
            case PrimitiveType.INTEGER:
                sb.Append(@"([0-9]+)");
                break;
            case PrimitiveType.STRING:
                sb.Append(@"([a-zA-Z0-9]*)");
                break;
        }
    }
    sb.Append("$");
    _pattern = new Regex(sb.ToString(), RegexOptions.Compiled);
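To make the generated pattern concrete, here is a minimal sketch of the same loop for a small layout. The `PrimitiveType` enum and the `BuildPattern` helper below are hypothetical stand-ins for illustration only; the real code reads the type from `c.Variable.PrimType` on each `ColumnDef`.

```csharp
using System;
using System.Text;

// Hypothetical stand-in for the real PrimitiveType enum.
enum PrimitiveType { BIT, DATE, FLOAT, INTEGER, STRING }

static class PatternDemo
{
    // Mirrors the generation loop above, but takes the column types directly
    // instead of ColumnDef objects (an assumption made to keep this self-contained).
    public static string BuildPattern(params PrimitiveType[] columns)
    {
        var sb = new StringBuilder();
        sb.Append(@"^([0-9]+)[ \t]{1,2}([0-9]+)");
        foreach (PrimitiveType t in columns)
        {
            sb.Append(@"[ \t]{1,2}");
            switch (t)
            {
                case PrimitiveType.BIT: sb.Append("(0|1)"); break;
                case PrimitiveType.DATE: sb.Append(@"([0-9]{2}/[0-9]{2}/[0-9]{4})"); break;
                case PrimitiveType.FLOAT: sb.Append(@"([-+]?[0-9]*\.?[0-9]+)"); break;
                case PrimitiveType.INTEGER: sb.Append(@"([0-9]+)"); break;
                case PrimitiveType.STRING: sb.Append(@"([a-zA-Z0-9]*)"); break;
            }
        }
        sb.Append("$");
        return sb.ToString();
    }

    static void Main()
    {
        // Three columns after the two fixed leading integer fields.
        Console.WriteLine(BuildPattern(
            PrimitiveType.INTEGER, PrimitiveType.DATE, PrimitiveType.STRING));
    }
}
```

For those three columns the pattern comes out as `^([0-9]+)[ \t]{1,2}([0-9]+)[ \t]{1,2}([0-9]+)[ \t]{1,2}([0-9]{2}/[0-9]{2}/[0-9]{4})[ \t]{1,2}([a-zA-Z0-9]*)$`; the real patterns have the same shape, only much longer.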
The actual slow part ...
    public System.Text.RegularExpressions.Match Match(string input)
    {
        if (input == null)
            throw new ArgumentNullException("input");
        return _pattern.Match(input);
    }
A typical "_pattern" may have about 40-50 columns. I will spare you from pasting a whole pattern. I wrap each column in its own capture group so that I can enumerate the values from the Match object later.
Any tips or modifications that could help significantly? Or is running this slowly simply to be expected?
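For reference, this is roughly how the captured columns get read back out of the Match object. It is only a sketch: the `GroupDemo` class, the `Columns` helper and the sample line are made up for illustration, and the pattern is a short three-field version of the generated one.

```csharp
using System;
using System.Text.RegularExpressions;

static class GroupDemo
{
    // Returns the captured column values in order, or null when the line does not match.
    public static string[] Columns(Regex pattern, string line)
    {
        Match m = pattern.Match(line);
        if (!m.Success)
            return null;

        // Group 0 is the whole match; the column captures start at index 1.
        var values = new string[m.Groups.Count - 1];
        for (int i = 1; i < m.Groups.Count; i++)
            values[i - 1] = m.Groups[i].Value;
        return values;
    }

    static void Main()
    {
        // Two fixed leading integers followed by one BIT column.
        var pattern = new Regex(@"^([0-9]+)[ \t]{1,2}([0-9]+)[ \t]{1,2}(0|1)$",
                                RegexOptions.Compiled);
        string[] cols = Columns(pattern, "12 7\t1");
        Console.WriteLine(string.Join("|", cols)); // prints 12|7|1
    }
}
```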
EDIT FOR CLARITY: Sorry, I don't think I was clear enough the first time.