Iterative capturing with regular expressions in C#

I need to read in a file that contains several coordinates. The file is structured as follows:

X1/Y1,X2/Y2,X3/Y3,X4/Y4 

Here X and Y are positive integers. To solve this problem, I want to use a regex (I think this is generally a good idea because it requires minimal refactoring if the format changes).

Therefore, I developed the following regular expression:

 Regex r = new Regex(@"^(?<Coor>(?<X>[0-9]+)/(?<Y>[0-9]+))(,(?<Coor>(?<X>[0-9]+)/(?<Y>[0-9]+)))*$");

However, when I test this regular expression against data such as:

 1302/1425,1917/2010 

The regex seems to remember only the last captures of X, Y and Coor. In this case, Coor is "1917/2010", X is "1917", and Y is "2010". Is there a way to generate some kind of tree, so that I get an object that gives me all the Coor matches, with an X and a Y component under each Coor?

If possible, I would like to use only one regex, because the format may change later.

+4
4 answers

You can easily solve this without any regex using string.Split and int.Parse :

 var coords = s.Split(',')
     .Select(x => x.Split('/'))
     .Select(a => new { X = int.Parse(a[0]), Y = int.Parse(a[1]) });

If you want to use a regular expression to validate a string, you can do it like this:

 "^(?!,)(?:(?:^|,)[0-9]+/[0-9]+)*$" 

If you also want to use a regex to extract the data, you can first validate the string with the regex above and then extract the data as follows:

 var coords = Regex.Matches(s, "([0-9]+)/([0-9]+)")
     .Cast<Match>()
     .Select(match => new
     {
         X = int.Parse(match.Groups[1].Value),
         Y = int.Parse(match.Groups[2].Value)
     });

If you really want to validate and extract the data at the same time with a single regular expression, you can use two capturing groups and find the results in the Captures property of each group. Here is one way you could validate and extract the data using a single regular expression:

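 // Groups[0] is the whole match, so skip it and keep the two capturing groups.
 List<Group> groups = Regex.Matches(s, "^(?!,)(?:(?:^|,)([0-9]+)/([0-9]+))*$")
     .Cast<Match>().First()
     .Groups.Cast<Group>().Skip(1)
     .ToList();
 // Each group keeps one Capture per repetition; Capture.Value holds the matched text.
 var coords = Enumerable.Range(0, groups[0].Captures.Count)
     .Select(i => new
     {
         X = int.Parse(groups[0].Captures[i].Value),
         Y = int.Parse(groups[1].Captures[i].Value)
     });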

However, you may want to consider whether the complexity of this solution is worth it compared with the solution based on string.Split.
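To illustrate the Captures idea with the named groups from the question, here is a minimal sketch. It assumes C# 7 or later (for value tuples); the class name CaptureDemo and the method ParseCoordinates are made up for this example. It relies on the fact that, in .NET, groups sharing a name accumulate their captures in match order.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    // Hypothetical helper: validates the whole line and returns every X/Y pair.
    public static List<(int X, int Y)> ParseCoordinates(string line)
    {
        // The pattern from the question; the anchors make it validate the entire line.
        var pattern = new Regex(
            @"^(?<Coor>(?<X>[0-9]+)/(?<Y>[0-9]+))(,(?<Coor>(?<X>[0-9]+)/(?<Y>[0-9]+)))*$");

        Match match = pattern.Match(line);
        if (!match.Success)
            throw new FormatException("Line does not match X1/Y1,X2/Y2,...");

        // Group.Value only exposes the last capture; Captures holds all of them.
        CaptureCollection xs = match.Groups["X"].Captures;
        CaptureCollection ys = match.Groups["Y"].Captures;

        var result = new List<(int X, int Y)>();
        for (int i = 0; i < xs.Count; i++)
            result.Add((int.Parse(xs[i].Value), int.Parse(ys[i].Value)));
        return result;
    }

    public static void Main()
    {
        foreach (var (x, y) in ParseCoordinates("1302/1425,1917/2010"))
            Console.WriteLine("X=" + x + ", Y=" + y);
    }
}
```

The i-th capture of X and the i-th capture of Y belong to the same coordinate because every repetition captures exactly one of each.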

+5

It makes no sense to use a regular expression for such a simple format.

Just split the string and use simple string operations to get the coordinates:

 var coordinates = fileContent.Split(',').Select(s =>
 {
     // X is everything before the slash, Y everything after it (both still strings).
     int pos = s.IndexOf("/");
     return new { X = s.Substring(0, pos), Y = s.Substring(pos + 1) };
 });

If the file format gets a lot more complicated, you can refactor the code to use a regular expression. Until then, simple code like this is much easier to maintain.
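If the coordinates are needed as integers and malformed input should be reported, the split-based approach could be extended along these lines. This is only a sketch: the names SplitDemo and ParseCoordinates are hypothetical, value tuples require C# 7 or later, and int.TryParse is more lenient than [0-9]+ (it also accepts a leading sign and surrounding whitespace).

```csharp
using System;
using System.Collections.Generic;

public static class SplitDemo
{
    // Hypothetical helper: splits on ',' and '/', and rejects malformed items.
    public static List<(int X, int Y)> ParseCoordinates(string fileContent)
    {
        var result = new List<(int X, int Y)>();
        foreach (string item in fileContent.Trim().Split(','))
        {
            string[] parts = item.Split('/');
            if (parts.Length != 2
                || !int.TryParse(parts[0], out int x)
                || !int.TryParse(parts[1], out int y))
            {
                throw new FormatException("Invalid coordinate: '" + item + "'");
            }
            result.Add((x, y));
        }
        return result;
    }

    public static void Main()
    {
        foreach (var (x, y) in ParseCoordinates("1302/1425,1917/2010"))
            Console.WriteLine("X=" + x + ", Y=" + y);
    }
}
```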

+3

You can get what you are looking for if you use the Matches method instead of the Match method. Also, couldn't you shorten the regex, perhaps to this:

 Regex(@"((?<Coor>(?<X>[0-9]+)/(?<Y>[0-9]+))|,)*"); 
+2

I think your first problem is that your regex is flawed: the anchors get in the way of the matching. This is the one I came up with (only the expression is shown here, no code):

(?<Coor>(?<X>[0-9]+)/(?<Y>[0-9]+))

Mystagogue's version works as well, but produces "empty" matches for the commas (for me).
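For illustration, the unanchored expression can be combined with Regex.Matches roughly as follows. This is a sketch rather than a definitive implementation: MatchesDemo and FindPairs are made-up names, and value tuples assume C# 7 or later. Note that, without anchors, this extracts pairs but does not validate the string as a whole.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MatchesDemo
{
    // Hypothetical helper: returns one entry per X/Y pair found anywhere in the input.
    public static List<(string Coor, int X, int Y)> FindPairs(string input)
    {
        var pair = new Regex(@"(?<Coor>(?<X>[0-9]+)/(?<Y>[0-9]+))");
        var result = new List<(string Coor, int X, int Y)>();

        // Each Match is one coordinate, with its own Coor, X and Y groups.
        foreach (Match m in pair.Matches(input))
        {
            string coor = m.Groups["Coor"].Value;
            int x = int.Parse(m.Groups["X"].Value);
            int y = int.Parse(m.Groups["Y"].Value);
            result.Add((coor, x, y));
        }
        return result;
    }

    public static void Main()
    {
        foreach (var p in FindPairs("1302/1425,1917/2010"))
            Console.WriteLine(p.Coor + ": X=" + p.X + ", Y=" + p.Y);
    }
}
```

Each Match then plays the role of one node of the tree asked for in the question, with Coor as the node and X and Y as its children.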

+1
