Retrieving Groups and Subgroups in RegEx

This question is, in a way, a continuation of my earlier answer to the question: Setting "Unterminated []." Error in C#

I use a regex in C# to retrieve URLs:

Regex find = new Regex(@"(?<First>[,""]url=)(?<Url>[^\\]+)(?<Last>\\u00)"); 

If the text contains URLs in the format:

url=http://domain.com?itag=25\u0026,url=http://hello.com?itag=11\u0026

I get the entire URL in the Url group, but I would also like to have the itag value in a separate iTag group. I know that this can be done using subgroups, and I have tried, but I can't figure out how to do it.

1 answer

You already have named groups defined in your regex. The syntax (?<First>...) assigns the name First to whatever the enclosing brackets match.

Once you have a Match, use its Groups property to access the GroupCollection and retrieve a group's value by name.

 var first = regex.Match(line).Groups["First"].Value; 
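As a slightly fuller sketch of the same lookup, the snippet below checks whether the group actually matched before using its value. The helper name TryGetGroup and the sample line are assumptions made for illustration; the pattern is the one from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    // Hypothetical helper: looks up a named group and reports whether it matched.
    public static bool TryGetGroup(Regex regex, string line, string groupName, out string value)
    {
        Group group = regex.Match(line).Groups[groupName];
        value = group.Value; // empty string when the group did not take part in a match
        return group.Success;
    }

    public static void Main()
    {
        // Pattern from the question (verbatim string, so "" is one double quote).
        Regex regex = new Regex(@"(?<First>[,""]url=)(?<Url>[^\\]+)(?<Last>\\u00)");

        // Assumed sample input containing a literal backslash followed by u0026.
        string line = @",url=http://domain.com?itag=25\u0026";

        string url;
        if (TryGetGroup(regex, line, "Url", out url))
        {
            Console.WriteLine(url);
        }
    }
}
```

Checking Group.Success avoids treating the empty string from a failed match as a real value.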

The pattern below adds a nested group for iTag while the Url group still captures the full URL. If you do not want the itag value included in Url, move the iTag group outside of the Url brackets.

 (?<First>[,""]url=)(?<Url>[^\?]+\?itag=(?<iTag>[0-9]*))(?<Last>\\u0026) 

Here is the code.

 Regex regex = new Regex("(?<First>[,\"]url=)(?<Url>[^\\?]*\\?itag=(?<iTag>[0-9]*))(?<Last>\\u0026)");
 string input = ",url=http://domain.com?itag=25\u0026,url=http://hello.com?itag=11\u0026";
 foreach (Match match in regex.Matches(input))
 {
     System.Console.WriteLine("1. " + match);
     System.Console.WriteLine(" 1. " + match.Groups["First"]);
     System.Console.WriteLine(" 2. " + match.Groups["Url"]);
     System.Console.WriteLine(" 3. " + match.Groups["iTag"]);
     System.Console.WriteLine(" 4. " + match.Groups["Last"]);
 }

Results:

 1. ,url=http://domain.com?itag=25&
  1. ,url=
  2. http://domain.com?itag=25
  3. 25
  4. &
 1. ,url=http://hello.com?itag=11&
  1. ,url=
  2. http://hello.com?itag=11
  3. 11
  4. &
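For the alternative mentioned earlier, where iTag sits outside of the Url brackets, a minimal sketch could look like the following. The class and method names are made up for illustration, and the input is assumed to contain a literal backslash followed by u0026, as in the question's text, which is why verbatim strings are used.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SeparateITagDemo
{
    // iTag is a sibling of Url here rather than nested inside it,
    // so Url stops just before the '?' and no longer includes the itag value.
    private static readonly Regex Pattern = new Regex(
        @"(?<First>[,""]url=)(?<Url>[^\?]+)\?itag=(?<iTag>[0-9]*)(?<Last>\\u0026)");

    // Hypothetical helper: returns (Url, iTag) pairs for every match in the input.
    public static List<KeyValuePair<string, string>> Extract(string input)
    {
        var pairs = new List<KeyValuePair<string, string>>();
        foreach (Match match in Pattern.Matches(input))
        {
            pairs.Add(new KeyValuePair<string, string>(
                match.Groups["Url"].Value, match.Groups["iTag"].Value));
        }
        return pairs;
    }

    public static void Main()
    {
        // Verbatim string: \u0026 stays as six literal characters, not '&'.
        string input = @",url=http://domain.com?itag=25\u0026,url=http://hello.com?itag=11\u0026";
        foreach (var pair in Extract(input))
        {
            Console.WriteLine(pair.Key + " -> " + pair.Value);
        }
    }
}
```

With this layout the Url group holds only the part before the query string, and the itag value is available on its own through the iTag group.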