How can I use lookbehind in C# Regex to skip matches of repeated prefix patterns?

For example, I want an expression that matches all the b characters that follow any number of a characters:

 Regex expression = new Regex("(?<=a).*");
 foreach (Match result in expression.Matches("aaabbbb"))
     MessageBox.Show(result.Value);

This returns aabbbb, because the lookbehind matches only a single a. How can I make it skip all the a characters at the beginning?

I tried

 Regex expression = new Regex("(?<=a+).*"); 

and

 Regex expression = new Regex("(?<=a)+.*"); 

but neither gives the result I want.

What I expect is bbbb.

3 answers

Are you looking for a repeated capturing group?

 (.)\1* 

This returns two matches.

Given:

 aaabbbb

it produces:

 aaa
 bbbb

This:

 (?<=(.))(?!\1).*

uses the same principle: the lookbehind first captures the previous character into a backreference, and the negative lookahead then asserts that the next character is not that same character.

It matches:

 bbbb 
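To check this against the sample input, here is a minimal sketch. The class name and the AllMatches helper are my own, not part of the answer above. As far as I can tell, the .* version also reports an empty match at the very end of the input, because .* can match zero characters there; swapping in .+ avoids that.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class BackreferenceLookaroundDemo
{
    // Hypothetical helper (not from the answer): returns the value of every match.
    public static string[] AllMatches(string pattern, string input) =>
        Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

    public static void Main()
    {
        // Lookbehind captures the previous character; the negative lookahead
        // rejects positions where the next character repeats it.
        foreach (string value in AllMatches(@"(?<=(.))(?!\1).+", "aaabbbb"))
            Console.WriteLine(value);
    }
}
```

Note that this pattern starts matching at the first change of character anywhere in the input, so it is not tied to a specifically.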

In the end, I figured it out:

 Regex expression = new Regex("(?<=a+)[^a]+");
 foreach (Match result in expression.Matches(@"aaabbbb"))
     MessageBox.Show(result.Value);

I must not allow a to be matched by the part outside the lookbehind. That way, the expression matches only those b repetitions that follow a repetitions.

Matching aaabbbb gives bbbb, and matching aaabbbbcccbbbbaaaaaabbzzabbb gives bbbbcccbbbb, bbzz and bbb.
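The longer input can be run through a small console sketch like the one below. The class and helper names are hypothetical. It also tries the pattern with a plain (?<=a) lookbehind: since a position preceded by one or more a characters is always preceded by a single a, I would expect both patterns to give the same matches.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class RunAfterPrefixDemo
{
    // Hypothetical helper: collects the value of every match.
    public static string[] AllMatches(string pattern, string input) =>
        Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

    public static void Main()
    {
        const string input = "aaabbbbcccbbbbaaaaaabbzzabbb";

        // Pattern from the post: runs of non-a characters preceded by a.
        Console.WriteLine(string.Join(", ", AllMatches("(?<=a+)[^a]+", input)));

        // Same pattern with a single-character lookbehind, for comparison.
        Console.WriteLine(string.Join(", ", AllMatches("(?<=a)[^a]+", input)));
    }
}
```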


The reason the extra "a" characters show up is that the lookbehind only checks for one preceding "a" (without including it in the match); the .* then matches everything after it, including the remaining "a" characters.

Will this pattern work for you? New pattern: \ba+(.+)\b It uses the word boundary \b to anchor both ends of the word. It matches at least one "a", followed by the remaining characters up to the closing word boundary. The remaining characters are captured in a group, so you can easily refer to them.
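One way to see this is to print where the match starts. The sketch below is only an illustration, and the class and method names are made up for it. The match begins at index 1, immediately after the first a, which is the first position where the lookbehind can succeed.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookbehindPositionDemo
{
    // Hypothetical helper: returns the first match of a pattern in the input.
    public static Match FirstMatch(string pattern, string input) =>
        Regex.Match(input, pattern);

    public static void Main()
    {
        Match m = FirstMatch("(?<=a).*", "aaabbbb");

        // The lookbehind is zero-width, so the match starts right after the
        // first a, and .* takes everything from there to the end of the input.
        Console.WriteLine($"Index={m.Index}, Length={m.Length}, Value={m.Value}");
    }
}
```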

 string pattern = @"\ba+(.+)\b";
 foreach (Match m in Regex.Matches("aaabbbb", pattern))
 {
     Console.WriteLine("Match: " + m.Value);
     Console.WriteLine("Group capture: " + m.Groups[1].Value);
 }

UPDATE: If you want to skip the leading run of any repeated letter and then match the rest of the string, you can do this:

 string pattern = @"\b(.)(\1)*(?<Content>.+)\b";
 foreach (Match m in Regex.Matches("aaabbbb", pattern))
 {
     Console.WriteLine("Match: " + m.Value);
     Console.WriteLine("Group capture: " + m.Groups["Content"].Value);
 }