Match regular expression with exact offset

I want to check whether a specific pattern (e.g. a double quote) matches at an exact position in a string.

Example

    string text = "aaabbb";
    Regex regex = new Regex("b+");
    // Now match regex at exactly char 3 (offset) of text

I would like to check whether regex matches at exactly char 3.
I looked at the Regex.Match Method (String, Int32), but it does not behave as I expected.
So I ran some tests and tried some workarounds:

    public void RegexTest2()
    {
        Match m;
        string text = "aaabbb";
        int offset = 3;

        m = new Regex("^a+").Match(text, 0); // lets do a sanity check first
        Assert.AreEqual(true, m.Success);
        Assert.AreEqual("aaa", m.Value); // works as expected

        m = new Regex("^b+").Match(text, offset);
        Assert.AreEqual(false, m.Success); // this is quite strange...

        m = new Regex("^.{"+offset+"}(b+)").Match(text); // works, but is not very 'nice'
        Assert.AreEqual(true, m.Success);
        Assert.AreEqual("bbb", m.Groups[1].Value);

        m = new Regex("^b+").Match(text.Substring(offset)); // works too, but
        Assert.AreEqual(true, m.Success);
        Assert.AreEqual("bbb", m.Value);
    }

In fact, I am starting to believe that new Regex("^.").Match(myString, 1) will never match anything.

Any suggestions?

Edit:

I got a working solution (workaround). So my question is about speed and good implementation.

+6
c# regex
3 answers

Have you tried what the docs say?

If you want to restrict a match so that it begins at a particular character position in the string and the regular expression engine does not scan the remainder of the string for a match, anchor the regular expression with a \G (at the left for a left-to-right pattern, or at the right for a right-to-left pattern). This restricts the match so it must start exactly at startat.

That is, replace ^ with \G:

 m = new Regex(@"\Gb+").Match(text, offset); Assert.AreEqual(true, m.Success); // should now work 
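The \G approach can be wrapped in a small helper. The following is only a minimal sketch: the class name AnchoredMatch and the method MatchAt are hypothetical names chosen for illustration, and wrapping the pattern in a non-capturing group is an assumption made here so that alternations such as a|b stay anchored as a whole.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchoredMatch
{
    // Hypothetical helper: the returned match succeeds only if
    // `pattern` matches starting exactly at `offset` in `text`.
    public static Match MatchAt(string pattern, string text, int offset)
    {
        // \G pins the match to the position passed as startat.
        // The non-capturing group keeps every alternative anchored.
        var regex = new Regex(@"\G(?:" + pattern + ")");
        return regex.Match(text, offset);
    }

    public static void Main()
    {
        Match hit = MatchAt("b+", "aaabbb", 3);
        Console.WriteLine(hit.Success + " " + hit.Value); // True bbb

        // At offset 2 the next character is 'a', so there is no match;
        // the engine does not slide forward to offset 3.
        Match miss = MatchAt("b+", "aaabbb", 2);
        Console.WriteLine(miss.Success); // False
    }
}
```

If the same pattern is checked many times, it is probably worth constructing the Regex once and reusing it rather than building it on every call as this sketch does.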
+8

You expect Match(text, offset) to evaluate the pattern as if the string started at offset. It does not: ^ still matches at position 0, not at offset!

So use the Match overload for which ^ matches at offset:

 m = new Regex("^bbb$").Match(text, offset, text.Length-offset); 

Another option is the following, but it is slower than the previous one:

 m = new Regex("^.{"+offset+"}bbb$").Match(text); 

Or this (the first method is the fastest):

 m = new Regex(@"\Gbbb$").Match(text, offset); 
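To make the difference between the two overloads concrete, here is a small sketch that runs the same ^b+ pattern through Match(text, offset) and through Match(text, offset, length). The class and method names (OverloadDemo, WithStartAt, WithWindow) are hypothetical, and the pattern is taken from the question rather than from this answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OverloadDemo
{
    // Two-argument overload: the search starts at `offset`,
    // but ^ still refers to position 0 of the whole string.
    public static Match WithStartAt(string text, int offset)
    {
        return new Regex("^b+").Match(text, offset);
    }

    // Three-argument overload: only the window [offset, offset + length)
    // is searched, and ^ matches at the start of that window.
    public static Match WithWindow(string text, int offset)
    {
        return new Regex("^b+").Match(text, offset, text.Length - offset);
    }

    public static void Main()
    {
        string text = "aaabbb";
        int offset = 3;

        Console.WriteLine(WithStartAt(text, offset).Success); // False

        Match m = WithWindow(text, offset);
        Console.WriteLine(m.Success + " " + m.Value);         // True bbb
    }
}
```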
+1

You can add a positive lookbehind assertion ( (?<=...) ) to your regular expression:

    Regex regex = new Regex(@"(?<=\A.{3})b+");

This ensures that there are exactly three characters ( .{3} ) between the start of the string ( \A ) and the start of the match. You could also use ^ instead of \A, but since ^ can also mean "match at the beginning of a line" (with RegexOptions.Multiline), \A is a bit more explicit.

You may need to compile the regex with RegexOptions.Singleline so that the dot also matches newlines if this is a requirement.
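As an illustration of the lookbehind with a variable offset and of the effect of RegexOptions.Singleline, consider the sketch below. The names LookbehindDemo and BuildAt are hypothetical, and the second test string (with a newline among the first three characters) is an assumption added here to show the difference.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookbehindDemo
{
    // Hypothetical helper: builds (?<=\A.{offset})b+ with the given options.
    public static Regex BuildAt(int offset, RegexOptions options)
    {
        return new Regex(@"(?<=\A.{" + offset + @"})b+", options);
    }

    public static void Main()
    {
        Regex plain = BuildAt(3, RegexOptions.None);
        Regex single = BuildAt(3, RegexOptions.Singleline);

        // No newline among the first three characters: both variants match.
        Console.WriteLine(plain.Match("aaabbb").Value);      // bbb

        // With a newline in the prefix, the dot only matches it
        // when Singleline is set.
        Console.WriteLine(plain.Match("a\nabbb").Success);   // False
        Console.WriteLine(single.Match("a\nabbb").Success);  // True
    }
}
```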

By the way

 m = new Regex("^b+").Match(text, 3); 

does not work, because ^ matches the beginning of the string, and the position before the first b is, of course, not at the beginning of the string.

0