Short answer
There is no such thing as a "position in a string that causes a regular expression to fail."
However, I will show you an approach to answering the reverse question:
At which token in the regex did the engine become unable to match the string?
Discussion
In my opinion, the question of which position in the string caused the regular expression to fail is upside down. As the engine moves down the string with its left hand and down the pattern with its right hand, a regex token that matches six characters at one moment can later, because of quantifiers and backtracking, be reduced to matching zero characters, or expanded to match ten.
In my view, a more appropriate question is:
At which token in the regex did the engine become unable to match the string?
For example, consider the regular expression ^\w+\d+$ and the string abc132z.
\w+ can actually match the whole string. Yet the regex as a whole fails. Does it make sense to say that the regex fails at the end of the string? I don't think so. Consider this.
Initially, \w+ matches abc132z. Then the engine moves on to the next token: \d+. At this point, the engine backtracks in the string, gradually letting \w+ give up 2z (so that \w+ now matches only abc13), which allows \d+ to match 2.
At this point, the $ assertion fails because z is left over. The engine backtracks, letting \w+ give up the character 3, then 1 (so that \w+ now matches only abc), eventually allowing \d+ to match 132. At each step, the engine tries the $ assertion and fails. Depending on the engine's internals, more backtracking may occur: \d+ will give up the 2 and the 3 again, then \w+ will give up the c and the b. When the engine finally gives up, \w+ matches only the initial a. Can you say that the regex "fails on the 3"? On the "b"?
No. If you read the regex pattern from left to right, you can argue that it fails on the $, because that is the first token we were unable to add to the match. Bear in mind that there are other ways to argue this.
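As a rough illustration of the walkthrough above, here is a minimal Python sketch. The variable names are mine, and Python's re module is only one backtracking engine among many, so treat it as a sanity check rather than a trace of what every engine does internally.

```python
import re

subject = "abc132z"

# On its own, \w+ consumes the entire string.
whole = re.match(r"^\w+", subject).group()

# Without the final $, backtracking lets \w+ give up "2z"
# so that \d+ can claim the digit 2.
split = re.match(r"^(\w+)(\d+)", subject).groups()

# With $, every split the engine tries is rejected, so the match fails.
full = re.match(r"^\w+\d+$", subject)

print(whole)  # abc132z
print(split)  # ('abc13', '2')
print(full)   # None
```

The three results together show why no single string position can be blamed: the same characters are accepted or handed back depending on which token is being tried.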
Further down, I'll give you a screenshot to visualize this. But first, let's see if we can answer the other question.
The other question
Are there techniques to answer the other question:
At which token in the regex did the engine become unable to match the string?
It depends on your regex. If you can slice your regex into clean components, you can devise an expression with a series of optional lookaheads, each containing a capture group, so that the match always succeeds as long as the string has at least one character. The first unset capture group is the one that caused the failure.
JavaScript is a bit stingy with optional lookaheads, but you can write something like this:
^(?:(?=(\w+)))?(?:(?=(\w+\d+)))?(?:(?=(\w+\d+$)))?.
In PCRE, .NET, Python ... you can write this more compactly:
^(?=(\w+))?(?=(\w+\d+))?(?=(\w+\d+$))?.
What's going on here? Each lookahead builds incrementally on the previous one, adding one token at a time. Therefore, we can test each token separately. The dot at the end is an optional flourish for visual feedback: we can see in a debugger that at least one character is matched, but we don't care about that character, we only care about the capture groups.
- Group 1 tests the token \w+
- Group 2 appears to test \w+\d+, so incrementally it tests the token \d+
- Group 3 appears to test \w+\d+$, so incrementally it tests the token $
There are three capture groups. If all three are set, the match is a complete success. If only group 3 is unset (as with abc123a), you can say that $ caused the failure. If group 1 is set but group 2 is not (as with abc), you can say that \d+ caused the failure.
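To make the group-by-group reading concrete, here is a minimal Python sketch that reports the token behind the first unset group. The names PROBE, TOKENS and first_failing_token are hypothetical, and I use the JavaScript-style form of the pattern (each lookahead wrapped in an optional non-capturing group) on the assumption that it behaves the same way in Python's re module.

```python
import re

# Probe pattern: one optional lookahead per cumulative slice of ^\w+\d+$.
# (Hypothetical names; the pattern is the JavaScript-style form shown above.)
PROBE = re.compile(r"^(?:(?=(\w+)))?(?:(?=(\w+\d+)))?(?:(?=(\w+\d+$)))?.")

# Token that each capture group incrementally tests (groups 1, 2, 3).
TOKENS = [r"\w+", r"\d+", "$"]

def first_failing_token(subject):
    """Return the token behind the first unset group, or None if all are set."""
    m = PROBE.match(subject)
    if m is None:
        # The trailing dot needs one character that is not a newline.
        raise ValueError("probe did not match (empty subject or leading newline)")
    for token, captured in zip(TOKENS, m.groups()):
        if captured is None:
            return token
    return None

for s in ["abc123", "abc123a", "abc", "!!!"]:
    print(s, "->", first_failing_token(s))
```

With this sketch, abc123a reports $, abc reports \d+, and abc123 reports None because all three groups are set. A string such as !!! reports \w+, since even group 1 is unset.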
For reference: an inside view of the failure path
For what it's worth, here is a view of the failure path from the RegexBuddy debugger.
