What you propose is difficult but doable.
If I understand correctly, you want to find out how far a failing match got before it failed. To do this, you need to parse the regular expression.
The best regexp parser is probably Perl itself, run with the -Mre=debug command-line switch:
    $ perl -Mre=debug -e'"abcdefghijklmnopqr"=~/gh[ijkl]{5}/'
    Compiling REx "gh[ijkl]{5}"
    Final program:
       1: EXACT <gh> (3)
       3: CURLY {5,5} (16)
       5:   ANYOF[il][] (0)
      16: END (0)
    anchored "gh" at 0 (checking anchored) minlen 7
    Guessing start of match in sv for REx "gh[ijkl]{5}" against "abcdefghijklmnopqr"
    Found anchored substr "gh" at offset 6...
    Starting position does not contradict /^/m...
    Guessed: match at offset 6
    Matching REx "gh[ijkl]{5}" against "ghijklmnopqr"
       6 <bcdef> <ghijklmnop>    |  1:EXACT <gh>(3)
       8 <defgh> <ijklmnopqr>    |  3:CURLY {5,5}(16)
                                      ANYOF[il][] can match 4 times out of 5...
                                      failed...
    Match failed
    Freeing REx: "gh[ijkl]{5}"
You can shell out to this Perl command line with your own regular expression and parse what it prints (the trace normally goes to stderr rather than stdout, so redirect it with 2>&1). Find `
Here is a regex that does match:
    $ perl -Mre=debug -e'"abcdefghijklmnopqr"=~/gh[ijkl]{3}/'
    Compiling REx "gh[ijkl]{3}"
    Final program:
       1: EXACT <gh> (3)
       3: CURLY {3,3} (16)
       5:   ANYOF[il][] (0)
      16: END (0)
    anchored "gh" at 0 (checking anchored) minlen 5
    Guessing start of match in sv for REx "gh[ijkl]{3}" against "abcdefghijklmnopqr"
    Found anchored substr "gh" at offset 6...
    Starting position does not contradict /^/m...
    Guessed: match at offset 6
    Matching REx "gh[ijkl]{3}" against "ghijklmnopqr"
       6 <bcdef> <ghijklmnop>    |  1:EXACT <gh>(3)
       8 <defgh> <ijklmnopqr>    |  3:CURLY {3,3}(16)
                                      ANYOF[il][] can match 3 times out of 3...
      11 <ghijk> <lmnopqr>       | 16: END(0)
    Match successful!
    Freeing REx: "gh[ijkl]{3}"
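If you are calling Perl from another program, shelling out could look roughly like the Python sketch below. The function names are made up for illustration, and it assumes `perl` is on your PATH. Rather than pasting the subject and pattern into the `-e` code as in the one-liners above, it passes them as arguments (`$ARGV[0]`, `$ARGV[1]`) to sidestep quoting problems. I believe the debugger also traces patterns compiled at run time this way, but treat that as an assumption and check it on your Perl. Both output streams are merged so it does not matter which one the trace lands on.

```python
import subprocess


def build_perl_command(subject, pattern):
    """Return an argv list that runs one match under Perl's regex debugger.

    Hypothetical helper: subject and pattern are handed over as $ARGV[0]
    and $ARGV[1] instead of being embedded in the Perl source text.
    """
    return [
        "perl",
        "-Mre=debug",                  # same switch as in the one-liners above
        "-e", "$ARGV[0] =~ /$ARGV[1]/",
        "--",                          # stop perl from reading later args as switches
        subject,
        pattern,
    ]


def capture_trace(subject, pattern):
    """Run the match and return everything perl printed as one string."""
    result = subprocess.run(
        build_perl_command(subject, pattern),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,      # merge stderr into stdout
        text=True,                     # decode bytes to str (Python 3.7+)
        check=False,                   # a failed match is not a process error
    )
    return result.stdout
```

Calling `capture_trace("abcdefghijklmnopqr", "gh[ijkl]{5}")` should then hand back a trace along the lines of the first one above, though the exact layout depends on the Perl version.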
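You will need to write a parser that can handle the output of the Perl regex debugger. In the matching section, the number at the start of each line is the current offset into the string, and the left and right angle brackets show the text just before and just after that position as the regex engine attempts the match.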
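As a starting point only, here is a small Python sketch of such a parser. It assumes the trace layout shown in the two runs above (other Perl versions may format the lines differently), it will be confused by subjects that themselves contain `<` or `>`, and the names `summarize_trace` and `FAILED_TRACE` are invented for the example. It records the largest offset the engine reached, the last node it tried, and any "can match N times out of M" report.

```python
import re

# One execution step, e.g. "   8 <defgh> <ijklmnopqr>    |  3:CURLY {5,5}(16)"
STEP = re.compile(r"^\s*(\d+)\s+<[^>]*>\s*<[^>]*>\s*\|\s*(\d+):\s*(\w+)")
# Quantifier report, e.g. "ANYOF[il][] can match 4 times out of 5..."
REPEAT = re.compile(r"can match (\d+) times out of (\d+)")


def summarize_trace(trace):
    """Summarize one re=debug trace (layout assumed as in the answer above)."""
    summary = {"matched": False, "furthest_offset": None,
               "last_node": None, "repeat": None}
    for line in trace.splitlines():
        step = STEP.match(line)
        if step:
            offset = int(step.group(1))
            if summary["furthest_offset"] is None or offset > summary["furthest_offset"]:
                summary["furthest_offset"] = offset
            summary["last_node"] = step.group(3)
            continue
        repeat = REPEAT.search(line)
        if repeat:
            summary["repeat"] = (int(repeat.group(1)), int(repeat.group(2)))
        elif line.strip() == "Match successful!":
            summary["matched"] = True
    return summary


# Excerpt copied from the failing run above.
FAILED_TRACE = """\
Matching REx "gh[ijkl]{5}" against "ghijklmnopqr"
   6 <bcdef> <ghijklmnop>    |  1:EXACT <gh>(3)
   8 <defgh> <ijklmnopqr>    |  3:CURLY {5,5}(16)
                                  ANYOF[il][] can match 4 times out of 5...
                                  failed...
Match failed
"""

if __name__ == "__main__":
    print(summarize_trace(FAILED_TRACE))
```

For the failing run this reports that the engine got to offset 8, was working on the CURLY node, and matched the character class 4 times out of the 5 required. A real tool would also have to cope with backtracking, where the same offsets show up several times.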
This is not a simple project, by the way...