The crux of your problem, as far as specifying the regular expression goes, is a one-byte difference: q versus qr. You are writing a regular expression, so call it what it is. Treating the pattern as a string means you have to deal with the rules for string quoting on top of the rules for regex escaping.
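A small sketch of that difference, using only the script-name fragment of the pattern. The variable names are made up for illustration, and the sample text is an assumption:

```perl
use strict;
use warnings;

# q{} builds a single-quoted string, so \\ collapses to ONE backslash
# before the regex engine ever sees the text.
my $as_string = q{[^\s\\]+};

# qr{} compiles the text as a regex; \\ still means "a literal backslash".
my $as_regex = qr{[^\s\\]+};

print "string holds: $as_string\n";    # string holds: [^\s\]+

# Interpolating the string leaves an unterminated character class,
# because \] now escapes the closing bracket, so compilation dies.
my $compiles = eval { my $re = qr/$as_string/; 1 } ? "yes" : "no";
print "string compiles as a regex: $compiles\n";    # no

# The qr// object stops at the backslash, as intended.
my ($name) = 'script.bi\\' =~ /^($as_regex)/;
print "qr matched: $name\n";           # qr matched: script.bi
```

To get the same regex out of q{}, you would have to double every backslash a second time, which is exactly the extra layer of quoting rules that qr spares you.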
As for the language your regular expression matches, add anchors to force the pattern to match the entire line. The regex engine is extremely determined and will keep working until it finds a match. Without anchors, it is happy to find a substring.
Sometimes this gives you surprising results. Have you ever dealt with a petulant child (or a childlike adult) who takes a narrow, extremely literal interpretation of what you say? The regex engine is like that, but it is trying to help.
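As a rough illustration of the substring behavior, here is a sketch with a simplified pattern and a hypothetical input line that is not a run command at all:

```perl
use strict;
use warnings;

my $line = 'echo rerun backup.sh now';   # hypothetical input

# Unanchored: the engine is free to start and stop anywhere in the line,
# so it finds "run " inside "rerun ".
my ($loose) = $line =~ /run +([^\s\\]+)/;
print "unanchored: ", defined $loose ? $loose : "no match", "\n";   # backup.sh

# Anchored: the whole line has to be a run command.
my ($whole) = $line =~ /^\s*run +([^\s\\]+)\s*$/;
print "anchored:   ", defined $whole ? $whole : "no match", "\n";   # no match
```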
In the last example, the pattern matches because
- You said, with the ? quantifier, that the cred=... subpattern can match zero times, so the regex engine skipped it.
- You said that the script name is the next substring made up of one or more characters that are neither whitespace nor a backslash, so the regex engine saw cred=username/password, none of whose characters is whitespace or a backslash, and matched it.

Regular expressions are greedy: they take whatever is in front of them, regardless of whether that substring "should" be matched by a different subpattern.
The last example fits the pattern, just not in the way you expected. An important lesson with regular expressions is that any quantifier such as ? or * that can match zero times always succeeds!
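Both points can be checked with a short sketch. The input line below is an assumption standing in for a run command that is continued with a backslash, and the pattern is the unanchored form discussed here:

```perl
use strict;
use warnings;

# A quantifier that allows zero repetitions cannot fail on its own.
my $zero = 'abc' =~ /^(x*)/ ? "matched [$1]" : "no match";
print "x* against abc: $zero\n";        # x* against abc: matched []

# Assumed input: credentials followed by a line-continuation backslash.
my $line = 'run cred=username/password \\';

my $loose = qr{run +(?:cred=(?:[^\s']*|\'.*\') +)?([^\s\\]+)};

# The optional group cannot be followed by a script name here, so the
# engine backtracks, matches the group zero times, and lets the
# script-name capture swallow the credentials instead.
my $script = $line =~ $loose ? $1 : undef;
print "\$1=[$script]\n";                # $1=[cred=username/password]
```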
Without the $ anchor, the pattern from your question leaves the trailing backslash unmatched, which you can see with a slight modification to $runpat.
my $runpat = qr{run +(?:cred=(?:[^\s']*|\'.*\') +)?([^\s\\]+)(.*)};
Note the (.*) at the end to grab any non-newline characters that may be left over. Changing the loop to
while (<DATA>) {
  next unless /$runpat/;
  print "line $.: \$1=[$1]; \$2=[$2]\n";
}
gives the following output for line 15.
line 15: $1=[cred=username/password]; $2=[\]
As a complete program, that becomes
#! /usr/bin/env perl

use strict;
use warnings;

# The ' at the end of the next line is a hack to keep the syntax
# highlighter happy. You don't need it in your code.
my $runpat = qr{^\s*run +(?:cred=(?:[^\s']*|\'.*\') +)?([^\s\\]+)$}; # '

while (<DATA>) {
  next unless /$runpat/;
  print "line $.: \$1=[$1]\n";
}

__DATA__
...
Output:
line 2: $1=[script.bi]
line 5: $1=[script.bi]
line 8: $1=[script.bi]
line 11: $1=[script]
Brevity is not always helpful with regular expressions. Consider the following alternative but equivalent specification:
my $runpat = qr{
  ^ \s*
  (?:
    run \s+ cred=(?:[^\s']*|'.*?') \s+ (?<script> [^\s\\]+)  # ' hiliter
  | run \s+ (?!cred=) (?<script> [^\s\\]+)
  )
  \s* $
}x;
Yes, it takes more space to write, but it is much clearer about the acceptable alternatives. Your loop is nearly the same
while (<DATA>) {
  next unless /$runpat/;
  print "line $.: script=[$+{script}]\n";
}
and it even spares the poor reader from having to count parentheses.
To use named capture buffers, e.g., (?<script>...), be sure to add
use 5.10.0;
at the top of your program to provide executable documentation of the minimum required version of perl.
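Putting the pieces together, a self-contained sketch of the named-capture version might look like the following. The four input lines are hypothetical, not the data from your question; the last one ends in a line-continuation backslash and is rejected thanks to the anchors and the (?!cred=) lookahead.

```perl
use 5.10.0;
use strict;
use warnings;

my $runpat = qr{
  ^ \s*
  (?:
    run \s+ cred=(?:[^\s']*|'.*?') \s+ (?<script> [^\s\\]+)
  | run \s+ (?!cred=) (?<script> [^\s\\]+)
  )
  \s* $
}x;

# Hypothetical inputs for illustration only.
my @lines = (
  "run script.bi",
  "run cred=username/password script.bi",
  "run cred='user name/pass word' script",
  "run cred=username/password \\",
);

my @results;
for my $line (@lines) {
  # $+{script} is whichever of the two same-named groups took part.
  push @results, $line =~ $runpat ? $+{script} : "(no match)";
  print "$line => $results[-1]\n";
}
```

The first three lines report script.bi, script.bi, and script; the last one reports (no match).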