Here is a long explanation of why your code is not working.
The /g modifier changes the behavior of the regular expression to "global matching". This will match all occurrences of the pattern in the string. However, how this is done depends on the context. The two (main) contexts in Perl are the list context (plural) and the scalar context (singular).
In list context, a global match returns a list of all matched substrings or, if the pattern contains capture groups, a flat list of all captures:
    local $_ = "foobaa";    # "my $_" is a compile error in current Perls
    my $regex = qr/[aeiou]/;
    my @matches = /$regex/g;    # ("o", "o", "a", "a")
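To make the difference between "all substrings" and "a flat list of captures" concrete, here is a minimal sketch. The sample strings and variable names are made up for illustration and are not from the question.

```perl
use strict;
use warnings;
use feature 'say';

my $text = "foobaa";

# No capture groups: each list element is one whole match.
my @vowels = $text =~ /[aeiou]/g;
say "@vowels";    # o o a a

# With capture groups: the captures of every match are flattened into one list.
my @pairs = "a=1, b=2" =~ /(\w)=(\d)/g;
say "@pairs";     # a 1 b 2
```

Because the captures arrive as one flat list, assigning them to a hash (`my %h = ...`) is a common way to consume key/value patterns like the second one.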
In scalar context, the match appears to return a Perl boolean that says whether the regular expression matched:
    my $match = /$regex/g;
    say $match;    # 1 (true)
However, the regular expression has turned into an iterator. Each time the match is executed, it starts at the current position in the string and tries to match from there. If it matches, it returns true and advances the position. If the match fails, then
- the match returns false, and
- the current position in the string is reset to the beginning.
Because the position was reset, the next match attempt starts over and succeeds again.
    my $match;
    say $match while $match = /$regex/g;
    say "The match returned false, or the while loop would have gone on forever";
    say "But we can match again" if /$regex/g;
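The iterator behaviour is easier to see when the position is printed after every match. This is only a sketch with an illustrative string; it records `pos` after each successful match and then checks that the failed match reset it.

```perl
use strict;
use warnings;
use feature 'say';

my $text = "foobaa";
my @positions;

# Each successful scalar-context match advances pos($text).
while ($text =~ /[aeiou]/g) {
    push @positions, pos($text);
}
say "@positions";    # 2 3 5 6

# The failed match that ended the loop reset the position to undef.
my $state = defined(pos($text)) ? "still set" : "reset";
say "position is $state";    # position is reset

# So the next match starts from the beginning of the string again.
my $again = $text =~ /[aeiou]/g;
say pos($text) if $again;    # 2
```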
The second effect, resetting the position, can be suppressed with the additional /c flag.
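A small side-by-side sketch of what /c changes, using a made-up string. The first pair of matches omits /c, the second pair uses it.

```perl
use strict;
use warnings;
use feature 'say';

my $text = "ab12";

# Without /c: a failed /g match resets the position.
my $ok  = $text =~ /[a-z]+/g;    # matches "ab"; pos($text) is now 2
my $bad = $text =~ /[a-z]+/g;    # fails: no letters at or after offset 2
my $without_c = pos($text);      # undef, the position was reset

# With /c: the position survives the failed match.
$ok  = $text =~ /[a-z]+/gc;      # starts over at 0, matches "ab" again
$bad = $text =~ /[a-z]+/gc;      # fails, but pos($text) stays at 2
my $with_c = pos($text);

say defined($without_c) ? "without /c: $without_c" : "without /c: undef";
say "with /c: $with_c";
```

Keeping the position after a failure is what allows several alternative patterns to be tried in turn at the same spot.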
The position in the string is accessible through the pos function: pos($string) returns the current position, and assigning to it, as in pos($string) = 0, sets it.
A regular expression can also be anchored at the current position with the \G assertion, just as ^ anchors it at the beginning of the string.
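The following sketch combines assigning to pos with \G. The string is an invented example; the point is that \G refuses to scan forward, while an unanchored /g match is free to.

```perl
use strict;
use warnings;
use feature 'say';

my $text = "aXbXc";

pos($text) = 2;    # the next /g match starts at offset 2, the "b"

# \G requires the match to begin exactly at pos($text).
my $anchored = $text =~ /\GX/gc;    # fails: the character at offset 2 is "b"

# Without \G the engine may scan forward from pos($text).
my $scanned = $text =~ /X/gc;       # succeeds: finds the "X" at offset 3

say $anchored ? "anchored: matched" : "anchored: failed";
say $scanned  ? "scanned: matched"  : "scanned: failed";
say pos($text);    # 4, just past the second "X"
```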
This m//gc style of matching makes it easy to write a tokenizer:
    my @tokens;
    local $_ = "1, abc, 2 ";
    TOKEN: while (pos($_) < length($_)) {
        /\G\s+/gc and next;    # skip whitespace
Output:
[NUM 1] [COMMA] [STR abc] [COMMA] [NUM 2]
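For reference, here is one complete, runnable tokenizer in this style that yields the tokens listed above. It is a sketch, not necessarily the exact code the answer had in mind: the three token patterns, the `$input` variable, and the printing loop are assumptions inferred from the output.

```perl
use strict;
use warnings;
use feature 'say';

my $input = "1, abc, 2 ";
my @tokens;

pos($input) = 0;    # make the start position explicit (avoids an undef warning)
TOKEN: while (pos($input) < length($input)) {
    next TOKEN if $input =~ /\G\s+/gc;    # skip whitespace

    # Thanks to /c, a failed attempt leaves pos() alone, so the next
    # pattern is tried at the same position.
    if    ($input =~ /\G(\d+)/gc) { push @tokens, [NUM => $1] }
    elsif ($input =~ /\G,/gc)     { push @tokens, ['COMMA'] }
    elsif ($input =~ /\G(\w+)/gc) { push @tokens, [STR => $1] }
    else                          { last TOKEN }    # nothing matched here: stop
}

say "[@$_]" for @tokens;    # one token per line, e.g. [NUM 1]
```

The order of the branches matters: \d+ is tried before \w+, because \w also matches digits and would otherwise classify numbers as strings.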