Perl Regex - Get the offset of all matches instead of one

I want to search a file for a string, and then get the offsets of all matches. The contents of the file are as follows:

sometext sometext AAA sometext AAA AAA sometext 

I read the whole file into the string $text, and then run the regex for AAA as follows:

 if($text =~ m/AAA/g) { $offset = $-[0]; } 

This gives the offset of only one AAA. How can I get the offsets of all matches?

I know that we can get all matches in an array using syntax like this:

my @matches = ($text =~ m/AAA/g);

But I need the offsets, not the matched strings.

I am currently using the following code to get the offsets of all matches:

    my $text = "sometextAAAsometextAAA";
    my $regex = 'AAA';
    my @matches = ();
    while ($text =~ /($regex)/gi) {
        my $match  = $1;
        my $length = length($&);
        my $pos    = length($`);
        my $start  = $pos + 1;
        my $end    = $pos + $length;
        my $hitpos = "$start-$end";
        push @matches, "$match found at $hitpos ";
    }
    print "$_\n" foreach @matches;

But is there an easier way?

2 answers

I don't think there is a built-in way to do this in Perl. But, from "How to find the location of a regex in Perl?":

    sub match_all_positions {
        my ($regex, $string) = @_;
        my @ret;
        while ($string =~ /$regex/g) {
            push @ret, [ $-[0], $+[0] ];
        }
        return @ret;
    }
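As a rough usage sketch, here is that helper applied to the sample line from the question. The variable names `$text` and `@positions` are only illustrative, and the pattern is passed as `qr/AAA/` by choice (a plain string would work too).

```perl
use strict;
use warnings;

# Same helper as in the answer: returns one [start, end] pair per match.
sub match_all_positions {
    my ($regex, $string) = @_;
    my @ret;
    while ($string =~ /$regex/g) {
        push @ret, [ $-[0], $+[0] ];
    }
    return @ret;
}

my $text      = "sometext sometext AAA sometext AAA AAA sometext";
my @positions = match_all_positions(qr/AAA/, $text);

# $-[0] is the 0-based start of the match; $+[0] is the offset just past its end.
print "match at $_->[0]..$_->[1]\n" for @positions;
# match at 18..21
# match at 31..34
# match at 35..38
```

Note that the end offset is exclusive, so `$+[0] - $-[0]` is the length of the match.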

You already know that you should use $-[0]! Replace

    while ($text =~ /($regex)/gi) {
        my $match  = $1;
        my $length = length($&);
        my $pos    = length($`);
        my $start  = $pos + 1;
        my $end    = $pos + $length;
        my $hitpos = "$start-$end";
        push @matches, "$match found at $hitpos ";
    }

with

 while ($text =~ /($regex)/gi){ push @matches, "$1 found at $-[0]"; } 

However, I am a big fan of separating computation from output formatting, so I would do

 while ($text =~ /($regex)/gi){ push @matches, [ $1, $-[0] ]; } 
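For instance, with the matches stored as [string, offset] pairs, the formatting can happen in a separate step. This is only a sketch using the sample data from the question; the `@lines` array is a name made up for the example.

```perl
use strict;
use warnings;

my $text  = "sometextAAAsometextAAA";
my $regex = 'AAA';

# Step 1: collect raw data only, one [matched string, 0-based offset] pair per hit.
my @matches;
while ($text =~ /($regex)/gi) {
    push @matches, [ $1, $-[0] ];
}

# Step 2: format for output separately.
my @lines = map { sprintf "%s found at %d", $_->[0], $_->[1] } @matches;
print "$_\n" for @lines;
# AAA found at 8
# AAA found at 19
```

Keeping the pairs around means the same data can later be printed differently, or used for something other than printing, without touching the matching loop.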

PS - Unless you are unrolling a while loop, if (/.../g) makes no sense. At best, the /g does nothing. At worst, you get wrong results.
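A small sketch of the "wrong results" case, assuming the same string is tested twice: in scalar context, /g resumes from the position where the previous successful match ended, so the second test starts searching after the only AAA. The loop and the `@seen` array exist only for this demonstration.

```perl
use strict;
use warnings;

my $text = "AAA sometext";

my @seen;
for my $attempt (1 .. 2) {
    # Scalar-context /g resumes from pos($text), left behind by the previous match.
    if ($text =~ /AAA/g) {
        push @seen, "attempt $attempt: match at $-[0]";
    }
    else {
        push @seen, "attempt $attempt: no match";
    }
}
print "$_\n" for @seen;
# attempt 1: match at 0
# attempt 2: no match
```

Dropping the /g from the if makes both attempts report the match at offset 0.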
