Perl - replacing sequences of identical characters

I am trying to write a regex that, given a string, finds every sequence of at least 3 identical characters and replaces it with two of that character. For example, I want to turn the string below:

sstttttrrrrrrriing 

into

 ssttrriing 

I thought of something like this:

 $string =~ s/(\D{3,})/substr($1, 0, 2)/e; 

But this will not work, because:

  • It does not check that the three characters are identical; it can match a sequence of three or more different characters.
  • It replaces only the first match; I need to replace all matches.
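
Both problems can be reproduced with a minimal sketch. It assumes the attempt above is wrapped in a sub called first_attempt, a name made up for this illustration:

```perl
use strict;
use warnings;

# The attempt from the question, wrapped in a sub (hypothetical name).
sub first_attempt {
    my ($string) = @_;
    $string =~ s/(\D{3,})/substr($1, 0, 2)/e;
    return $string;
}

# \D matches any non-digit, so \D{3,} swallows the whole string in one match.
print first_attempt('sstttttrrrrrrriing'), "\n";   # prints "ss"

# Without /g only the first match is replaced; "bbb" is left untouched.
print first_attempt('aaa1bbb'), "\n";              # prints "aa1bbb"
```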

Can anybody help me?

+4
2 answers

You can use a capture group with a backreference (\1) to match the repeated character, and then insert the captured character twice in the replacement.

 $ perl -plwe 's/(.)\1{2,}/$1$1/g'
 sstttttrrrrrrriing
 ssttrriing
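
The same substitution could also live in a script rather than a one-liner. This is only a sketch; the sub name squeeze_runs is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: collapse every run of 3+ identical characters to 2.
sub squeeze_runs {
    my ($text) = @_;
    # (.) captures one character, \1{2,} requires at least two more copies,
    # and /g repeats the substitution for every run in the string.
    $text =~ s/(.)\1{2,}/$1$1/g;
    return $text;
}

print squeeze_runs('sstttttrrrrrrriing'), "\n";   # prints "ssttrriing"
print squeeze_runs('aab'), "\n";                  # prints "aab" (a run of two is left alone)
```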

Or you can use the \K (keep) escape sequence to avoid re-inserting the characters.

 s/(.)\1\K\1+//g 
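
To illustrate what \K does here, the sketch below prints the part of the match that actually gets deleted. The sub name squeeze_keep is made up for the example, and \K needs Perl 5.10 or later:

```perl
use strict;
use warnings;
use 5.010;   # \K is available from Perl 5.10 onward

# Hypothetical helper using the \K form of the substitution.
sub squeeze_keep {
    my ($text) = @_;
    # (.)\1 matches the two characters to keep; \K drops them from the
    # reported match, so only the extra copies (\1+) are deleted.
    $text =~ s/(.)\1\K\1+//g;
    return $text;
}

# $& holds only what was matched after \K.
print "deleted: $&\n" if 'aaaaa' =~ /(.)\1\K\1+/;   # prints "deleted: aaa"

print squeeze_keep('sstttttrrrrrrriing'), "\n";     # prints "ssttrriing"
```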

Replace the wildcard . with any suitable character (class) if necessary. For example, for letters:

 perl -plwe 's/(\pL)\1\K\1+//g' 
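
As a rough check of the letters-only variant, the sketch below feeds it a string that also contains repeated digits and punctuation, which should pass through unchanged. The sub name squeeze_letters is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: squeeze runs of letters only.
sub squeeze_letters {
    my ($text) = @_;
    # \pL matches a single letter, so runs of digits or punctuation never match.
    $text =~ s/(\pL)\1\K\1+//g;
    return $text;
}

print squeeze_letters('aaa1111bbb---'), "\n";   # prints "aa1111bb---"
```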
+12
 $ echo "sssssttttttrrrrriiiinnnnggg" | perl -pe "s/(.)\1+/\1\1/g"
 ssttrriinngg
+3

Source: https://habr.com/ru/post/1414643/

