Regex captures an extra character

I am using PHP preg_replace with the following regex:

/(?<=#EXTINF:([0-9])+,).+?(?=#EXT)/gsm

It is applied to the following input:

 #EXTM3U
 #EXT-X-TARGETDURATION:10
 #EXTINF:10,
 Grab_this_string
 #EXTINF:5,
 Grab_this_string_too
 #EXT-X-ENDLIST

This matches:

 , Grab_this_string Grab_this_string_too 

I am trying to match it without the leading comma (essentially everything between #EXTINF:xx, and the following #EXTINF):

 Grab_this_string Grab_this_string_too 
1 answer

Since you are in multi-line mode, you can use the line anchors ^ and $ to delimit each line.

The regex below matches two lines and replaces them with only the first line (effectively deleting the second line). Notice that I removed the "dotall" (s) modifier.

 $regex = '/(^#EXTINF:\d+,$)(\s+)^.+$(?=\s+^#EXT)/m';
 echo preg_replace($regex, '$1', $str);

Output:

 #EXTM3U
 #EXT-X-TARGETDURATION:10
 #EXTINF:10,
 #EXTINF:5,
 #EXT-X-ENDLIST

Update:

Using lookbehind will not work here, because it would require variable-length matching inside the lookbehind, which most regex engines do not support (including PCRE, which PHP uses).
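To illustrate the limitation, here is a minimal sketch. It assumes the playlist is stored in $str with one entry per line (the line breaks are an assumption, inferred from the use of ^ and $ above). The first call shows PCRE rejecting the unbounded lookbehind; the second is a plain capture-group alternative, which is not what the original regex did but yields the same strings without the comma.

```php
<?php
// Sample input; the line breaks are assumed, the original post shows it flattened.
$str = "#EXTM3U\n#EXT-X-TARGETDURATION:10\n#EXTINF:10,\nGrab_this_string\n"
     . "#EXTINF:5,\nGrab_this_string_too\n#EXT-X-ENDLIST\n";

// \d+ inside a lookbehind has no fixed (or bounded) length, so the pattern
// fails to compile. @ hides the E_WARNING so the return value can be checked.
$lookbehindResult = @preg_match('/(?<=#EXTINF:\d+,).+/', $str);
var_dump($lookbehindResult); // false signals a compile failure

// Alternative: match the #EXTINF line normally and capture the next line.
preg_match_all('/^#EXTINF:\d+,\s+(.+)$/m', $str, $m);
$titles = $m[1];
print_r($titles);
```

The capture group is the most portable option when you only need to read the values; it does not help if you want the match itself to cover just the second line.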

If you want to match only the line that you want to delete, and you do not need to replace the two lines with a captured group as in the first example, you can use the \K escape sequence to simulate a lookbehind that is not subject to the fixed-length restriction. \K resets the starting position of the match, so anything that was matched before \K is not included in the final match. (See the last paragraph here.)

 $regex = '/^#EXTINF:\d+,\s+\K^.+?(?=#EXT)/sm';
 echo preg_replace($regex, '', $str);
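As a self-contained sketch of the \K approach, the snippet below runs the same regex against a sample playlist. The contents of $str, including the line breaks, are an assumption based on the input shown in the question. Because the s modifier is on, each match also includes the newline that follows the title, so the titles are trimmed before use.

```php
<?php
// Sample input; the line breaks are assumed, the original post shows it flattened.
$str = "#EXTM3U\n#EXT-X-TARGETDURATION:10\n#EXTINF:10,\nGrab_this_string\n"
     . "#EXTINF:5,\nGrab_this_string_too\n#EXT-X-ENDLIST\n";

$regex = '/^#EXTINF:\d+,\s+\K^.+?(?=#EXT)/sm';

// Everything before \K must match, but is dropped from the reported match.
preg_match_all($regex, $str, $matches);
$titles = array_map('trim', $matches[0]);
print_r($titles);

// Replacing the match with an empty string deletes only the title lines.
$cleaned = preg_replace($regex, '', $str);
echo $cleaned;
```

The #EXTINF lines themselves stay untouched because they sit before \K and are therefore never part of the replaced text.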