PHP: regex to ignore escaped quotes within quotes

I looked through related questions before posting this, but I was not able to adapt any of the answers to my approach (I am not good with regular expressions).

Basically, here are my existing lines:

$code = preg_replace_callback( '/"(.*?)"/', array( &$this, '_getPHPString' ), $code );
$code = preg_replace_callback( "#'(.*?)'#", array( &$this, '_getPHPString' ), $code );

These match the strings contained between '' and "". I need a regular expression that ignores escaped quotes inside the string itself. That is, the data between '' should ignore \', and the data between "" should ignore \".

Any help would be greatly appreciated.

+26
php regex
Apr 17 2018-11-17T00:
5 answers

For most strings, you need to allow any escaped character (not just escaped quotes). For example, you most likely need to allow escape sequences such as "\n" and "\t", and, of course, the escaped escape: "\\".
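
To illustrate why the escaped backslash matters, here is a minimal sketch. The sample text and variable names are made up for illustration. A pattern that only skips \" runs past the end of a string that finishes with \\, while a pattern that skips any escaped character stops in the right place:

```php
<?php
// Illustrative input: the first literal ends with an escaped backslash.
// The text is: "a\\" "b"
$text = '"a\\\\" "b"';

// Skips only escaped quotes: mistakes the second backslash plus the
// closing quote for an escaped quote and keeps going.
preg_match('/"(?:\\\\"|[^"])*"/', $text, $quoteOnly);

// Skips any escaped character: consumes the backslash pair as one unit.
preg_match('/"(?:\\\\.|[^"\\\\])*"/', $text, $anyEscape);

echo $quoteOnly[0], "\n";
echo $anyEscape[0], "\n";
```

The first pattern swallows the space and the opening quote of the next literal; the second returns only the first literal.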

This is a frequently asked question, and one that was solved (and optimized) long ago. Jeffrey Friedl covers it in depth (as an example) in his classic work, Mastering Regular Expressions (3rd edition). Here are the regexes you are looking for:

Good:

"([^"\\]|\\.)*"
Version 1: works correctly, but is not very efficient.

Better:

"([^"\\]++|\\.)*" or "((?>[^"\\]+)|\\.)*"
Version 2: more efficient if possessive quantifiers or atomic groups are available (see sin's correct answer, which uses the atomic group method).

Best:

"[^"\\]*(?:\\.[^"\\]*)*"
Version 3: more efficient still. It implements Friedl's "unrolling-the-loop" technique. It does not require possessive quantifiers or atomic groups (i.e. it can be used in JavaScript and other less feature-rich regex engines).
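
As a quick sanity check, the sketch below runs the three versions through preg_match_all() (PHP's PCRE engine supports possessive quantifiers) and compares what they match. The sample text and array keys are illustrative, and the check says nothing about relative speed:

```php
<?php
// Illustrative input containing an escaped quote, a \t escape and an
// escaped backslash right before a closing quote.
$text = 'a "one \\" two" b "tab\\t" c "end\\\\" d';

$patterns = [
    'alternation' => '/"([^"\\\\]|\\\\.)*"/s',             // version 1
    'possessive'  => '/"([^"\\\\]++|\\\\.)*"/s',           // version 2
    'unrolled'    => '/"[^"\\\\]*(?:\\\\.[^"\\\\]*)*"/s',  // version 3
];

$results = [];
foreach ($patterns as $name => $re) {
    preg_match_all($re, $text, $m);
    $results[$name] = $m[0]; // keep the full matches only
}

print_r($results['unrolled']);
```

All three patterns should find the same three string literals in this input.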

Here are the recommended regular expressions in PHP syntax for double and single quotes:

 $re_dq = '/"[^"\\\\]*(?:\\\\.[^"\\\\]*)*"/s';
 $re_sq = "/'[^'\\\\]*(?:\\\\.[^'\\\\]*)*'/s";
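
A minimal sketch of how these two patterns might be dropped into the preg_replace_callback() calls from the question. The closure $mark is a hypothetical stand-in for the question's _getPHPString method (whose body is not shown), and the sample $code is made up:

```php
<?php
$re_dq = '/"[^"\\\\]*(?:\\\\.[^"\\\\]*)*"/s';
$re_sq = "/'[^'\\\\]*(?:\\\\.[^'\\\\]*)*'/s";

// Hypothetical callback: wraps each matched string literal in brackets
// so the match boundaries are visible.
$mark = function (array $m): string {
    return '[' . $m[0] . ']';
};

// Illustrative input: echo "a \"b\" c" . 'it\'s';
$code = 'echo "a \\"b\\" c" . \'it\\\'s\';';

$code = preg_replace_callback($re_dq, $mark, $code);
$code = preg_replace_callback($re_sq, $mark, $code);

echo $code, "\n";
```

Each literal is matched as a whole, escaped quotes included.
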
+64
Apr 17 '11 at 20:13

Try this regex:

 '/"(\\\\[\\\\"]|[^\\\\"])*"/' 

A (short) explanation:

 "               # match a `"`
 (               # open group 1
   \\\\[\\\\"]   #   match either `\\` or `\"`
   |             #   OR
   [^\\\\"]      #   match any char other than `\` and `"`
 )*              # close group 1, and repeat it zero or more times
 "               # match a `"`

The following snippet:

 <?php
 $text = 'abc "string \\\\ \\" literal" def';
 preg_match_all('/"(\\\\[\\\\"]|[^\\\\"])*"/', $text, $matches);
 echo $text . "\n";
 print_r($matches);
 ?>

gives:

 abc "string \\ \" literal" def
 Array
 (
     [0] => Array
         (
             [0] => "string \\ \" literal"
         )

     [1] => Array
         (
             [0] => l
         )

 )

as you can see on Ideone.
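
Note that group 1 in the output above holds only the last repetition (the final "l"), because a repeated capturing group keeps just its last iteration. If the whole body between the quotes is wanted, one possible variation (a sketch, not part of the answer above) makes the repeated group non-capturing and wraps it in an outer capturing group:

```php
<?php
$text = 'abc "string \\\\ \\" literal" def';

// Same alternation as above, but the repeated group is non-capturing and
// the new outer group 1 captures everything between the quotes.
preg_match_all('/"((?:\\\\[\\\\"]|[^\\\\"])*)"/', $text, $matches);

echo $matches[0][0], "\n"; // full match, quotes included
echo $matches[1][0], "\n"; // body only, quotes excluded
```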

+10
Apr 17 '11 at 18:01

Based on some quick tests, this seems to be as fast as the unrolled loop, but it is much easier to read and understand. For a start, it does not require any backtracking.

 "[^"\\]*(\\.[^"\\]*)*" 
+1
Sep 07 '13 at 7:03

Here is one possibility, using atomic groups:

/"(?>(?:(?>[^"\\]+)|\\.)*)"/

/'(?>(?:(?>[^'\\]+)|\\.)*)'/

0
Apr 17 '11 at 18:12

This will leave the quotes out of the match:

 (?<=['"])(.*?)(?=["'])

Using the global flag (/g) will match every occurrence. PHP has no /g modifier, so use preg_match_all() there instead.

0
Jun 25 '13 at 9:47


