Writing a regular expression to recursively extract data between curly braces

I am trying to write a regular expression that splits a string into the separate elements enclosed in matching curly braces. First, it must be recursive (handle nested braces), and second, it must return offsets (for example, using PREG_OFFSET_CAPTURE).

To be honest, I suspect a regex is not the most efficient way to process this data, but I'm not sure what a simpler, better-performing approach would be. (If you have one, I would love to hear it!)

So, the input can be in this format:

 Hello {#name}! I'm a {%string|sentence|bit of {#random} text} 

Processing the data is quite simple when it is in this format:

 Hello {#name}! I'm a {%string|sentence|bit of random text} 

But the first input has curly braces nested inside another set of curly braces, which is a problem when it comes to processing. I use the following code to split the string:

 preg_match_all("/(?<={)[^}]*(?=})/m", $string, $braces, PREG_OFFSET_CAPTURE); 

As mentioned earlier, this works well for the simple form, but not for the more complex one. The intention (and I already use it in the non-recursive form) is to replace each brace-delimited area with content produced by processing functions, working upward from the innermost braces.
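To make the limitation concrete, here is a minimal sketch that runs the pattern above on the nested input and prints each match with its offset. Only the loop and the `echo` are added for illustration; the pattern and the input are the ones from this question.

```php
<?php
$string = "Hello {#name}! I'm a {%string|sentence|bit of {#random} text}";

// Same pattern as above: everything between a "{" and the next "}".
preg_match_all("/(?<={)[^}]*(?=})/m", $string, $braces, PREG_OFFSET_CAPTURE);

// Each entry is [matched text, byte offset].
foreach ($braces[0] as [$text, $offset]) {
    echo $offset, ': ', $text, "\n";
}
// 7: #name
// 22: %string|sentence|bit of {#random
```

Because `[^}]*` stops at the first closing brace, the outer group is cut short and the inner `{#random}` is never reported as its own element.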

Ideally, I would like to be able to write Hello {#name}! I'm a {%string|sentence|bit of {?(random == "strange") ? {#random} : "strange"}} text} and have it remain manageable.

Any help would be greatly appreciated.

1 answer

You can use two PCRE features, capturing groups inside lookaheads and subroutine calls, to get the nested {...} substrings.

 $re = "#(?=(\{(?>[^{}]|(?1))*+\}))#";
 $str = "Hello {#name}! I'm a {%string|sentence|bit of {#random} text}";
 preg_match_all($re, $str, $matches, PREG_OFFSET_CAPTURE);
 print_r($matches[1]);

It returns an array of the captured {...} substrings, including nested ones, together with their offsets:

 Array
 (
     [0] => Array
         (
             [0] => {#name}
             [1] => 6
         )

     [1] => Array
         (
             [0] => {%string|sentence|bit of {#random} text}
             [1] => 21
         )

     [2] => Array
         (
             [0] => {#random}
             [1] => 46
         )

 )
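
If the end goal is to replace each group with processed content, working from the innermost braces outward, one possible alternative is to skip the offsets and repeatedly replace only the innermost groups until none remain. This is a different technique from the lookahead pattern above, not a continuation of it. In the sketch below, `resolveTag()`, the sample values, and the "pick the first option" meaning of `%` are all hypothetical stand-ins for whatever functions actually process each tag. It assumes PHP 7.4+ for the arrow function.

```php
<?php
// Hypothetical resolver: stands in for the real tag-processing functions.
function resolveTag(string $inner): string
{
    $values = ['name' => 'World', 'random' => 'odd']; // assumed sample data
    $kind = substr($inner, 0, 1);
    if ($kind === '#') {
        // {#key}: look up a value by name
        return $values[substr($inner, 1)] ?? '';
    }
    if ($kind === '%') {
        // {%a|b|c}: assumed here to mean "pick the first option"
        return explode('|', substr($inner, 1))[0];
    }
    return $inner; // unknown tag type: keep the inner text as-is
}

function expandBraces(string $text): string
{
    // Matches only innermost groups: braces that contain no other braces.
    $innermost = '/\{([^{}]*)\}/';
    do {
        $text = preg_replace_callback(
            $innermost,
            fn(array $m): string => resolveTag($m[1]),
            $text,
            -1,
            $count
        );
    } while ($count > 0); // repeat until no brace groups are left
    return $text;
}

echo expandBraces("Hello {#name}! I'm a {%string|sentence|bit of {#random} text}"), "\n";
// Hello World! I'm a string
```

Each pass resolves one nesting level, so the outer group is only processed after `{#random}` has been replaced. The loop assumes the resolver never returns text that itself contains a brace group; otherwise it would not terminate.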