PHP and RegEx: Split a string on commas that are not inside brackets (including nested brackets)

Two days ago, I started working on a code parser, and I'm stuck.

How can I split a line on commas that are not inside brackets? Let me show you what I mean.

I have this line to parse:

one, two, three, (four, (five, six), (ten)), seven 

I would like to get this result:

 array( "one"; "two"; "three"; "(four, (five, six), (ten))"; "seven" ) 

but instead I get:

 array( "one"; "two"; "three"; "(four"; "(five"; "six)"; "(ten))"; "seven" ) 

How can I do this with a PHP regex?

Thank you in advance!

+7
split php regex parsing
7 answers

You can do it more simply with a regex (note that this pattern only handles a single level of brackets):

    preg_match_all('/[^(,\s]+|\([^)]+\)/', $str, $matches);

But it would be better to use a real parser. Maybe something like this:

    $str = 'one, two, three, (four, (five, six), (ten)), seven';

    $buffer = '';
    $stack = array();
    $depth = 0;
    $len = strlen($str);
    for ($i = 0; $i < $len; $i++) {
        $char = $str[$i];
        switch ($char) {
            case '(':
                $depth++;
                break;
            case ',':
                // A top-level comma ends the current item.
                if (!$depth) {
                    if ($buffer !== '') {
                        $stack[] = $buffer;
                        $buffer = '';
                    }
                    continue 2;
                }
                break;
            case ' ':
                // Skip spaces outside brackets.
                if (!$depth) {
                    continue 2;
                }
                break;
            case ')':
                if ($depth) {
                    $depth--;
                } else {
                    // Closing bracket with no matching opener: flush the buffer.
                    $stack[] = $buffer.$char;
                    $buffer = '';
                    continue 2;
                }
                break;
        }
        $buffer .= $char;
    }
    if ($buffer !== '') {
        $stack[] = $buffer;
    }

    var_dump($stack);
+10

Hm... an answer has already been accepted, but since you asked for a simple solution, I'll give it a try anyway:

    <?php
    $test = "one, two, three, , , ,(four, five, six), seven, (eight, nine)";
    $split = "/([(].*?[)])|(\w)+/";
    preg_match_all($split, $test, $out);
    print_r($out[0]);
    die();
    ?>

Output:

    Array
    (
        [0] => one
        [1] => two
        [2] => three
        [3] => (four, five, six)
        [4] => seven
        [5] => (eight, nine)
    )
+7

You can't do this directly. You would need at least variable-width lookbehind, and last I checked, PHP's PCRE only supports fixed-width lookbehind.

My first recommendation would be to extract the parenthesized expressions from the string first. However, I don't know anything about your actual problem, so I can't say whether that is feasible.

+5

I can't think of a way to do this with a single regex, but it's pretty easy to hack together something that works for non-nested brackets:

    function process($data)
    {
        $entries = array();
        $filteredData = $data;
        // Pull out the bracketed groups and put a placeholder in their place.
        if (preg_match_all("/\(([^)]*)\)/", $data, $matches)) {
            $entries = $matches[0];
            $filteredData = preg_replace("/\(([^)]*)\)/", "-placeholder-", $data);
        }

        $arr = array_map("trim", explode(",", $filteredData));

        if (!$entries) {
            return $arr;
        }

        // Swap each placeholder back for the group it replaced, in order.
        $j = 0;
        foreach ($arr as $i => $entry) {
            if ($entry != "-placeholder-") {
                continue;
            }

            $arr[$i] = $entries[$j];
            $j++;
        }

        return $arr;
    }

If you call it like this:

    $data = "one, two, three, (four, five, six), seven, (eight, nine)";
    print_r(process($data));

It outputs:

    Array
    (
        [0] => one
        [1] => two
        [2] => three
        [3] => (four, five, six)
        [4] => seven
        [5] => (eight, nine)
    )
+2

Clumsy, but it does the job...

    <?php
    function split_by_commas($string)
    {
        // Find the bracketed groups (non-nested only).
        preg_match_all("/\(.+?\)/", $string, $result);
        $problem_children = $result[0];
        $i = 0;
        $temp = array();
        // Replace each group with a unique marker and remember the mapping.
        foreach ($problem_children as $submatch) {
            $marker = '__'.$i++.'__';
            $temp[$marker] = $submatch;
            $string = str_replace($submatch, $marker, $string);
        }
        $result = explode(",", $string);
        // Restore the original group wherever a marker is found.
        foreach ($result as $key => $item) {
            $item = trim($item);
            $result[$key] = isset($temp[$item]) ? $temp[$item] : $item;
        }
        return $result;
    }

    $test = "one, two, three, (four, five, six), seven, (eight, nine), ten";
    print_r(split_by_commas($test));
    ?>
+2

I'm afraid it would be very difficult to parse nested brackets, for example one, two, (three, (four, five)), with a regex alone.
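
For what it's worth, PCRE (the engine behind PHP's preg_* functions) supports recursive subpatterns, so nesting is within reach of a single pattern. The following is only a minimal sketch, assuming the brackets are always balanced; the function name split_top_level is made up for illustration:

```php
<?php
// Hypothetical helper: returns the top-level, comma-separated items of $str.
// Assumes parentheses are balanced; unbalanced input is not handled.
function split_top_level($str)
{
    // Group 1 matches one balanced (...) group; (?1) recurses into it.
    // An item is a run of non-comma, non-bracket text and/or balanced groups.
    $pattern = '/(?:[^,()]++|(\((?:[^()]++|(?1))*+\)))+/';
    preg_match_all($pattern, $str, $matches);

    // Trim whitespace around items and discard empty ones (e.g. from ", ,").
    $items = array_map('trim', $matches[0]);
    return array_values(array_filter($items, 'strlen'));
}

print_r(split_top_level('one, two, three, (four, (five, six), (ten)), seven'));
```

With the line from the question, this should give the five items the asker wanted, with the whole nested group kept as one item.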

+1

I feel it is worth noting that you should avoid regular expressions whenever possible. To that end, you should know that in PHP 5.3+ you can use str_getcsv(). If you are working with files (or file streams), such as CSV files, then fgetcsv() may be what you need, and it has been available since PHP 4.

Finally, I am surprised that no one used preg_split(). Did it not work as needed?
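
To make the trade-off concrete: str_getcsv() treats quotes as enclosures, not brackets, so it only helps when the grouped items can be quoted. A small sketch with made-up input strings; the delimiter, enclosure, and escape arguments are passed explicitly here:

```php
<?php
// Quoted fields keep their embedded commas.
$quoted = str_getcsv('one,two,"three, four",five', ',', '"', '\\');
print_r($quoted);

// Parentheses have no special meaning to the CSV parser,
// so every comma splits, including those inside the brackets.
$bracketed = str_getcsv('one, (two, three), four', ',', '"', '\\');
print_r(array_map('trim', $bracketed));
```

In other words, the second call splits the bracketed group apart, which is exactly the behaviour the question is trying to avoid.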
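
One way preg_split() could be made to work is to let the pattern match balanced bracket groups first and throw them away with PCRE's backtracking verbs, so that only top-level commas remain as delimiters. This is a sketch rather than a tested solution: it assumes balanced parentheses, and the function name split_outside_brackets is invented for the example.

```php
<?php
// Hypothetical helper built on preg_split(); assumes balanced parentheses.
function split_outside_brackets($str)
{
    // Left alternative: match a balanced (...) group via recursion into
    // group 1, then reject it with (*SKIP)(*FAIL) so the commas inside it
    // are never tried as delimiters.
    // Right alternative: a comma with optional surrounding whitespace.
    $pattern = '/(\((?:[^()]++|(?1))*+\))(*SKIP)(*FAIL)|\s*,\s*/';

    // PREG_SPLIT_NO_EMPTY drops empty items such as those from ", ,".
    return preg_split($pattern, trim($str), -1, PREG_SPLIT_NO_EMPTY);
}

print_r(split_outside_brackets('one, two, three, (four, (five, six), (ten)), seven'));
```

The design choice here is to describe the delimiter rather than the items: anything that is not a top-level comma, including whole bracket groups, stays inside an item untouched.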

+1
