Regular expression to match an unlimited number of parameters

I want to be able to parse file paths like this:

/var/www/index.(htm|html|php|shtml) 

into an ordered array:

  array("htm", "html", "php", "shtml") 

and then create a list of alternatives:

 /var/www/index.htm /var/www/index.html /var/www/index.php /var/www/index.shtml 

Right now I have a preg_match_all call that can separate exactly two alternatives:

  preg_match_all ("/\(([^)]*)\|([^)]*)\)/", $path_resource, $matches); 

Can someone give me a pointer on how to extend this to accept an unlimited number of alternatives (at least two)? Just regarding regex, the rest I can deal with.

Rules:

  • The list should begin with ( and close with )

  • The list must contain at least one | (i.e. at least two alternatives)

  • Any other occurrence of ( or ) should remain intact.

Update: I also need to deal with several pairs of brackets, such as:

  /var/(www|www2)/index.(htm|html|php|shtml) 

Sorry, I did not say this right away.

Update 2: If you want to do what I am trying to do on the file system, note that glob() already provides this functionality out of the box. There is no need to implement a custom solution. See @Gordon's answer below for more details.

+6
php regex preg-match
5 answers

Solution without regex :)

    <?php

    $test = '/var/www/index.(htm|html|php|shtml)';

    /**
     * Expands the last bracket pair in a path.
     *
     * @param string $str "/var/www/index.(htm|html|php|shtml)"
     * @return array "/var/www/index.htm", "/var/www/index.php", etc
     */
    function expand_bracket_pair($str)
    {
        // Only get the very last "(" and ignore all others.
        $bracketStartPos = strrpos($str, '(');
        $bracketEndPos = strrpos($str, ')');

        // Nothing left to expand: return the string unchanged.
        if ($bracketStartPos === false || $bracketEndPos === false) {
            return array($str);
        }

        // Split on "|".
        $exts = substr($str, $bracketStartPos, $bracketEndPos - $bracketStartPos);
        $exts = trim($exts, '()|');
        $exts = explode('|', $exts);

        // List all possible file names.
        $names = array();
        $prefix = substr($str, 0, $bracketStartPos);
        $affix = substr($str, $bracketEndPos + 1);
        foreach ($exts as $ext) {
            $names[] = "{$prefix}{$ext}{$affix}";
        }

        return $names;
    }

    function expand_filenames($input)
    {
        $nbBrackets = substr_count($input, '(');

        // Start with the last pair.
        $sets = expand_bracket_pair($input);

        // Now work backwards and expand each generated filename set again.
        for ($i = 0; $i < $nbBrackets; $i++) {
            foreach ($sets as $k => $set) {
                $sets = array_merge($sets, expand_bracket_pair($set));
            }
        }

        // Clean up: drop intermediate results that still contain brackets.
        foreach ($sets as $k => $set) {
            if (false !== strpos($set, '(')) {
                unset($sets[$k]);
            }
        }

        $sets = array_unique($sets);
        sort($sets);

        return $sets;
    }

    var_dump(expand_filenames('/(a|b)/var/(www|www2)/index.(htm|html|php|shtml)'));
+3

I think you are looking for:

 /\(([^|]+)(\|([^|]+))+\)/

Basically, put the delimiter '|' inside a repeating group.

In addition, your words should be composed of "not pipes" instead of "not parens", according to your third requirement.

Also, for this problem, prefer + over *. + means "one or more"; * means "zero or more".
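A minimal sketch of how a repeating-delimiter pattern behaves, assuming literal (backslash-escaped) parentheses. The variable names are illustrative only, and the character class here also excludes ( and ), which is an added assumption so that a match cannot run across two separate bracket pairs. Note that PCRE keeps only the last repetition of a repeated capturing group, so a pattern of this shape is suited to detecting a list rather than extracting every alternative.

```php
<?php
// Literal "(", one word, then one or more "|word" repetitions, then literal ")".
// Assumption: words may not contain "|", "(" or ")".
$pattern = '/\(([^()|]+)(\|([^()|]+))+\)/';

// A list with at least two alternatives matches.
$hasList = preg_match($pattern, '/var/www/index.(htm|html|php|shtml)', $m);

// Parentheses without a "|" do not match and are left alone.
$noList = preg_match($pattern, '/var/(ignore)/index.html');

// $m[0] is the whole list and $m[1] the first word, but $m[3] holds only the
// LAST word: a repeated capturing group keeps just its final iteration.
```

So the middle alternatives (html and php here) are not available as separate captures; getting all of them needs a second step on the whole match.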

+5

Not quite what you asked, but what's wrong with just matching the whole list (ignoring the |s), putting it in a variable, and then explode()ing it on the |s? That gives you an array of however many elements there were (including just one if there was no |).
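A rough sketch of that capture-then-explode idea, under the assumption that alternatives never contain (, ) or | themselves. The function name extract_alternatives is hypothetical, and the pattern additionally requires at least one | so that single-item parentheses are skipped, as the question asks.

```php
<?php
/**
 * Hypothetical helper: returns one array of alternatives for every
 * "(a|b|...)" list found in $path. Parentheses without "|" are ignored.
 */
function extract_alternatives($path)
{
    // Capture everything between "(" and ")" as a single group,
    // requiring at least one "|" inside it.
    preg_match_all('/\(([^()|]+(?:\|[^()|]+)+)\)/', $path, $matches);

    $lists = array();
    foreach ($matches[1] as $list) {
        // Split the captured "a|b|c" text into its alternatives.
        $lists[] = explode('|', $list);
    }

    return $lists;
}

print_r(extract_alternatives('/var/(www|www2)/index.(htm|html|php|shtml)'));
```

Because preg_match_all collects every match, this also covers paths with several bracket pairs; the lists come back in the order they appear in the path.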

+4

Maybe I am not getting the question, but my assumption is that you are working through the file system until you hit one of the files, in which case you could do

 $files = glob("$path/index.{htm,html,php,shtml}", GLOB_BRACE); 

The resulting array will contain any files in $path that match your extensions, or none. If you need to process the files in a specific order of extensions, you can foreach over an array with the ordered list of extensions, for example:

    foreach (array('htm', 'html', 'php', 'shtml') as $ext) {
        foreach ($files as $file) {
            if (pathinfo($file, PATHINFO_EXTENSION) === $ext) {
                // do something
            }
        }
    }

Edit: and yes, you can have multiple curly braces in glob.
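If the paths arrive in the question's (a|b) notation, one possible bridge to glob() is to rewrite each list into brace syntax first. This is only a sketch: to_brace_pattern is a hypothetical helper, it assumes alternatives contain no commas or nested brackets, and GLOB_BRACE is not available on every platform, hence the defined() check.

```php
<?php
/**
 * Hypothetical helper: rewrites "(a|b|c)" lists as "{a,b,c}" so the path
 * can be passed to glob() with GLOB_BRACE. Parentheses without a "|" are
 * left untouched.
 */
function to_brace_pattern($path)
{
    return preg_replace_callback(
        '/\(([^()|]+(?:\|[^()|]+)+)\)/',
        function ($m) {
            // Swap the delimiters and the enclosing brackets.
            return '{' . str_replace('|', ',', $m[1]) . '}';
        },
        $path
    );
}

$pattern = to_brace_pattern('/var/(www|www2)/index.(htm|html|php|shtml)');
// $pattern is now "/var/{www,www2}/index.{htm,html,php,shtml}"

// Only call glob() with GLOB_BRACE where the constant actually exists.
$files = defined('GLOB_BRACE') ? glob($pattern, GLOB_BRACE) : array();
```

Unlike the pure string expansions in the other answers, this returns only files that actually exist on disk.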

+2

The answer is given, but this is a fun puzzle, and I just could not resist

    function expand_filenames2($str)
    {
        $r = array($str);
        $n = 0;
        while (preg_match('~(.*?) \( ( \w+ \| [\w|]+ ) \) (.*) ~x', $r[$n++], $m)) {
            foreach (explode('|', $m[2]) as $e) {
                $r[] = $m[1] . $e . $m[3];
            }
        }
        return array_slice($r, $n - 1);
    }

    print_r(expand_filenames2('/(a|b)/var/(ignore)/(www|www2)/index.(htm|html|php|shtml)!'));

maybe this explains a little why we like regexps so much;)

+1