Regular expression to match a string that contains "http://" and ends with a file extension from an array

I am looking for a PHP solution that matches a URL satisfying both of the following conditions:

  • The URL contains "http://" (not necessarily at the start), and
  • The URL ends with a file extension from the array.

Example file extension array:

 $filetypes = array('jpg', 'gif', 'png', 'js', 'tif', 'pdf', 'doc', 'xls', 'xlsx'); // etc.

Here is the code I want to update to meet the requirements above. It currently works, but it only checks that the URL contains "http://"; I also want it to enforce the second requirement:

 $i = 0;
 $matches = false;
 foreach ($all_urls as $index => $value) {
     if (preg_match('/http:/', $value)) {
         $i++;
         echo "[{$i}] {$value}<br>";
         $matches = true;
     }
 }
2 answers

You can add an in_array() call to your if statement that checks whether the extension returned by pathinfo() is in $filetypes.

 $i = 0;
 $matches = false;
 foreach ($all_urls as $index => $value) {
     if (preg_match('/http:/', $value) && in_array(pathinfo($value, PATHINFO_EXTENSION), $filetypes)) {
         $i++;
         echo "[{$i}] {$value}<br>";
         $matches = true;
     }
 }
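For reference, the same check can be wrapped in a small function and run against sample data. This is only a sketch: the function name filter_urls and the sample URLs are invented for illustration, and it assumes a plain substring test with strpos() is acceptable in place of preg_match() for the "contains http://" condition.

```php
<?php
// Hypothetical helper (not from the question): keeps URLs that contain
// "http://" and whose extension, as reported by pathinfo(), is allowed.
function filter_urls(array $all_urls, array $filetypes): array
{
    $kept = array();
    foreach ($all_urls as $value) {
        // Substring test: "http://" may appear anywhere in the string.
        $has_scheme = strpos($value, 'http://') !== false;
        // pathinfo() returns the text after the last dot of the last path segment.
        $extension = pathinfo($value, PATHINFO_EXTENSION);
        if ($has_scheme && in_array($extension, $filetypes, true)) {
            $kept[] = $value;
        }
    }
    return $kept;
}

// Sample data, invented for illustration.
$filetypes = array('jpg', 'gif', 'png', 'pdf');
$all_urls  = array(
    'http://example.com/a.jpg',
    'ftp://example.com/b.png',
    'see http://example.com/c.pdf',
    'http://example.com/page.html',
);

$kept = filter_urls($all_urls, $filetypes);
foreach ($kept as $i => $value) {
    echo '[' . ($i + 1) . "] {$value}\n";
}
```

Only the first and third sample URLs pass both conditions: the second has no "http://", and the fourth ends in an extension that is not in the list.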

EDIT:

Since you mentioned in the comments that some of the URLs contain single quotes, you can strip them from both ends with trim(), as @Ghost showed in the comments:

 trim($value, "'") 

Then use it in the in_array() call like this:

 in_array(pathinfo(trim($value, "'"), PATHINFO_EXTENSION), $filetypes)
 //                ^^^^^^^^^^^^^^^^^
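Two further cases may be worth handling, although the question does not mention them: pathinfo() treats a query string as part of the extension (for "b.png?size=2" it reports "png?size=2"), and in_array() compares strings case-sensitively. A minimal sketch, assuming you want to ignore query strings and letter case; the helper name url_extension and the sample URLs are hypothetical:

```php
<?php
// Hypothetical helper: returns the lower-cased extension of the URL's path,
// ignoring surrounding single quotes, query strings and fragments.
function url_extension(string $value): string
{
    $url  = trim($value, "'");
    // parse_url() returns null when there is no path and false on a
    // seriously malformed URL; treat both as "no extension".
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return '';
    }
    return strtolower(pathinfo($path, PATHINFO_EXTENSION));
}

$filetypes = array('jpg', 'png', 'pdf');

// Quoted URL with an upper-case extension.
$quoted = in_array(url_extension("'http://example.com/a.JPG'"), $filetypes, true);
// URL with a query string after the file name.
$query = in_array(url_extension('http://example.com/b.png?size=2'), $filetypes, true);
// The extension only appears inside the query string, so it is not counted.
$script = in_array(url_extension('http://example.com/c.php?f=x.pdf'), $filetypes, true);

var_dump($quoted, $query, $script);
```

The "contains http://" check from the loop above still has to be applied separately; this helper only covers the extension test.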

A simpler solution is to use a single regex:

 $i = 0;
 $matches = false;
 foreach ($all_urls as $index => $value) {
     if (preg_match("/^http:\/\/.+\.(jpg|gif|png|js|tif|pdf|doc|xls|xlsx|etc)$/", $value)) {
         $i++;
         echo "[{$i}] {$value}<br>";
         $matches = true;
     }
 }

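This ensures that the match starts with http:// (because of ^) and ends in .jpg or one of the other extensions (because of the alternation group followed by $). If the URL only needs to contain http:// rather than start with it, as the question asks, drop the ^.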

If you want to support https, you can simply use:

 /^https?:\/\/.+\.(jpg|gif|png|js|tif|pdf|doc|xls|xlsx|etc)$/ 
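Since the question keeps the extensions in an array, the pattern can also be generated from that array instead of being typed out by hand. The following is a sketch under a few assumptions that go beyond the answer above: the helper name build_url_pattern is hypothetical, the pattern omits ^ so that "http://" may appear anywhere in the string, it uses ~ as the delimiter to avoid escaping slashes, and it adds the i flag so that upper-case extensions also match.

```php
<?php
// Hypothetical helper: builds a case-insensitive pattern from an
// extension list, escaping each extension with preg_quote().
function build_url_pattern(array $filetypes): string
{
    $quoted = array();
    foreach ($filetypes as $ext) {
        // Second argument escapes the chosen delimiter as well.
        $quoted[] = preg_quote($ext, '~');
    }
    // No ^ anchor: "http://" or "https://" may appear anywhere.
    return '~https?://.+\.(?:' . implode('|', $quoted) . ')$~i';
}

$filetypes = array('jpg', 'gif', 'png', 'js', 'tif', 'pdf', 'doc', 'xls', 'xlsx');
$pattern   = build_url_pattern($filetypes);

// Sample URLs, invented for illustration.
$samples = array(
    'http://example.com/report.xlsx',
    'see https://example.com/IMG.PNG',
    'http://example.com/page.html',
    'ftp://example.com/a.jpg',
);

foreach ($samples as $value) {
    echo $value . ' => ' . preg_match($pattern, $value) . "\n";
}
```

To require the URL to start with the scheme, as in the answer above, put ^ directly after the opening delimiter.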