How to check the MIME type (e.g. PDF) of a file or a string variable?

I have a bunch of PDF files that were downloaded using a scraper. The scraper did not check whether each file was a JPG or a PDF, so everything was downloaded and saved with the ".pdf" extension. To be clear, every file in the batch ends in .pdf. However, when I try to open the ones that are actually JPGs, either through the server or locally, I get an error.

My question: is there a way in PHP to check whether a file is a valid PDF? I would like to run all the URLs through a loop to check these files. There are hundreds of them, and checking them by hand would take hours.

thanks

+7
php pdf
4 answers

For local files (PHP 5.3+):

    $finfo = finfo_open(FILEINFO_MIME_TYPE);
    // glob() needs a wildcard to list the directory's contents
    foreach (glob("path/to/files/*") as $filename) {
        if (finfo_file($finfo, $filename) === 'application/pdf') {
            echo "'{$filename}' is a PDF" . PHP_EOL;
        } else {
            echo "'{$filename}' is not a PDF" . PHP_EOL;
        }
    }
    finfo_close($finfo);

For remote files:

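    $ch = curl_init();
    $url = 'http://path.to/your.pdf';
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, 1);         // include response headers in the output
    curl_setopt($ch, CURLOPT_NOBODY, 1);         // HEAD request: skip the response body
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the response instead of printing it

    // split() was removed in PHP 7; explode() does the same job here
    $results = explode("\n", trim(curl_exec($ch)));
    foreach ($results as $line) {
        // header names are case-insensitive, so compare without regard to case
        if (strcasecmp(strtok($line, ':'), 'Content-Type') === 0) {
            $parts = explode(':', $line, 2);
            echo trim($parts[1]); // output: application/pdf
        }
    }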
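As an alternative to parsing the raw header lines, cURL can report the Content-Type itself through curl_getinfo() with CURLINFO_CONTENT_TYPE. The following is only a sketch under assumptions: the function names isPdfContentType and remoteContentType are hypothetical, and following redirects is a choice made here, not something the original answer does.

```php
<?php
/**
 * Hypothetical helper: true when a Content-Type value names a PDF,
 * ignoring case and any parameters such as "; charset=binary".
 */
function isPdfContentType(?string $contentType): bool
{
    if ($contentType === null) {
        return false;
    }
    // Keep only the media type in front of the first ';'
    $mediaType = strtolower(trim(explode(';', $contentType, 2)[0]));
    return $mediaType === 'application/pdf';
}

/**
 * Hypothetical helper: ask the server for the Content-Type of $url
 * using a HEAD request. Returns null when the request fails or the
 * server sends no usable Content-Type header.
 */
function remoteContentType(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, true);         // HEAD request, no body
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // do not print the response
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // assumption: follow redirects
    $ok = curl_exec($ch);
    $type = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
    return ($ok !== false && is_string($type)) ? $type : null;
}
```

In a loop over the URLs, isPdfContentType(remoteContentType($url)) would then flag the ones the server labels as PDF. The header only reflects what the server claims, so a mislabeled file can still slip through; checking the downloaded bytes is the more reliable test.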
+2

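Get the MIME type of a file using the finfo_file() function:

    if (function_exists('finfo_open')) {
        // FILEINFO_MIME returns the type plus encoding, e.g. "application/pdf; charset=binary"
        $finfo = finfo_open(FILEINFO_MIME);
        $mimetype = finfo_file($finfo, "PATH-TO-YOUR-FILE");
        finfo_close($finfo);
        // print inside the check so $mimetype is never used while undefined
        echo "<pre>";
        print_r($mimetype);
        echo "</pre>";
    }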
+1

Use the finfo_file() function:

    <?php
    if (function_exists('finfo_open')) {
        $mime = finfo_open(FILEINFO_MIME_TYPE);
        $mime_type = finfo_file($mime, "FILE-PATH");
        if ($mime_type == "application/pdf") {
            echo "file is pdf";
        } else {
            echo "file is not pdf";
        }
        finfo_close($mime);
    }
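The function_exists('finfo_open') guard above does nothing when the Fileinfo extension is missing. In that case, one possible fallback is to look at the file's leading bytes directly: PDF files start with "%PDF-" and JPEG files start with the bytes FF D8 FF. This is a minimal sketch, and the name sniffMimeByMagic is hypothetical. It only distinguishes the two types from the question and does not validate the rest of the file.

```php
<?php
/**
 * Hypothetical helper: guess a file's type from its leading "magic" bytes.
 * Returns 'application/pdf', 'image/jpeg', or null when neither matches
 * or the file cannot be read.
 */
function sniffMimeByMagic(string $path): ?string
{
    $handle = @fopen($path, 'rb');
    if ($handle === false) {
        return null; // missing or unreadable file
    }
    $head = fread($handle, 8); // a few bytes are enough for both signatures
    fclose($handle);
    if ($head === false) {
        return null;
    }
    if (strncmp($head, '%PDF-', 5) === 0) {
        return 'application/pdf';
    }
    if (strncmp($head, "\xFF\xD8\xFF", 3) === 0) {
        return 'image/jpeg';
    }
    return null;
}
```

Calling sniffMimeByMagic() on each downloaded file separates real PDFs from JPEGs that were saved with the wrong extension. Treat it as a heuristic: a file that passes can still be truncated or corrupt, and a PDF with stray bytes before its header would be rejected here even if some viewers can open it.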
+1

Sometimes you need to check the MIME signature of a file, and sometimes that of a string held in a variable. This is how you do both checks:

    $filename = '/path/to/my/file.pdf';
    $content = file_get_contents($filename);

    $file_is_pdf = function (string $filename): bool {
        return mime_content_type($filename) === 'application/pdf';
    };

    $var_is_pdf = function (string $content): bool {
        // FILEINFO_MIME_TYPE omits the "; charset=..." suffix that FILEINFO_MIME appends
        $mime_type_found = (new \finfo(FILEINFO_MIME_TYPE))->buffer($content);
        return $mime_type_found === 'application/pdf';
    };

    // Checks if a file contains a PDF signature.
    var_dump($file_is_pdf($filename));
    // Checks if a variable contains a PDF signature.
    var_dump($var_is_pdf($content));
0