Recently I have had a serious headache analyzing metadata from video files. Part of the problem turned out to be that video production software vendors neglect the various standards (or at least interpret them differently), among other reasons.
As a result, I need to be able to scan very large video (and image) files in various formats, containers and codecs and dig out their metadata. I already use FFmpeg, ExifTool, Imagick and Exiv2 to process different types of metadata in different types of files, plus various other options to fill in the remaining gaps (please do not suggest libraries or other tools, I have tried them all :)).
Now I need to scan large files (up to 2 GB each) for the XMP block (which is usually written into movie files by the Adobe suite and some other software). I wrote a function to do this, but I suspect it can be improved.
    function extractBlockReverse($file, $searchStart, $searchEnd)
    {
        $handle = fopen($file, "r");
        if($handle)
        {
            $startLen = strlen($searchStart);
            $endLen = strlen($searchEnd);

            for($pos = 0, $output = '', $length = 0, $finished = false, $target = '';
                $length < 10000 && !$finished && fseek($handle, $pos, SEEK_END) !== -1;
                $pos--)
            {
                $currChar = fgetc($handle);
                if(!empty($output))
                {
                    $output = $currChar . $output;
                    $length++;
                    $target = $currChar . substr($target, 0, $startLen - 1);
                    $finished = ($target == $searchStart);
                }
                else
                {
                    $target = $currChar . substr($target, 0, $endLen - 1);
                    if($target == $searchEnd)
                    {
                        $output = $target;
                        $length = $length + $endLen;
                        $target = '';
                    }
                }
            }
            fclose($handle);
            return $output;
        }
        else
        {
            throw new Exception('not found file');
        }
        return false;
    }

    echo extractBlockReverse("very_large_video_file.mov", '<x:xmpmeta', '</x:xmpmeta>');
This works fine for now, but I would really like to get the most out of PHP here without straining my server, so I am wondering whether there is a better way to do this (or coding tricks that would improve it), since this approach seems a bit over the top for something as simple as finding a couple of strings and pulling out what lies between them.
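One possible direction, offered as an untested-in-production sketch rather than a definitive answer: the function above issues one fseek() and one fgetc() per byte, so reading backwards in larger chunks and searching each chunk with strrpos() may cut the number of calls considerably. The name extractBlockChunked and the parameters $chunkSize and $maxBlockLen are hypothetical, chosen only for illustration. The sketch assumes a local file (so fread() returns the full requested length) and a 64-bit PHP build for offsets near 2 GB. Unlike the original, it returns false instead of a partial block when no start marker is found.

```php
<?php
/**
 * Sketch: find the last $searchStart ... $searchEnd block by reading the
 * file backwards in chunks. $chunkSize and $maxBlockLen are illustrative
 * assumptions, not tuned values.
 */
function extractBlockChunked($file, $searchStart, $searchEnd, $chunkSize = 65536, $maxBlockLen = 1048576)
{
    $handle = fopen($file, 'rb');
    if (!$handle) {
        throw new Exception('Could not open file: ' . $file);
    }

    $endLen = strlen($searchEnd);
    fseek($handle, 0, SEEK_END);
    $pos = ftell($handle);   // offset of the first byte already read
    $buffer = '';
    $endFound = false;
    $result = false;

    while ($pos > 0) {
        $readLen = min($chunkSize, $pos);
        $pos -= $readLen;
        fseek($handle, $pos, SEEK_SET);
        $chunk = fread($handle, $readLen);
        if ($chunk === false || $chunk === '') {
            break; // read error, give up
        }

        if (!$endFound) {
            // Keep only a small overlap from the previous chunk so that an
            // end marker split across a chunk boundary is still detected.
            $buffer = $chunk . substr($buffer, 0, $endLen - 1);
            $endPos = strrpos($buffer, $searchEnd);
            if ($endPos === false) {
                continue;
            }
            // Drop everything after the end marker and start accumulating.
            $buffer = substr($buffer, 0, $endPos + $endLen);
            $endFound = true;
        } else {
            $buffer = $chunk . $buffer;
        }

        // Nearest start marker preceding the end marker.
        $startPos = strrpos($buffer, $searchStart);
        if ($startPos !== false) {
            $result = substr($buffer, $startPos);
            break;
        }
        if (strlen($buffer) > $maxBlockLen) {
            break; // block is larger than expected, stop to bound memory use
        }
    }

    fclose($handle);
    return $result;
}
```

It would be called the same way as the original, for example extractBlockChunked("very_large_video_file.mov", '<x:xmpmeta', '</x:xmpmeta>'). Memory use stays near one chunk until the end marker is found, and is capped by $maxBlockLen plus one chunk afterwards.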