How to compute the CRC32 of large files?

PHP's crc32() takes a string as input, so for a file the following will, of course, work:

crc32(file_get_contents("myfile.CSV")); 

But if the file is huge (2 GB), loading it all at once can exhaust memory and end in a fatal error.

So, is there any way to compute the checksum of huge files?

+4
3 answers

This function in the user notes for crc32() claims to compute the value without fully loading the file. If it works correctly, it should fix any memory problems.

For a file larger than 2 GB, however, it may stop at the same 32-bit limit that you are facing right now.

If possible, I would call an external tool that can calculate a checksum for files as large as the one at hand.
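
As a rough sketch of the external-tool route, the snippet below shells out to the POSIX `cksum` utility. It assumes `cksum` is installed and that `exec()` is enabled. The function name `cksum_of_file` is made up for illustration. Note that `cksum` uses a different CRC variant than crc32(), so its value will not match; it only helps when any stable checksum will do.

```php
<?php
// Hypothetical helper: delegates checksumming to the POSIX `cksum` utility.
// Assumption: `cksum` is on the PATH and exec() is not disabled.
// The result is NOT comparable with crc32() or hash_file('crc32b', ...).
function cksum_of_file(string $path): string
{
    $output = [];
    $status = 0;
    exec('cksum ' . escapeshellarg($path), $output, $status);

    if ($status !== 0 || !isset($output[0])) {
        throw new RuntimeException("cksum failed for '$path'.");
    }

    // cksum prints "<checksum> <byte count> <path>"; keep only the checksum.
    $fields = explode(' ', $output[0], 3);

    return $fields[0];
}
```

The file is read by the external process, so PHP's memory limit does not come into play.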

0

PHP builds with 32-bit integers do not support files larger than 2 GB (32-bit limit).

A more efficient way to calculate the CRC32 of a file:

 $hash = hash_file('crc32b', "myfile.CSV");
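
If you want control over the chunk size, the same crc32b value can also be built incrementally with PHP's hash_init(), hash_update() and hash_final(). This is a minimal sketch: the function name `crc32_of_file` and the 1 MiB default chunk size are assumptions for illustration, not part of the answer above.

```php
<?php
// Hypothetical helper: feeds the file to the crc32b hash one chunk at a time,
// so only $chunkSize bytes are held in memory at once.
function crc32_of_file(string $path, int $chunkSize = 1048576): string
{
    $fp = fopen($path, 'rb');
    if ($fp === false) {
        throw new RuntimeException("Could not open '$path' for reading.");
    }

    $ctx = hash_init('crc32b');
    while (!feof($fp)) {
        $chunk = fread($fp, $chunkSize);
        if ($chunk === false) {
            fclose($fp);
            throw new RuntimeException("Read error on '$path'.");
        }
        hash_update($ctx, $chunk);
    }
    fclose($fp);

    // Returns the checksum as 8 lowercase hex digits, like hash_file().
    return hash_final($ctx);
}
```

The result is a hex string, whereas crc32() returns an integer. On a 64-bit build, hexdec() on the result should give the same number as crc32() for the same bytes.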
+5

dev-null-dweller's answer is, IMO, the way to go.

However, for those who are looking for a memory-efficient PHP 4 equivalent of hash_file('crc32b', $filename);, here is a solution based on this comment in the PHP manual, with some improvements:

  • Now it gives exactly the same results as hash_file()
  • It supports 32-bit and 64-bit architectures.

Warning: performance is poor. I am trying to improve it.

Note: I tried a solution based on the C source code from zaf's comment, but I could not port it to PHP quickly enough.

if (!function_exists('hash_file')) {
    define('CRC_BUFFER_SIZE', 8192);

    function hash_file($algo, $filename, $rawOutput = false)
    {
        $mask32bit = 0xffffffff;

        if ($algo !== 'crc32b') {
            trigger_error("Unsupported hashing algorithm '".$algo."'", E_USER_ERROR);
            exit;
        }

        $fp = fopen($filename, 'rb');
        if ($fp === false) {
            trigger_error("Could not open file '".$filename."' for reading.", E_USER_ERROR);
            exit;
        }

        // Build the lookup tables once and reuse them across calls.
        static $CRC32Table, $Reflect8Table;
        if (!isset($CRC32Table)) {
            $Polynomial = 0x04c11db7;
            $topBit = 1 << 31;

            for ($i = 0; $i < 256; $i++) {
                $remainder = $i << 24;
                for ($j = 0; $j < 8; $j++) {
                    if ($remainder & $topBit) {
                        $remainder = ($remainder << 1) ^ $Polynomial;
                    } else {
                        $remainder = $remainder << 1;
                    }
                    $remainder &= $mask32bit;
                }
                $CRC32Table[$i] = $remainder;

                if (isset($Reflect8Table[$i])) {
                    continue;
                }
                // Reverse the bit order of the byte.
                $str = str_pad(decbin($i), 8, '0', STR_PAD_LEFT);
                $num = bindec(strrev($str));
                $Reflect8Table[$i] = $num;
                $Reflect8Table[$num] = $i;
            }
        }

        // Read the file in CRC_BUFFER_SIZE chunks so it is never held in memory whole.
        $remainder = 0xffffffff;
        while (!feof($fp)) {
            $data = fread($fp, CRC_BUFFER_SIZE);
            $len = strlen($data);
            for ($i = 0; $i < $len; $i++) {
                $byte = $Reflect8Table[ord($data[$i])];
                $index = (($remainder >> 24) & 0xff) ^ $byte;
                $crc = $CRC32Table[$index];
                $remainder = (($remainder << 8) ^ $crc) & $mask32bit;
            }
        }
        fclose($fp);

        // Reflect the 32-bit remainder and apply the final XOR.
        $str = decbin($remainder);
        $str = str_pad($str, 32, '0', STR_PAD_LEFT);
        $remainder = bindec(strrev($str));
        $result = $remainder ^ 0xffffffff;

        // Zero-pad to 8 hex digits, as hash_file() does.
        return $rawOutput ? strrev(pack('V', $result)) : str_pad(dechex($result), 8, '0', STR_PAD_LEFT);
    }
}
0

Source: https://habr.com/ru/post/1311964/

