Is SHA enough to check for duplicate files? (sha1_file in PHP)

Suppose you wanted to build a file-hosting site where people upload files and send a link to their friends to download them later, and you want to make sure the files are not duplicated where we store them. Is PHP's sha1_file good enough for the task? Is there any reason not to use md5_file instead?

On the front end, the hash will be hidden from users behind the original file name stored in the database, but an additional concern is whether it could reveal anything about the original poster. Does a file carry any meta-information with it, such as the last-modified time or who uploaded it, or does that information live in the file system?

Also, is using a salt pointless here, since protection against rainbow table attacks means nothing for this purpose, and the hash could later be used as a checksum?

Lastly, scalability? Initially it will only be used for small files of a few megabytes, but eventually ...

Edit 1: The point of the hash is primarily to avoid duplicate files, not to create obscurity.
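The deduplication idea described in the question can be sketched as follows. This is only a minimal illustration, not a production upload handler: the function name store_deduplicated and the flat storage-directory layout are assumptions made for the example, and only sha1_file, copy and file_exists are standard PHP calls.

```php
<?php
// Hypothetical helper: keep each distinct file content only once by
// naming the stored copy after its SHA-1 hash.
// $storageDir is an assumed flat directory that already exists.
function store_deduplicated(string $tmpPath, string $storageDir): string
{
    $hash = sha1_file($tmpPath);
    if ($hash === false) {
        throw new RuntimeException("Cannot hash $tmpPath");
    }

    $target = $storageDir . DIRECTORY_SEPARATOR . $hash;
    if (!file_exists($target)) {
        // First time this content is seen: keep a copy under its hash.
        if (!copy($tmpPath, $target)) {
            throw new RuntimeException("Cannot copy to $target");
        }
    }

    // The caller would record the hash next to the original file name
    // in the database; the stored file itself has no such name.
    return $hash;
}
```

Because the stored copy is named only by its hash, the original file name and the uploader exist only in whatever the application writes to the database.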

+5
4 answers

According to my comment on @ykaganovich's answer, SHA1 is (surprisingly) slightly faster than MD5.

- - / - - , ( ). md5 , . sha1. , , 2 warez . ?

, -, - .

+2

SHA "" . , , "Git Magic", :

.1. SHA1        SHA1 .        . , ,        , Git.       , Git -, SHA1.

SHA256 , . MD5 , SHA1.

+1

sha1_file ?

sha1_file , , . 0, :

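function is_duplicate_file($file1, $file2)
{
    // Files of different sizes cannot be identical, so skip hashing entirely.
    if (filesize($file1) !== filesize($file2)) return false;

    // Strict comparison: with ==, PHP may compare hex strings that look
    // like numbers (for example "0e...") numerically and report a false match.
    return sha1_file($file1) === sha1_file($file2);
}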
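The size-then-hash check above can be extended from a pair of files to a whole list of files. The following is a rough sketch under that assumption; find_duplicate_groups is a hypothetical name, and the grouping strategy is one possible approach rather than the answerer's own code.

```php
<?php
// Hypothetical helper: return groups of paths whose size and SHA-1 both match.
function find_duplicate_groups(array $paths): array
{
    // Step 1: bucket by file size, which is cheap to read.
    $bySize = [];
    foreach ($paths as $path) {
        $bySize[filesize($path)][] = $path;
    }

    $groups = [];
    foreach ($bySize as $sameSize) {
        if (count($sameSize) < 2) {
            continue; // a unique size cannot have a duplicate; no hashing needed
        }
        // Step 2: hash only the files that share a size with another file.
        $byHash = [];
        foreach ($sameSize as $path) {
            $byHash[sha1_file($path)][] = $path;
        }
        foreach ($byHash as $sameHash) {
            if (count($sameHash) > 1) {
                $groups[] = $sameHash;
            }
        }
    }
    return $groups;
}
```

Only files that share a size with at least one other file are ever read in full, which keeps the hashing work small when most files have distinct sizes.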

md5 , sha1, , md5 .

, , , :

1- :

if( file_get_contents($file1) != file_get_contents($file2) )

2- Sha1_file

if( sha1_file($file1) != sha1_file($file2) )

3- md5_file

if( md5_file($file1) != md5_file($file2) )

: 2 1.2 100 , :

--------------------------------------------------------
 method                  time(s)           peak memory
--------------------------------------------------------
file_get_contents          0.5              2,721,576
sha1_file                  1.86               142,960
md5_file                   1.6                142,848
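A comparison like the one in the table could be reproduced with a small harness such as the one below. This is a sketch only: benchmark_compare is a hypothetical name, the iteration count is just a parameter, it measures wall-clock time but not peak memory, and the numbers it produces depend on the machine, the file size and disk caching. It needs PHP 7.4 or later for arrow functions.

```php
<?php
// Hypothetical harness: time $iterations comparisons of two files per method.
function benchmark_compare(string $file1, string $file2, int $iterations = 100): array
{
    $methods = [
        'file_get_contents' => fn() => file_get_contents($file1) === file_get_contents($file2),
        'sha1_file'         => fn() => sha1_file($file1) === sha1_file($file2),
        'md5_file'          => fn() => md5_file($file1) === md5_file($file2),
    ];

    $timings = [];
    foreach ($methods as $name => $compare) {
        $start = microtime(true);
        for ($i = 0; $i < $iterations; $i++) {
            $compare();
        }
        // Elapsed wall-clock seconds for all iterations of this method.
        $timings[$name] = microtime(true) - $start;
    }
    return $timings;
}
```

Calling benchmark_compare($file1, $file2) returns an array of elapsed seconds keyed by method name, which can be printed next to each other for comparison.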

file_get_contents 3,7 , sha1, .

Sha1_file md5_file , 5% , file_get_contents.

md5_file , , sha1.

, , , .

+1

. sha1 - -, md5, , , , , , md5:). - , / ( , ). . , .

, , , , IO-, , , , . .

0
