File containing its own checksum

Is it possible to create a file that contains its own checksum (MD5, SHA1, whatever)? And to forestall the jokers: I mean the checksum in plain form, not a function that calculates it.

+43
security checksum data-integrity
Jul 13 '09 at 7:04
12 answers

Yes. It is possible, and it is common with simple checksums. Getting a file to include its own md5sum would be quite a challenge.

In the most basic case, choose a checksum value that makes the sum of all bytes, modulo 256, equal to zero. The check then becomes something like

(n1 + n2 ... + CRC) % 256 == 0 

The checksum then becomes part of the file and is itself covered by the check. A very common example of this is the Luhn algorithm used in credit card numbers: the last digit is a check digit, and it is itself part of the 16-digit number.
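A minimal sketch of the modular-sum idea above, assuming the check value is a single byte appended to the end of the data. The helper names `check_byte`, `seal` and `is_valid` are made up for this illustration.

```python
def check_byte(payload: bytes) -> int:
    """Return the byte that makes the sum of all bytes equal 0 modulo 256."""
    return (-sum(payload)) % 256


def seal(payload: bytes) -> bytes:
    """Append the check byte so that the result validates itself."""
    return payload + bytes([check_byte(payload)])


def is_valid(data: bytes) -> bool:
    """Valid when every byte, the check byte included, sums to 0 modulo 256."""
    return sum(data) % 256 == 0


sealed = seal(b"hello world")
print(is_valid(sealed))                         # True
print(is_valid(b"jello world" + sealed[-1:]))   # False: 'h' -> 'j' shifts the sum by 2
```

The verifier never has to separate the checksum from the data; it just sums everything.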

+17
Jul 13 '09 at 7:17

I wrote a piece of C code, ran the brute force for less than 2 minutes, and got this wonder:

 The CRC32 of this string is 4A1C449B 

Please note that there must be no characters after the sentence (no end of line, etc.).

You can check it here: http://www.crc-online.com.ar/index.php?d=The+CRC32+of+this+string+is+4A1C449B&en=Calcular+CRC32

This is fun too:

 I killed 56e9dee4 cows and all I got was... 

The source code (sorry, this is a bit dirty) is here: http://www.latinsud.com/pub/crc32/
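The linked C source is not reproduced here. As a rough sketch of the same brute-force idea, assuming Python's `zlib.crc32` and, to keep the run time to seconds, matching only the low 16 bits of the CRC (a full 32-bit search is the same loop over 2**32 candidates). The function name and the sentence template are invented for the example.

```python
import zlib


def find_self_describing(template: str, bits: int = 16):
    """Try every value of the given width and return a sentence that states
    the low `bits` bits of its own CRC-32, or None if no value fits."""
    digits = bits // 4
    mask = (1 << bits) - 1
    for value in range(1 << bits):
        text = template.format(f"{value:0{digits}X}")
        if zlib.crc32(text.encode("ascii")) & mask == value:
            return text
    return None


# A match is not guaranteed for any single wording, so try a few of them.
found = None
for n in range(32):
    found = find_self_describing("Attempt %d: my CRC32 ends in {}" % n)
    if found:
        print(found)
        break
```

Any given wording may have no solution at all, which is why the sketch retries with a slightly different sentence.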

+30
Jul 26 '11 at 17:09

Check this:

 echo -e '#!/bin/bash\necho My cksum is 918329835' > magic 
+12
Jul 25 2018-12-12T00:

Of course it is possible. But one of the uses of checksums is to detect tampering with a file. How would you know a file has been modified if the person modifying it can also replace the checksum?

+7
Jul 13 '09 at 7:06

"I want my crc32 to be 802892ef ..."

Well, I thought this was interesting, so today I coded up a small Java program to find collisions. Thought I'd leave it here in case someone finds it useful:

 import java.util.zip.CRC32;

 public class Crc32_recurse2 {

     public static void main(String[] args) throws InterruptedException {
         long endval = Long.parseLong("ffffffff", 16);
         long startval = 0L;
         // startval = Long.parseLong("802892ef",16); //uncomment to save yourself some time

         float percent = 0;
         long time = System.currentTimeMillis();
         long updates = 10000000L; // how often to print some status info

         for (long i=startval;i<endval;i++) {
             String testval = Long.toHexString(i);
             String cmpval = getCRC("I wish my crc32 was " + testval + "...");
             if (testval.equals(cmpval)) {
                 System.out.println("Match found!!! Message is:");
                 System.out.println("I wish my crc32 was " + testval + "...");
                 System.out.println("crc32 of message is " + testval);
                 System.exit(0);
             }
             if (i%updates==0) {
                 if (i==0) {
                     continue; // kludge to avoid divide by zero at the start
                 }
                 long timetaken = System.currentTimeMillis() - time;
                 long speed = updates/timetaken*1000;
                 percent = (i*100.0f)/endval;
                 long timeleft = (endval-i)/speed; // in seconds
                 System.out.println(percent+"% through - "+ "done "+i/1000000+"M so far"
                         + " - " + speed+" tested per second - "+timeleft+
                         "s till the last value.");
                 time = System.currentTimeMillis();
             }
         }
     }

     public static String getCRC(String input) {
         CRC32 crc = new CRC32();
         crc.update(input.getBytes());
         return Long.toHexString(crc.getValue());
     }
 }

Output:

 49.825756% through - done 2140M so far - 1731000 tested per second - 1244s till the last value.
 50.05859% through - done 2150M so far - 1770000 tested per second - 1211s till the last value.
 Match found!!! Message is:
 I wish my crc32 was 802892ef...
 crc32 of message is 802892ef

Note that the dots at the end of the message are actually part of the message.

On my i5-2500, it would take ~40 minutes to search the entire crc32 space from 00000000 to ffffffff, at about 1.8 million tests per second. It maxed out one core.

I am new to Java, so any constructive comments on my code would be appreciated.

"My crc32 was c8cb204, and all I got was this lousy t-shirt!"

+7
Mar 11 '13 at 13:31

Certainly, you can append the digest of the file itself to the end of the file. To check it, you calculate the digest of everything except the last part, then compare it with the value stored in the last part. Of course, without some form of encryption, anyone can recalculate the digest and replace it.

Edit:

I should add that this is not so unusual. One technique is to append a CRC-32 such that the CRC-32 of the entire file (including that digest) is zero. This won't work with digests based on cryptographic hashes, though.
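A sketch of the appended-CRC technique, assuming Python's `zlib.crc32` and a little-endian four-byte trailer. One caveat for this particular CRC-32 variant: because of its final XOR, a correctly sealed file checks out to a fixed constant (0x2144DF1C) rather than literally zero; a CRC without the final XOR gives zero. The names `seal` and `is_intact` are invented here.

```python
import struct
import zlib

# Assumption: zlib's CRC-32. A sealed file always yields this fixed value
# (it is also the CRC-32 of four zero bytes).
CRC32_RESIDUE = 0x2144DF1C


def seal(payload: bytes) -> bytes:
    """Append the payload's CRC-32 as four little-endian bytes."""
    return payload + struct.pack("<I", zlib.crc32(payload))


def is_intact(data: bytes) -> bool:
    """Check the whole file, trailer included, in a single CRC pass."""
    return zlib.crc32(data) == CRC32_RESIDUE


sealed = seal(b"some file content")
print(is_intact(sealed))                # True
print(is_intact(b"S" + sealed[1:]))     # False: first byte was altered
```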

+4
Jul 13 '09 at 7:08

I don't know if I understood your question correctly, but you can make the first 16 bytes of the file a checksum of the rest of the file.

So, before writing the file, you calculate the hash, first write the hash value, and then write the contents of the file.
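For illustration, a small sketch of that layout, assuming MD5 (whose digest happens to be exactly 16 bytes) and hypothetical helper names `pack` and `unpack`:

```python
import hashlib

DIGEST_SIZE = 16  # MD5 digests are 16 bytes, matching the header size above


def pack(content: bytes) -> bytes:
    """Write the hash first, then the content."""
    return hashlib.md5(content).digest() + content


def unpack(blob: bytes) -> bytes:
    """Split off the header and verify it against the rest of the file."""
    stored, content = blob[:DIGEST_SIZE], blob[DIGEST_SIZE:]
    if hashlib.md5(content).digest() != stored:
        raise ValueError("checksum mismatch")
    return content


blob = pack(b"file contents")
print(unpack(blob))  # b'file contents'
```

Note that the checksum here covers only the rest of the file, not itself.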

+2
Jul 13 '09 at 7:06

If the question is whether a file can contain its own checksum (in addition to other content), the answer is trivially yes for fixed-size checksums, because a file could contain every possible checksum value.

If the question is whether a file can consist of its own checksum (and nothing else), it is trivial to construct a checksum algorithm that makes such a file impossible: for an n-byte checksum, take the binary representation of the first n bytes of the file and add 1. Since it is equally trivial to construct a checksum that always encodes itself (i.e. do the above without adding 1), there are clearly some checksums that can encode themselves and some that cannot. It would probably be quite hard to tell which of the two a standard checksum is.
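The two toy algorithms from the previous paragraph could look roughly like this, assuming a 4-byte checksum, big-endian interpretation and wrap-around on overflow (all arbitrary choices for the sketch; the function names are made up):

```python
N = 4  # checksum width in bytes (arbitrary for this sketch)


def _head(data: bytes) -> int:
    """First N bytes as a big-endian integer, zero-padded if the file is short."""
    return int.from_bytes(data[:N].ljust(N, b"\0"), "big")


def never_self(data: bytes) -> bytes:
    """Checksum = first N bytes plus 1 (wrapping): no N-byte file equals it."""
    return ((_head(data) + 1) % (1 << (8 * N))).to_bytes(N, "big")


def always_self(data: bytes) -> bytes:
    """Checksum = first N bytes: every N-byte file is its own checksum."""
    return _head(data).to_bytes(N, "big")


sample = b"abcd"
print(never_self(sample) == sample)   # False
print(always_self(sample) == sample)  # True
```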

+1
May 04 '10 at 22:17

There is a neat implementation of the Luhn Mod N algorithm in the python-stdnum library (see luhn.py). The calc_check_digit function calculates a digit or character which, when appended to the file (expressed as a string), produces a valid Luhn Mod N string. As noted in many of the answers above, this gives a sanity check on the validity of the file, but no significant protection against tampering. The recipient needs to know which alphabet was used in order to verify Luhn Mod N validity.
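To show the mechanics without relying on the library, here is a hand-rolled sketch of plain Luhn (mod 10, decimal digits only). It is not the python-stdnum code, and the function names are invented for the example.

```python
def luhn_sum(digits: str) -> int:
    """Luhn sum: double every second digit, counted from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as adding the two digits of the product
        total += d
    return total


def luhn_check_digit(payload: str) -> str:
    """Digit that, appended to payload, makes the whole string valid."""
    # Appending "0" puts the payload digits in the positions they will occupy.
    return str((10 - luhn_sum(payload + "0") % 10) % 10)


def luhn_is_valid(number: str) -> bool:
    return luhn_sum(number) % 10 == 0


print(luhn_check_digit("7992739871"))   # 3
print(luhn_is_valid("79927398713"))     # True
```

The check digit is part of the number being validated, which is exactly the self-contained property the question asks about.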

+1
Sep 06 '11 at 20:59

Sure.

The easiest way is to run the file through the MD5 algorithm and embed the result in the file. You can split the checksum up and place the pieces at known points in the file (based on a proportion of the file size, e.g. 30%, 50%, 75%) if you want to try to hide it.

Similarly, you could encrypt the file, or encrypt a portion of the file (along with the MD5 checksum), and embed that in the file. Edit: I forgot to say that you will need to remove the checksum data before using the file.

Of course, if your file needs to be readable by another program, e.g. Word, then things become a little more complicated, since you don't want to "corrupt" the file so that it is no longer readable.

0
Jul 13 '09 at 7:20

Of course you can, but in that case the SHA digest of the whole file will not be the SHA you included: it is a cryptographic hash function, so changing a single bit in the file changes the whole hash. What you are looking for is a checksum calculated from the contents of the file in such a way that the result meets a set of criteria.
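A quick way to see this for yourself, assuming SHA-1 via Python's `hashlib` and an arbitrary example payload:

```python
import hashlib

content = b"payload\n"

# Digest of the content alone, then embedded at the end of the file.
embedded = hashlib.sha1(content).hexdigest()
whole_file = content + embedded.encode("ascii")

# Digest of the file that now contains the embedded digest.
actual = hashlib.sha1(whole_file).hexdigest()

print(embedded)
print(actual)
print(embedded == actual)  # False
```

Embedding the digest changes the input, so the digest of the finished file no longer matches what was embedded.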

0
Jul 13 '09 at 7:22

There are many ways to embed information to detect transmission errors, etc. CRC checksums are good at detecting runs of consecutive bit flips and can be added so that the checksum is always, for example, 0. These checksums (including error correction codes), however, are easy to recreate and do not stop malicious interference.

It is not possible to put something into the message that lets the recipient verify its authenticity unless the recipient knows something about, or has received something from, the sender. The recipient might, for example, share a secret key with the sender. The sender can then append an encrypted checksum (which should be cryptographically secure, e.g. md5/sha1). It is also possible to use asymmetric encryption, where the sender publishes his public key and signs the md5 checksum/hash with his private key. The hash and signature can then be attached to the data as a new kind of checksum. This is done all the time on the Internet nowadays.

The remaining problems are then: 1. how can the recipient be sure he has the right public key, and 2. how secure is all this in reality? The answer to 1 may vary. On the Internet it is common to have the public key signed by someone everybody trusts. Another simple solution is for the recipient to obtain the public key at a meeting in person. The answer to 2 may change from day to day, but what is expensive to break today will probably be cheap to break some time in the future. By then, hopefully, new algorithms and/or larger key sizes will have emerged.
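For the shared-secret case, a common concrete form is an HMAC rather than a literally encrypted checksum. The sketch below swaps in HMAC-SHA256 from Python's standard library for the md5/sha1 mentioned above; the `sign` and `verify` names and the trailer layout are assumptions made for this example.

```python
import hashlib
import hmac

TAG_SIZE = hashlib.sha256().digest_size  # 32 bytes


def sign(key: bytes, message: bytes) -> bytes:
    """Append a keyed tag that only holders of the key can produce."""
    return message + hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, blob: bytes) -> bytes:
    """Return the message if the tag matches, otherwise raise."""
    message, tag = blob[:-TAG_SIZE], blob[-TAG_SIZE:]
    expected = hmac.new(key, message, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return message


blob = sign(b"shared secret", b"hello")
print(verify(b"shared secret", blob))  # b'hello'
```

Unlike a plain checksum, someone who modifies the message cannot recompute the tag without the key.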

0
May 04 '10 at 21:36


