Cryptanalysis: XOR of two plaintext files

I have a file that contains the XOR of two plaintext files. How can I attack this file to recover either of the plaintext files? I searched quite a lot, but could not find an answer. Thanks!

EDIT:

Well, I also have the two ciphertexts, which I XORed to get the XOR of the two plaintexts. The reason I ask is that, according to Bruce Schneier (Applied Cryptography, 1996, p. 198), "... she can XOR them together and get two plaintext messages XORed with each other. This is easy to break, and then she can XOR one of the plaintexts with the ciphertext to get the key." (This applies to a simple stream cipher.) Beyond that, he gives no explanation, which is why I asked here. Forgive my ignorance.

In addition, the algorithm used is simple and uses a symmetric key whose length is 3.

FURTHER EDIT:

I forgot to add: I assume that a simple stream cipher was used for encryption.

+7
5 answers

I am not a cryptanalyst, but if you know something about file characteristics, you may have a chance.

For example, let's say you know that both original texts:

  • contain plain ASCII text
  • are articles about sports (or some other known topic)

Given these two pieces of information, one approach is to scan through the XORed file, trying to “decrypt” it with words you expect to appear in the texts, such as “football”, “player”, “score”, etc. Decrypt using “football” at position 0 of the ciphertext, then at position 1, then 2, and so on.

If the result of decrypting a sequence of bytes is a word or a fragment of a word, there is a good chance that you have found plaintext from both files. This may give you a clue to some of the surrounding plaintext, and you can check whether that leads to a sensible decryption. And so on.

Repeat this process with other words / phrases / snippets that can be expected in plain text.
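
The scanning process described above can be sketched in Python. This is only a minimal illustration under stated assumptions: the two plaintexts are made-up toy strings, and the helper names (`xor_bytes`, `is_plausible_text`, `crib_drag`) and the very crude "looks like text" filter are hypothetical, not part of any library.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


def is_plausible_text(chunk: bytes) -> bool:
    """Crude filter (an assumption): letters, spaces and basic punctuation only."""
    allowed = b"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ .,'"
    return all(c in allowed for c in chunk)


def crib_drag(xored: bytes, crib: bytes):
    """Try the guessed word at every offset of A^B and yield readable results."""
    for pos in range(len(xored) - len(crib) + 1):
        candidate = xor_bytes(xored[pos:pos + len(crib)], crib)
        if is_plausible_text(candidate):
            yield pos, candidate


# Toy data: two invented plaintexts of equal length.
a = b"the football match ended late"
b = b"our best player scored twice."
xored = xor_bytes(a, b)  # this is all the attacker sees

hits = list(crib_drag(xored, b"football"))
for pos, text in hits:
    print(pos, text)
```

A real attack would need a better plausibility test (for example, letter frequencies or a dictionary), since short guesses tend to produce false positives at other offsets.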


In response to the edit to the question: what Schneier is talking about is that if someone has two ciphertexts that were XOR-encrypted with the same key, XORing those ciphertexts together will “cancel” the keystream, because:

    (A ^ k)            - ciphertext of A
    (B ^ k)            - ciphertext of B
    (A ^ k) ^ (B ^ k)  - the two ciphertexts XOR'ed together

which simplifies to:

    A ^ B ^ k ^ k

which continues to simplify to:

    A ^ B ^ 0
    A ^ B

So now the attacker has a new ciphertext that consists only of the two plaintexts XORed together. If the attacker knows one of the plaintexts (say, the attacker has legitimate access to A, but not B), it can be used to recover the other plaintext:

    A ^ (A ^ B)
    (A ^ A) ^ B
    0 ^ B
    B

The attacker now has plaintext for B.

It is actually worse than that: if the attacker has both A and the ciphertext of A, he can already recover the keystream directly.
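
The identities above can be checked with a few lines of Python. This is a small sketch with invented messages and a random keystream from `os.urandom`; the helper `xor_bytes` is a hypothetical name introduced only for this example.

```python
import os


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


# Two made-up plaintexts of equal length and one shared keystream.
a = b"attack at dawn"
b = b"retreat at ten"
k = os.urandom(len(a))

ca = xor_bytes(a, k)  # ciphertext of A
cb = xor_bytes(b, k)  # ciphertext of B

# XORing the ciphertexts cancels the keystream, leaving A ^ B.
a_xor_b = xor_bytes(ca, cb)

# Knowing A reveals B, because A ^ (A ^ B) = B.
recovered_b = xor_bytes(a, a_xor_b)

# Knowing A and its ciphertext reveals the keystream, because A ^ (A ^ k) = k.
recovered_k = xor_bytes(a, ca)

print(recovered_b)
```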

The guessing method I described above is a variant of this, with the attacker using (hopefully good) guesses instead of known plaintext. Obviously it is not as simple, but it is the same concept, and it can be done without starting from known plaintext. The attacker now has a ciphertext that “tells” him when he has guessed some plaintext correctly, because the guess yields readable plaintext from the other file. So even if the key used in the original XOR operation is random gibberish, the attacker can work with a file from which that random gibberish has been removed, and gain information whenever he makes a good guess.

+7

You need to take advantage of the fact that both files are plain text. Many consequences follow from this fact. Assuming both texts are in English, you can use the fact that some letters are much more frequent than others. See this article.

Another hint is to note the structure of well-formed English text. For example, each time one sentence ends and the next begins, you get the sequence (period, space, capital letter).

Please note that in ASCII the space is binary “0010 0000”, and flipping this bit in a letter changes its case (from lower to upper and vice versa). There will be a lot of XORing with a space if both files are plain text, right? Analyze the printable character table on this page.
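
To illustrate the space observation, here is a small Python sketch with invented toy strings. It relies on one assumption worth stating: the texts contain only ASCII letters and spaces. Under that assumption, letter XOR letter always has bit 0x40 clear, while space XOR letter is itself a letter (with the case flipped), so letter-valued bytes in the XORed file point at positions where one of the plaintexts has a space. The function name `likely_space_positions` is hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


def is_ascii_letter(c: int) -> bool:
    return 65 <= c <= 90 or 97 <= c <= 122


def likely_space_positions(xored: bytes):
    """Offsets where the XOR byte is an ASCII letter (space ^ letter)."""
    return [i for i, c in enumerate(xored) if is_ascii_letter(c)]


# XOR with a space (0x20) flips the case of a letter.
flipped = chr(ord("a") ^ 0x20)

# Toy plaintexts: spaces at offset 5 in the first, offset 7 in the second.
a = b"hello world"
b = b"goodbye now"
xored = xor_bytes(a, b)

positions = likely_space_positions(xored)
# Flipping the case bit back gives the letter hidden opposite each space.
letters = bytes(xored[i] ^ 0x20 for i in positions)
print(positions, letters)
```

With real text, punctuation and digits would blur this picture, so the result is a hint to combine with word guesses rather than a decision procedure.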

In addition, you can use spell checking at the end.

I know that I did not offer a solution for your question. I just gave you some hints. Have fun and please share your results. This is a really interesting task.

+5

This is interesting. Schneier's book really does say it is easy to break, and then he just moves on. I guess he had to leave some exercises for the reader!

There is a paper by Dawson and Nielsen that apparently describes an automated process for this task on text files. It is a bit pricey to buy as a single article. However, a second paper, titled A Natural Language Approach to Automated Cryptanalysis of Two-time Pads, refers to the work of Dawson and Nielsen and describes some of the assumptions they made (mainly that the text was restricted to a set of 27 characters). This second paper appears to be freely available and describes the authors' own system. I do not know for sure that it is free, but it is openly available on a Johns Hopkins University server.

The paper is about 10 pages long and looks interesting. I do not have time to read it at the moment, but I may later. I find it curious (and telling) that a 10-page paper is needed to describe a task that another cryptographer describes as “easy”.

+4

I don't think you can - without knowing anything about the structure of these two files.

0

If you do not have one of the plaintext files, you cannot recover the original content of the other. Expressed mathematically:

 p1 XOR p2 = en 

You have one equation with two unknowns, so you cannot get anything meaningful from it.

-1