How long can a hash left in the open be considered safe?

Question

If I were to leave a SHA-2 hash out on my site, how long would it be considered safe? How much time would I have before I could be sure that someone would find a collision for it and learn what was hashed?

I know that the amount of time depends on the computational power of whoever is trying to break it. It also depends on the length of the string, but I'm curious how safe hashes are.

As many of us run web servers, we must constantly be prepared for the day when someone makes it all the way to the database that stores the user hashes. So, set server security aside, and then what do you have?

This is a somewhat theoretical area for many of the people I have talked to, so I would like more information about average cracking expectations.

hash('sha256', 'mytext');
hash('sha256', 'thisismytext');
hash('sha256', ' xx$1sw@the4e ');
hash('sha256', 'thisismyslightlylongertext');

db695168e73ae294e9c4ea90ff593e211aa1b5693f49303a426148433400d23f
b62c6ac579abf8a29e71d98aeba1447c66c69002cfd847b148584f886fd297ef
501f1b26abbc75594d06f0935c8bc502d7bcccf5015227bd6ac95041770acb24
3debc12761bbeb5b4460978ac2be5b104163de02ea799f0705399d0e5b706334

+6

security sha hash

Xeoncross Jan 14 '11 at 2:48

3 answers

Tom's answer is already very detailed, but I would add the following criteria:

What is the advantage of breaking a hash?

Throw a dime on the street. How long will it take until someone picks it up?
Now leave a $20 bill and repeat the experiment.

If the value of what you are trying to protect is low, it is possible that no one will try to break the hash at all.

If the value and the advantage of breaking the hash are high, it will survive only as long as it takes to buy the necessary processing power from the Amazon cloud. They even offer GPUs now.

+4

rdmueller Jan 20 '11 at 21:32

You are assuming that a rainbow table targeting your setup will not be available, which is not a given. IMNSHO, consider the hash broken the moment it leaks. Even with bcrypt, you cannot be sure how much work was done before your hash became public.

0

TryPyPy Jan 14 '11 at 3:31

Thomas Pornin · Accepted Answer · 2011-01-14T13:26:27+0000

First, you are not talking about a collision. A collision is when someone finds two different messages that hash to the same value. Here you are not afraid that someone will find another input that hashes to the value you publish; rather, you are afraid that someone will find your input. The correct term is preimage attack. Sometimes we say that the attacker tries to “invert” the hash function (find the input corresponding to a given output).

There are two ways to look for a preimage of a given hash value: exploit a weakness in the hash function, or guess the input by trying candidates.

There is no known weakness of SHA-2 with respect to preimages. Come to think of it, there is no such known weakness for MD5 or even MD4, although these two functions are considered cryptographically broken. So, barring a tremendous advance in hash function research, it is unlikely that your hash value will be inverted through a cryptographic weakness of the hash function.

Trying candidates may or may not be feasible, depending on what the attacker knows about the input. This is quite difficult to model accurately. Suppose, for example, that the input is a single word of seven letters. There are 26⁷ = 8031810176 such words. Trying all of them with SHA-256, comparing each result with your hash value, takes a few minutes on a recent PC with a naive implementation.
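To make the "try every candidate" idea concrete, here is a minimal Python sketch (not part of the original answer). The function name `brute_force_sha256` and the lowercase-only alphabet are assumptions made for illustration, and the demo uses a 3-letter word so that it finishes instantly; running it at length 7 would walk through all 26⁷ candidates.

```python
import hashlib
import itertools
import string


def brute_force_sha256(target_hex, length, alphabet=string.ascii_lowercase):
    """Try every word of the given length; return the matching input or None."""
    for letters in itertools.product(alphabet, repeat=length):
        candidate = "".join(letters)
        digest = hashlib.sha256(candidate.encode("ascii")).hexdigest()
        if digest == target_hex:
            return candidate
    return None


# Size of the search space from the text: 26 letters, 7 positions.
print(26 ** 7)  # 8031810176

# Demo on a 3-letter word (26**3 = 17576 candidates).
target = hashlib.sha256(b"cat").hexdigest()
print(brute_force_sha256(target, 3))  # cat
```

An input outside the assumed set (a longer word, or one with digits) simply yields `None`, which is the whole point: the attack only works if the attacker's guess about the input set is right.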

More generally, exploring a set of possible inputs is called a dictionary attack, because it is often applied to the problem of recovering a user's password: users are depressingly unimaginative and often choose passwords from a limited set of, well, “words”, and it is logical to call this set of words a “dictionary”. We also call it "brute force" or "exhaustive search".

Assuming the dictionary is small enough for the attacker to realistically try all of its words, not only will your hash value eventually be inverted (if the attacker has enough incentive), but this also opens the way to cost sharing: the attacker can try to share his computational effort across several similar attack situations (i.e. several hash values to invert with the same hash function, which is again a common password-related attack model).

The basic cost-sharing method is to build a pre-computed table: the attacker computes all the hashes for his dictionary once; afterwards, every subsequent hash value can be attacked by simply looking it up in the table. The lookup is very fast (the attacker sorts his hashes in ascending order). Rainbow tables are a kind of pre-computed table, stored in a clever way that yields a compact representation: they let an attacker "keep" a large pre-computed table without needing a huge amount of disk space. However, rainbow or not, every value in the table (the one before compression, in the case of a rainbow table) must be computed at least once by some attacker, somewhere; that is, someone must be able to carry out a full dictionary attack. This carries two costs: the CPU cost (for computing all the hashes) and the storage cost (for storing the hash values). A rainbow table makes storage cheaper, but does not improve things on the CPU side.
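The cost-sharing idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a plain in-memory `dict` rather than a sorted file, it is not a rainbow table (there are no hash chains and no compression), and the names `build_table` and `sha256_hex` are made up for the example.

```python
import hashlib
import itertools
import string


def sha256_hex(text):
    """Hex SHA-256 digest of an ASCII string."""
    return hashlib.sha256(text.encode("ascii")).hexdigest()


def build_table(length, alphabet=string.ascii_lowercase):
    """Pay the CPU cost once: hash every word of the given length."""
    table = {}
    for letters in itertools.product(alphabet, repeat=length):
        word = "".join(letters)
        table[sha256_hex(word)] = word
    return table


table = build_table(3)  # 26**3 = 17576 entries, computed a single time

# Every later hash is attacked with one lookup instead of a new search.
leaked = [sha256_hex("cat"), sha256_hex("dog"), sha256_hex("longerpassword")]
for digest in leaked:
    print(table.get(digest))  # cat, dog, None
```

The table holds one digest per dictionary word, which is exactly the storage cost the answer mentions; a rainbow table trades some of that storage for extra computation at lookup time.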

Salting defeats pre-computed tables (including rainbow tables). It makes small dictionaries more tolerable. That is, if we assume that inverting one hash value is feasible, then the salt at least ensures that the attacker has to pay the full CPU cost of the dictionary attack every time, and cannot share that cost across several attacks or with other attackers. Passwords require salting because, in general, it has proven impossible to make average users choose and remember passwords from a large enough set of possible passwords.
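A small Python sketch of why a salt spoils a pre-computed table, under some loudly stated assumptions: the salt is 16 random bytes prepended to the password, the helper names `hash_with_salt` and `dictionary_attack` are hypothetical, and a single SHA-256 call is used only to mirror the discussion (it is not meant as a recommended password-storage scheme).

```python
import hashlib
import os


def sha256_hex(data):
    """Hex SHA-256 digest of a bytes value."""
    return hashlib.sha256(data).hexdigest()


def hash_with_salt(password, salt=None):
    """Return (salt, digest); a fresh random salt is drawn if none is given."""
    if salt is None:
        salt = os.urandom(16)
    return salt, sha256_hex(salt + password.encode("utf-8"))


def dictionary_attack(salt, target_hex, words):
    """The attacker must redo the whole dictionary for this particular salt."""
    for word in words:
        if sha256_hex(salt + word.encode("utf-8")) == target_hex:
            return word
    return None


words = ["cat", "dog", "password"]
# Table pre-computed over unsalted hashes of the dictionary.
table = {sha256_hex(w.encode("utf-8")): w for w in words}

salt_a, hash_a = hash_with_salt("password")
salt_b, hash_b = hash_with_salt("password")

print(hash_a in table)    # False: the unsalted table is useless
print(hash_a == hash_b)   # False: same password, different salts
print(dictionary_attack(salt_a, hash_a, words))  # password
```

The last line shows the limit of salting: a weak password still falls to a dictionary attack, but the work done against `salt_a` does nothing for `salt_b`.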

It is still much better if the set your input is drawn from is large enough to defeat a single brute force effort. What matters is the size of the set of values that your input string may take; this set should be estimated from what the attacker knows about the data being attacked. For example, if an attacker tries to find a user's password, he knows that the input string is short (users have little patience) and consists only of characters that can be typed (blindly!) on a keyboard; he also knows that the sequence can be memorized, which makes things like ".%f*(.ds/~\d09j@" highly improbable. There is no limit on the input size itself; when we say that rainbow tables are limited to "15 characters or so", it is because users who are willing to type more than 15 characters also choose passwords from a set too large to allow the one-time brute force effort needed to build the table. Note that trying all 15-character sequences is already far too much (even all sequences of 15 lowercase letters amount to more than 2⁷⁰ hash computations, which is not really feasible with today's technology).
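The 2⁷⁰ figure is easy to check with a few lines of Python. The helper names are made up for this sketch, and the rate of 10⁹ hashes per second is purely a hypothetical assumption for illustration, not a measured figure for any hardware.

```python
import math


def search_space_bits(alphabet_size, length):
    """Size of the search space expressed as a power of two (in bits)."""
    return length * math.log2(alphabet_size)


def years_to_exhaust(alphabet_size, length, hashes_per_second):
    """Time to try every candidate at a given (assumed) hash rate."""
    seconds = alphabet_size ** length / hashes_per_second
    return seconds / (365 * 24 * 3600)


# 15 lowercase letters: a little over 70 bits, i.e. more than 2**70 hashes.
print(search_space_bits(26, 15))
print(26 ** 15 > 2 ** 70)  # True

# Hypothetical attacker computing 10**9 hashes per second (an assumption).
print(years_to_exhaust(26, 15, 10 ** 9))
```

For comparison, the seven-letter example from earlier in the answer is only about 33 bits, which is why it falls in minutes.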
