MD5 is a hash, as you know, so if you give it an input, for example "PASSWORD", you get an output that is unique to that input (hopefully; MD5 is known to have collisions these days), for example "3DE2AF ...".
Now, as you know, it's quite difficult to reverse this directly, until someone thinks: wait, why don't I generate every possible input along with its hash, so that I can look a hash up and reverse it? This is called a rainbow table.
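To make the idea concrete, here is a minimal sketch in Python. It builds a plain exhaustive lookup table, which is the simple idea behind a rainbow table; a real rainbow table uses hash chains to trade computation for storage, and that part is not shown. The alphabet, the maximum length, and the helper names `md5_hex` and `build_lookup` are my own assumptions for illustration.

```python
import hashlib
import string
from itertools import product


def md5_hex(text):
    """Return the hex MD5 digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def build_lookup(alphabet, max_len):
    """Map hash -> input for every string over `alphabet` up to `max_len` chars."""
    table = {}
    for length in range(1, max_len + 1):
        for chars in product(alphabet, repeat=length):
            candidate = "".join(chars)
            table[md5_hex(candidate)] = candidate
    return table


# Assumed toy search space: lowercase letters, 1 to 3 characters.
table = build_lookup(string.ascii_lowercase, 3)

# "Reversing" a hash is now just a dictionary lookup.
print(table.get(md5_hex("cat")))       # found: inside the search space
print(table.get(md5_hex("PASSWORD")))  # None: outside the search space
```

Anything outside the precomputed space simply isn't in the table, which is why input length matters so much in what follows.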
The purpose of a salt is to add arbitrary random data to the string being hashed, so that the input to the hash becomes longer. This means that generic rainbow tables, which expect to reverse a hash of just the password, will not work. Of course, rainbow tables are just reverse lookups, so you could simply generate a rainbow table that covers all possible outputs of password + salt. This is where the increase in length comes into its own: by the nature of reversing hashes, the disk space needed to generate reverse lookups for very long hash inputs soon becomes impractical. Rainbow tables for 6-8 character alphanumerics are already a couple of gigabytes; increase the length and the character classes and you begin to talk in tens of gigabytes.
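A rough way to see the growth is to count candidate inputs. The sketch below only counts inputs; it does not estimate the on-disk size of a real rainbow table, which depends on how its chains are built. The function name `keyspace` and the 16-byte salt are assumptions for illustration.

```python
import string


def keyspace(alphabet_size, min_len, max_len):
    """Count the strings of length min_len..max_len over an alphabet."""
    return sum(alphabet_size ** n for n in range(min_len, max_len + 1))


alnum = len(string.ascii_letters + string.digits)      # 62 symbols
printable = alnum + len(string.punctuation)            # 94 symbols

print(keyspace(alnum, 6, 8))       # 6-8 character alphanumerics
print(keyspace(alnum, 6, 10))      # a longer maximum length
print(keyspace(printable, 6, 10))  # more character classes as well

# An unknown random 16-byte salt (assumed size) multiplies the inputs
# a generic table would have to cover by 256**16.
print(keyspace(alnum, 6, 8) * 256 ** 16)
```

Each extra character multiplies the count by the alphabet size, so the numbers get out of hand quickly.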
Of course, if your salt is "PASSWORD" and your password is "PASSWORD", you end up hashing "PASSWORDPASSWORD", which is not much safer, so the choice of salt matters too. Ideally, you should use a random salt with each hashed string, but of course you then need to store it so you know what it is. A common technique is to derive the salt from the username or some other property unique to that case. Adding arbitrary data is not useful in itself; having per-user salt data adds an extra level of complexity, in that an attacker needs rainbow tables built specifically for each user. The harder you make this, the more processing power is needed. That is where the battle is.
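As a minimal sketch of the random-salt-per-hash approach, assuming a 16-byte salt from `os.urandom` and SHA-256 as the hash (both are my choices, not something prescribed above), with hypothetical helper names `hash_password` and `verify_password`:

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None):
    """Hash salt + password; generate a fresh random salt if none is given."""
    if salt is None:
        salt = os.urandom(16)  # assumed salt size
    digest = hashlib.sha256(salt + password.encode("utf-8")).hexdigest()
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


# Two users with the same password get different stored hashes.
salt_a, hash_a = hash_password("PASSWORD")
salt_b, hash_b = hash_password("PASSWORD")
print(hash_a != hash_b)

# The salt is stored next to the hash so the check can be repeated later.
print(verify_password("PASSWORD", salt_a, hash_a))
print(verify_password("password", salt_a, hash_a))
```

The salt is not a secret; it is stored alongside the hash. Its job is to make a table built for one user useless against another.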
However, there are some more modern techniques. I am not an expert, so I can't tell you how safe they are, but they are worth mentioning. The concept is slow hashing. Basically, by using deliberately expensive hash functions, you make each hash take a while to compute. Each legitimate password check therefore costs a small, constant amount of extra time, but so does every guess an attacker wants to test. If you're brute forcing, this is Bad News (tm). Similarly, if the system is well designed and there are no shortcuts (which would probably amount to weaknesses), then generating a rainbow table for a slow hash function should also take a long time.
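Here is a rough sketch of the cost difference, using PBKDF2 through Python's `hashlib.pbkdf2_hmac` as one example of a slow hash. The iteration count is an assumed value chosen only for the demonstration, not a recommendation, and `fast_hash`, `slow_hash` and `time_call` are hypothetical helper names.

```python
import hashlib
import os
import time

ITERATIONS = 200_000  # assumed work factor, for illustration only


def fast_hash(password, salt):
    """A single salted SHA-256: cheap for the server and for the attacker."""
    return hashlib.sha256(salt + password.encode("utf-8")).digest()


def slow_hash(password, salt, iterations=ITERATIONS):
    """PBKDF2-HMAC-SHA256: the iteration count sets the cost of every guess."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def time_call(func, *args):
    """Return the wall-clock seconds taken by one call."""
    start = time.perf_counter()
    func(*args)
    return time.perf_counter() - start


salt = os.urandom(16)
print("fast:", time_call(fast_hash, "PASSWORD", salt))
print("slow:", time_call(slow_hash, "PASSWORD", salt))
```

A single login barely notices the difference, but someone testing millions of candidates pays that cost on every one of them.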
Edit More detail here. See crypt() for a first example of this. @CodeInChaos referenced PBKDF2, which forms part of PKCS #5. A newer development is scrypt.
As I said, I'm not an expert cryptanalyst. On that last example, I have no particular knowledge of its suitability; I'm just showing you where things are heading.
Edit 2 Clarified my write-up on salt - I think I was previously dancing around the key issue of disk space.