A cryptographic hash function will, in practice, give you a different number for each distinct input line, but the number is fairly large: 20 bytes (160 bits) in the case of SHA-1, for example. In principle, two different lines can have the same hash value, but the probability of that happening by accident is so small that it is usually treated as negligible.
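As a rough illustration of the hashing approach, here is a minimal Python sketch using the standard `hashlib` module. The function names `line_digest` and `line_number` are made up for this example, and it assumes the lines are text that can be encoded as UTF-8.

```python
import hashlib


def line_digest(line: str) -> bytes:
    """Return the 20-byte SHA-1 digest of a line (assumes UTF-8 text)."""
    return hashlib.sha1(line.encode("utf-8")).digest()


def line_number(line: str) -> int:
    """Interpret the 20-byte digest as one large (up to 160-bit) integer."""
    return int.from_bytes(line_digest(line), "big")


if __name__ == "__main__":
    for text in ("first line", "second line", "first line"):
        # Identical lines always produce the same number.
        print(text, "->", line_number(text))
```

The same line always maps to the same number, and no lookup table is needed, but the result is far too large to fit in an ordinary integer column.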
If you want a smaller number, say a 32-bit integer, then you cannot rely on a hash function alone, because the chance of a collision is too high: by the birthday bound, a collision among 32-bit values becomes more likely than not after only about 77,000 distinct lines. Instead, you need to keep a record of all the mappings you have created. Create a database table that associates lines with numbers, and each time you are given a line, look it up in the table. If you find it, return the associated number. If not, choose a new number that is not used by any existing entry, and add the line and its number to the table.
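The lookup-table approach could be sketched roughly as follows, using Python's built-in `sqlite3` module. The class name `LineIds` and the table name `line_ids` are invented for this example. It leans on SQLite's behaviour of assigning the next unused integer to an `INTEGER PRIMARY KEY` column, which is one convenient way to "choose a number that is not used yet"; other databases would need their own auto-increment mechanism.

```python
import sqlite3


class LineIds:
    """Maps each distinct line to a small integer, remembering past mappings."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        # UNIQUE on the line column guarantees one number per distinct line.
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS line_ids ("
            " id INTEGER PRIMARY KEY,"
            " line TEXT NOT NULL UNIQUE)"
        )

    def get_id(self, line: str) -> int:
        # Step 1: look the line up in the table.
        row = self.conn.execute(
            "SELECT id FROM line_ids WHERE line = ?", (line,)
        ).fetchone()
        if row is not None:
            return row[0]
        # Step 2: not found, so store it; SQLite picks an unused id.
        with self.conn:  # commits the insert
            cur = self.conn.execute(
                "INSERT INTO line_ids (line) VALUES (?)", (line,)
            )
        return cur.lastrowid


if __name__ == "__main__":
    ids = LineIds()
    for text in ("first line", "second line", "first line"):
        print(text, "->", ids.get_id(text))
```

Because the numbers are handed out sequentially, they stay within 32 bits as long as the count of distinct lines does. The sketch is single-process only; concurrent writers would need to handle the case where two of them try to insert the same line at once.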