I am currently playing with hashing and key generation, trying to create my own hash key generator.
At the moment I have a list of 90,000 lines (one word per line, each word different). What is the best way to generate keys for them (numeric keys, not string keys)?
Currently, I take the last ASCII character of each word and do a calculation based on the value of that letter.
As a result, about 50% of the words generate a key that collides with another.
I then use quadratic probing to find a place in the table for the words that collide.
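To make the setup concrete, here is a minimal sketch of this kind of scheme: a key that depends only on the last character, with quadratic probing to resolve collisions. The function names, the multiplier 31, the table size of 11, and the sample words are all assumptions for illustration, not the exact calculation described above.

```python
def last_char_key(word: str, table_size: int) -> int:
    # Hypothetical key: depends only on the ASCII code of the last character,
    # so every word with the same final letter gets the same home slot.
    return (ord(word[-1]) * 31) % table_size


def insert(table: list, word: str) -> int:
    """Insert word using quadratic probing; return the number of extra probes."""
    size = len(table)
    home = last_char_key(word, size)
    for i in range(size):
        slot = (home + i * i) % size  # probe offsets 0, 1, 4, 9, ...
        if table[slot] is None or table[slot] == word:
            table[slot] = word
            return i
    raise RuntimeError("no free slot found along the probe sequence")


table = [None] * 11  # small prime size, chosen only for the demo
words = ["apple", "grape", "olive", "melon"]
probes = [insert(table, w) for w in words]
print(probes)
```

Three of the four sample words end in "e", so they share one home slot and each later one needs more probes. With only a single character feeding the key, there are at most a few dozen distinct home slots no matter how large the table is, which is consistent with a high collision rate.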
My question, as above: what is generally the best way to generate a key for 90,000 different words? I know that the larger the data set, the more likely collisions become, but how would you suggest I minimize them?
Edit: Also, I don't need a cryptographic hash; it just needs to be fast.
Thanks.