I store a large number of objects (each with a unique combination of values held in a byte array inside the object) in a HashMap (~2.8 million objects). When I check for hash collisions (32-bit hash), I am very surprised to find none, even though statistically I have an almost 100% chance of at least one collision (see http://preshing.com/20110504/hash-collision-probabilities/ ).
I'm wondering whether my approach to detecting collisions is broken or whether I'm just very lucky ...
This is how I try to detect collisions among the 2.8 million keys stored in the map:
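For reference, here is a minimal sketch of the estimate I am relying on. It uses the usual birthday approximation and assumes the hash values are spread uniformly over all 2^32 possible ints, which may not hold for my data. The class and method names are made up for this illustration only.

```java
public class CollisionEstimate {
    // Probability of at least one collision among n keys hashed uniformly
    // into 2^bits values (birthday approximation):
    // 1 - exp(-n(n-1) / (2 * 2^bits)).
    static double collisionProbability(long n, int bits) {
        double buckets = Math.pow(2.0, bits);
        // -expm1(-x) equals 1 - exp(-x) but stays accurate for small x.
        return -Math.expm1(-(double) n * (n - 1) / (2.0 * buckets));
    }

    // Expected number of colliding pairs under the same uniformity assumption.
    static double expectedCollidingPairs(long n, int bits) {
        return (double) n * (n - 1) / (2.0 * Math.pow(2.0, bits));
    }

    public static void main(String[] args) {
        long n = 2_800_000L; // number of keys in my map
        System.out.printf("P(at least one collision) = %.6f%n", collisionProbability(n, 32));
        System.out.printf("Expected colliding pairs  = %.1f%n", expectedCollidingPairs(n, 32));
    }
}
```

For 2.8 million keys and a 32-bit hash, this puts the expected number of colliding pairs on the order of 900, which is why finding zero surprises me.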
    HashMap<ShowdownFreqKeysVO, Double> values;
    (... filled with 2.8 million unique keys ...)
    HashSet<Integer> hashes = new HashSet<>();
    for (ShowdownFreqKeysVO key : values.keySet()) {
        // If this hash code was already seen, two distinct keys share it.
        if (hashes.contains(key.hashCode())) throw new RuntimeException("Duplicate hash for:" + key);
        hashes.add(key.hashCode());
    }
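To rule out the detection loop itself, a sanity check along these lines could help. This is only a sketch: it replaces my real keys with random 12-byte arrays (an assumption, my actual values are not random) and hashes them with Arrays.hashCode directly. My hashCode() only adds a constant offset on top of that, so the collision count should be the same. The class name, method name and seed are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

public class CollisionCheck {
    // Counts keys whose Arrays.hashCode value was already produced by an earlier key.
    static int countHashCollisions(int n, long seed) {
        Random random = new Random(seed);
        Set<Integer> hashes = new HashSet<>();
        int collisions = 0;
        for (int i = 0; i < n; i++) {
            byte[] key = new byte[12];
            // Random 96-bit key; two identical keys are negligibly unlikely.
            random.nextBytes(key);
            // Set.add returns false when the value is already present.
            if (!hashes.add(Arrays.hashCode(key))) {
                collisions++;
            }
        }
        return collisions;
    }

    public static void main(String[] args) {
        // Same key count as my map; needs enough heap for 2.8 million boxed Integers.
        System.out.println("Collisions: " + countHashCollisions(2_800_000, 42L));
    }
}
```

If random keys do collide here while my real keys do not, that would suggest the detection loop works and the difference comes from the structure of my key data, though I have not confirmed that.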
And here is how the object computes its hash value:
    import java.util.Arrays;

    public class ShowdownFreqKeysVO {
        // The 12 bytes that make up the unique key.
        public byte[] values = new byte[12];

        @Override
        public int hashCode() {
            // Hash is derived only from the contents of the byte array.
            final int prime = 31;
            int result = 1;
            result = prime * result + Arrays.hashCode(values);
            return result;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj)
                return true;
            if (obj == null)
                return false;
            if (getClass() != obj.getClass())
                return false;
            ShowdownFreqKeysVO other = (ShowdownFreqKeysVO) obj;
            // Two keys are equal when their byte arrays have the same contents.
            if (!Arrays.equals(values, other.values))
                return false;
            return true;
        }
    }
Any idea or hint about what I'm doing wrong would be greatly appreciated!
Thank you, Thomas