Java Map::hashCode() collision - why?

Question

The following code produces the same hash code for two different maps. Any ideas why?

    import java.util.HashMap;
    import java.util.Map;

    public class Foo {
        @SuppressWarnings("unchecked")
        public static void main (String[] args) {
            Map map;

            map = new HashMap();
            map.put("campaignId", 4770L);
            map.put("location", "MINI_PROFILE");
            map.put("active", "true");
            map.put("lazy", true);
            System.out.println(map.hashCode());

            map = new HashMap();
            map.put("campaignId", 4936L);
            map.put("location", "MINI_PROFILE");
            map.put("active", "true");
            map.put("lazy", false);
            System.out.println(map.hashCode());
        }
    }

Result:

    -1376467648
    -1376467648

Simply changing the key names is enough for the code to generate two different hash codes.

+4

java hashcode map collision

Hisham Sep 2 '10 at 20:18

4 answers

I think this is just a coincidence. From the Javadoc for AbstractMap#hashCode():

The hash code of a map is defined to be the sum of the hash codes of each entry in the map's entrySet() view.

And for Map.Entry#hashCode():

Returns the hash code value for this map entry. The hash code of a map entry e is defined to be:

  (e.getKey()==null ? 0 : e.getKey().hashCode()) ^ (e.getValue()==null ? 0 : e.getValue().hashCode())

Thus, a map's hash code depends on both the keys and the values it contains. You have simply hit an odd case where two different maps end up with the same hash code.
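The two Javadoc definitions quoted above can be checked by hand. The following is only a minimal sketch: the class name MapHashSum and the helpers sumOfEntryHashes and campaign are made up for illustration and are not part of the JDK. It recomputes the sum of the entry hashes and compares it with what HashMap reports for the two maps from the question.

```java
import java.util.HashMap;
import java.util.Map;

class MapHashSum {
    // Recomputes the map hash as the Javadoc describes it:
    // the sum over all entries of (key hash XOR value hash).
    static int sumOfEntryHashes(Map<?, ?> map) {
        int sum = 0;
        for (Map.Entry<?, ?> e : map.entrySet()) {
            int keyHash = (e.getKey() == null) ? 0 : e.getKey().hashCode();
            int valueHash = (e.getValue() == null) ? 0 : e.getValue().hashCode();
            sum += keyHash ^ valueHash; // int overflow simply wraps around
        }
        return sum;
    }

    // Builds a map shaped like the ones in the question (hypothetical helper).
    static Map<String, Object> campaign(long id, boolean lazy) {
        Map<String, Object> map = new HashMap<>();
        map.put("campaignId", id);
        map.put("location", "MINI_PROFILE");
        map.put("active", "true");
        map.put("lazy", lazy);
        return map;
    }

    public static void main(String[] args) {
        Map<String, Object> first = campaign(4770L, true);
        Map<String, Object> second = campaign(4936L, false);

        // The manual sum matches the built-in hashCode for both maps.
        System.out.println(sumOfEntryHashes(first) == first.hashCode());
        System.out.println(sumOfEntryHashes(second) == second.hashCode());

        // The maps are not equal, yet their hash codes collide.
        System.out.println(first.equals(second));
        System.out.println(first.hashCode() == second.hashCode());
    }
}
```

Because the per-entry hashes are simply added together, a change in one entry can be cancelled exactly by an opposite change in another entry.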

+6

Vivien Barousse Sep 2 '10 at 20:24

Collisions happen. In fact, you could override hashCode() to always return 0 for every HashMap, and that would still be correct (although it would make many data structures slower).
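As a rough illustration of that point, here is a sketch with a hypothetical HashMap subclass, ZeroHashMap, whose hashCode() is constant. It assumes such maps are only compared with other instances of the same subclass. Hash-based collections still give correct answers because they fall back on equals() when hashes collide; every element just lands in the same bucket.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class ZeroHashDemo {
    // Hypothetical subclass: every instance reports the same hash code.
    // Equal maps still have equal hashes, so lookups stay correct,
    // as long as these maps are not mixed with ordinary HashMap instances.
    static class ZeroHashMap<K, V> extends HashMap<K, V> {
        @Override
        public int hashCode() {
            return 0;
        }
    }

    // Convenience factory for a single-entry map (illustrative only).
    static Map<String, Integer> mapOf(String key, int value) {
        Map<String, Integer> m = new ZeroHashMap<>();
        m.put(key, value);
        return m;
    }

    public static void main(String[] args) {
        Set<Map<String, Integer>> set = new HashSet<>();
        set.add(mapOf("a", 1));
        set.add(mapOf("b", 2));
        set.add(mapOf("a", 1)); // equal to the first element, so not added again

        System.out.println(set.size());                  // 2
        System.out.println(set.contains(mapOf("b", 2))); // true
        System.out.println(set.contains(mapOf("c", 3))); // false
    }
}
```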

+3

gpeche Sep 2 '10 at 20:28

This is not a coincidence.

String objects are the same in both. The same object will give the same hash code.

0

amit Jun 21 '13 at 12:07

Jon Skeet · Accepted Answer · 2010-09-02T20:23:14+0000

Just a coincidence, I suspect. There are bound to be collisions, and in this case it looks like the relevant differing bits in the first value are effectively being lost.

However, this should not make any difference: anything that uses hash codes must cope with collisions.

EDIT: It is just the way the hashes happen to be calculated. This code shows what is going on:

    import java.util.*;

    public class Test {
        @SuppressWarnings("unchecked")
        public static void main (String[] args) {
            AbstractMap.SimpleEntry[] entries = {
                new AbstractMap.SimpleEntry("campaignId", 4770L),
                new AbstractMap.SimpleEntry("campaignId", 4936L),
                new AbstractMap.SimpleEntry("lazy", true),
                new AbstractMap.SimpleEntry("lazy", false)
            };
            for (AbstractMap.SimpleEntry entry : entries) {
                System.out.println(entry + ": " + entry.hashCode());
            }
        }
    }

Results:

    campaignId=4770: -1318251287
    campaignId=4936: -1318251261
    lazy=true: 3315643
    lazy=false: 3315617

So in one pair of entries (campaignId), the first map's entry has a hash 26 less than the second map's entry, and in the other pair (lazy), the first map's entry has a hash 26 more than the second map's entry.

AbstractMap simply sums the entry hashes (one way to make sure the ordering doesn't matter), so the two differences cancel out and both maps end up with the same hash code.

It comes down to Boolean.hashCode(), which looks like this:

 return value ? 1231 : 1237;

... and Long.hashCode(), which looks like this:

 return (int)(value ^ (value >>> 32));

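Given the values chosen in Boolean.hashCode(), you will run into the same thing whenever the entry hashes for your two long values end up 26 apart in the opposite direction, as they do here. (Long.hashCode() folds the high 32 bits into the low 32 bits, so a difference in the high half of the value can have the same effect.)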
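The cancellation can be reproduced with a few lines of arithmetic. This is only a sketch: the class name EntryHashDelta and the helper entryHash are made up for illustration, and entryHash mirrors the Map.Entry formula for non-null keys and values.

```java
class EntryHashDelta {
    // Same formula as Map.Entry#hashCode() for non-null key and value.
    static int entryHash(Object key, Object value) {
        return key.hashCode() ^ value.hashCode();
    }

    public static void main(String[] args) {
        // The two Boolean hash constants differ by the XOR mask 26.
        System.out.println(Boolean.hashCode(true) ^ Boolean.hashCode(false)); // 26

        // For small positive values, Long.hashCode is just the value itself.
        System.out.println(Long.hashCode(4770L)); // 4770
        System.out.println(Long.hashCode(4936L)); // 4936

        // Differences between the entry hashes, first map minus second map.
        int campaignDelta = entryHash("campaignId", 4770L) - entryHash("campaignId", 4936L);
        int lazyDelta = entryHash("lazy", true) - entryHash("lazy", false);
        System.out.println(campaignDelta);             // -26
        System.out.println(lazyDelta);                 // 26
        System.out.println(campaignDelta + lazyDelta); // 0
    }
}
```

The other two entries (location and active) are identical in both maps, so the total difference between the two map hash codes is zero.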
