Overriding hashCode with overridden equals using equalsIgnoreCase to check equality

Question

Currently, I have an overridden equals(Object) that looks like this:

    @Override
    public boolean equals(Object o) {
        if (o == this) return true;
        if (!(o instanceof Player)) return false;
        Player p = (Player) o;
        return getFirstName().equalsIgnoreCase(p.getFirstName())
            && getLastName().equalsIgnoreCase(p.getLastName());
    }

Now my hashCode() looks like this:

    @Override
    public int hashCode() {
        int result = 17;
        result = 31 * result + getFirstName().toLowerCase().hashCode();
        result = 31 * result + getLastName().toLowerCase().hashCode();
        return result;
    }

My question is about my overridden hashCode() method. I know that hashCode() must return the same value for two objects if the equals(Object) method considers them equal. My gut tells me that there is some case where this hashCode() violates the contract.

Is there an acceptable way to use the equalsIgnoreCase(String) method in an overridden equals(Object) method and still generate a hash code that does not violate the contract?

+7

java equals override hashcode

Jazzer Mar 26 '13 at 3:23

4 answers

You're right. We can iterate over all one-char strings and find the pairs s1, s2 for which s1.equalsIgnoreCase(s2) && !s1.toLowerCase().equals(s2.toLowerCase()). There are quite a few such pairs. For example:

    s1 = 0049 'LATIN CAPITAL LETTER I'
    s2 = 0131 'LATIN SMALL LETTER DOTLESS I'
    s1.lowercase = 0069 'LATIN SMALL LETTER I'
    s2.lowercase = 0131 (itself)

It is also locale-dependent: for s1, Turkish and Azerbaijani use U+0131 as the lowercase form (see http://www.fileformat.info/info/unicode/char/0049/index.htm).
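The search described above can be sketched as a scan over single-char strings. This is a narrowed version: it assumes it is enough to pair each char with its own Character.toUpperCase form rather than testing every possible pair. The class name CaseMismatchScan and the use of Locale.ROOT (so the result does not depend on the default locale) are illustrative choices, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class CaseMismatchScan {

    /**
     * Returns every char c whose upper-case form u is equalsIgnoreCase to c
     * while the lower-cased one-char strings of u and c differ.
     */
    static List<Character> mismatches() {
        List<Character> found = new ArrayList<>();
        for (int i = Character.MIN_VALUE; i <= Character.MAX_VALUE; i++) {
            char c = (char) i;
            char u = Character.toUpperCase(c);
            if (u == c) {
                continue; // no distinct upper-case partner to compare against
            }
            String s1 = String.valueOf(u);
            String s2 = String.valueOf(c);
            // Locale.ROOT is an assumption made here for reproducibility.
            boolean sameLower = s1.toLowerCase(Locale.ROOT).equals(s2.toLowerCase(Locale.ROOT));
            if (s1.equalsIgnoreCase(s2) && !sameLower) {
                found.add(c);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        for (char c : mismatches()) {
            System.out.printf("U+%04X vs U+%04X%n", (int) Character.toUpperCase(c), (int) c);
        }
    }
}
```

The list should include U+0131, paired with U+0049, which is the example given above.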

+1

ZhongYu Mar 26 '13 at 4:09

You are right to worry. Read the contract for equalsIgnoreCase:

Two characters c1 and c2 are considered the same ignoring case if at least one of the following is true:

The two characters are the same (as compared by the == operator)
Applying the method Character.toUpperCase(char) to each character produces the same result
Applying the method Character.toLowerCase(char) to each character produces the same result

So, if there is ever a pair of characters that are equal when converted to upper case but not when converted to lower case, you will be in trouble.
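The conditions from the contract can be sketched as separate predicates, which makes it easy to see which one a given pair satisfies. The class and method names below are hypothetical helpers, not JDK methods, and the sample pair (I and dotless ı) is the one mentioned elsewhere in this thread.

```java
final class IgnoreCaseConditions {

    /** Condition 2 of the contract: the upper-case forms are equal. */
    static boolean sameUpper(char c1, char c2) {
        return Character.toUpperCase(c1) == Character.toUpperCase(c2);
    }

    /** Condition 3 of the contract: the lower-case forms are equal. */
    static boolean sameLower(char c1, char c2) {
        return Character.toLowerCase(c1) == Character.toLowerCase(c2);
    }

    /** The documented rule: any one of the three conditions is enough. */
    static boolean sameIgnoringCase(char c1, char c2) {
        return c1 == c2 || sameUpper(c1, c2) || sameLower(c1, c2);
    }

    public static void main(String[] args) {
        char capitalI = '\u0049'; // LATIN CAPITAL LETTER I
        char dotlessI = '\u0131'; // LATIN SMALL LETTER DOTLESS I
        System.out.println(sameUpper(capitalI, dotlessI));        // true
        System.out.println(sameLower(capitalI, dotlessI));        // false
        System.out.println(sameIgnoringCase(capitalI, dotlessI)); // true
    }
}
```

The pair counts as equal through the upper-case condition alone, so a hash code built only from lower-case forms gives the two characters different values.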

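Take the example of the German character ß, which String.toUpperCase() turns into the two-character sequence SS. The strings ß and SS look as though they ought to match, but equalsIgnoreCase compares char by char and requires equal lengths, so it reports them as different, and their lowercase forms (ß and ss) differ as well. String-level and character-level case mapping do not agree with each other, and there are single-character pairs, such as I (U+0049) and ı (U+0131), that are equalsIgnoreCase yet do not have the same representation when converted to lower case!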

So your approach is broken here. Unfortunately, I'm not sure that you can create a hash code that adequately expresses your need here.

+1

Steven Schlansker Mar 26 '13 at 4:10

To write a hashCode() that is consistent with equals(), you should use either Character-based case mapping in both, or String-based case mapping in both. In my other answer, I showed how to write hashCode() using Character-based case mapping; but there is another solution, which is to change equals() instead so that it uses String-based case mapping. (Note that String.equalsIgnoreCase() uses Character-based case mapping.)

    @Override
    public boolean equals(Object o) {
        if (o == this) return true;
        if (!(o instanceof Player)) return false;
        Player p = (Player) o;
        return getFirstName().toLowerCase().equals(p.getFirstName().toLowerCase())
            && getLastName().toLowerCase().equals(p.getLastName().toLowerCase());
    }
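Putting the String-based equals() together with a matching hashCode() might look like the sketch below. The fields, constructor, and the norm helper are assumptions made so the example is self-contained; the original Player class is not shown in the question. It also passes Locale.ROOT to toLowerCase, which is a deliberate departure from the no-argument call above so that the result does not change with the default locale.

```java
import java.util.Locale;

final class Player {
    private final String firstName; // assumed field
    private final String lastName;  // assumed field

    Player(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    String getFirstName() { return firstName; }
    String getLastName() { return lastName; }

    // One normalization shared by equals and hashCode keeps them consistent.
    private static String norm(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    @Override
    public boolean equals(Object o) {
        if (o == this) return true;
        if (!(o instanceof Player)) return false;
        Player p = (Player) o;
        return norm(getFirstName()).equals(norm(p.getFirstName()))
            && norm(getLastName()).equals(norm(p.getLastName()));
    }

    @Override
    public int hashCode() {
        int result = 17;
        result = 31 * result + norm(getFirstName()).hashCode();
        result = 31 * result + norm(getLastName()).hashCode();
        return result;
    }

    public static void main(String[] args) {
        Player a = new Player("JOHN", "Smith");
        Player b = new Player("john", "SMITH");
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
    }
}
```

Because both methods go through the same normalization, two players that compare equal always hash to the same value. The trade-off is that names differing only by I versus ı no longer compare equal, unlike with equalsIgnoreCase.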

+1

Robert Tupelo-Schneck May 08 '13 at 14:53

Robert Tupelo-Schneck · Accepted Answer · 2013-05-07T14:57:23+0000

    @Override
    public int hashCode() {
        int result = 17;
        result = 31 * result + characterwiseCaseNormalize(getFirstName()).hashCode();
        result = 31 * result + characterwiseCaseNormalize(getLastName()).hashCode();
        return result;
    }

    // Maps each char to upper case and then to lower case, one char at a time.
    private static String characterwiseCaseNormalize(String s) {
        StringBuilder sb = new StringBuilder(s);
        for (int i = 0; i < sb.length(); i++) {
            sb.setCharAt(i, Character.toLowerCase(Character.toUpperCase(sb.charAt(i))));
        }
        return sb.toString();
    }

This hashCode will be consistent with an equals defined using equalsIgnoreCase. In principle, according to the contract of equalsIgnoreCase, this seems to rely on it being the case that

 Character.toLowerCase(Character.toUpperCase(c1))==Character.toLowerCase(Character.toUpperCase(c2))

whenever

 Character.toLowerCase(c1)==Character.toLowerCase(c2).

I have no proof that this is true, but the OpenJDK implementation of equalsIgnoreCase actually behaves consistently with this method: it checks whether the corresponding characters are equal, then whether their upper-case versions are equal, then whether the lower-case versions of their upper-case versions are equal.
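The three-step check described above can be sketched per character as follows. This is a simplification written from the description in this answer, not the OpenJDK source, and the class and method names are hypothetical. Each step that returns true also forces the normalized chars to be equal, which is why a hash built on that normalization agrees with this comparison.

```java
final class CharwiseConsistency {

    /** Per-character comparison following the three steps described above. */
    static boolean matchesIgnoringCase(char c1, char c2) {
        if (c1 == c2) return true;
        char u1 = Character.toUpperCase(c1);
        char u2 = Character.toUpperCase(c2);
        if (u1 == u2) return true;
        return Character.toLowerCase(u1) == Character.toLowerCase(u2);
    }

    /** The same normalization as characterwiseCaseNormalize, for a single char. */
    static char normalize(char c) {
        return Character.toLowerCase(Character.toUpperCase(c));
    }

    /** Checks, for all chars below limit, that a match implies equal normalized chars. */
    static boolean consistentBelow(int limit) {
        for (int i = 0; i < limit; i++) {
            for (int j = 0; j < limit; j++) {
                char c1 = (char) i;
                char c2 = (char) j;
                if (matchesIgnoringCase(c1, c2) && normalize(c1) != normalize(c2)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // 0x0250 is an arbitrary cutoff to keep the pairwise loop small.
        System.out.println(consistentBelow(0x0250)); // true
    }
}
```

This only covers the comparison as sketched here; whether the documented lower-case condition ever holds without the other two is the open point mentioned above.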
