Is there a performance issue with overriding hashCode() in Java?

If I override the hashCode() method, will this degrade the performance of the application? I override this method in many places in my application.

+7
java
8 answers

Yes, you can degrade the performance of a hashed collection if the hashCode method is poorly implemented. The ideal implementation generates a unique hash code for each unique object. Unique hash codes avoid collisions, so an element can be stored and found with O(1) complexity. But the hashCode method alone cannot do this; you need to override the equals method as well, so that the collection can tell apart objects that share a hash code.

If the hashCode method cannot generate a unique hash for unique objects, a bucket may end up holding more than one object. This happens when two elements have the same hash but the equals method returns false for them. Every time this happens, the element is added to the list in that hash bucket. This slows down both insertion and retrieval of elements, and leads to O(n) complexity for the get method, where n is the size of the list in the bucket.
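The effect of collisions can be made visible by counting how often equals() has to run. The following is only a minimal sketch: the Key class, its constantHash flag, and the static counter are hypothetical names made up for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

class CollisionDemo {
    // Counts equals() invocations; a static field keeps this sketch short.
    static long equalsCalls = 0;

    // Hypothetical key type: with constantHash set, every key lands in one bucket.
    static final class Key {
        final int id;
        final boolean constantHash;

        Key(int id, boolean constantHash) {
            this.id = id;
            this.constantHash = constantHash;
        }

        @Override
        public boolean equals(Object o) {
            equalsCalls++;
            return o instanceof Key && ((Key) o).id == id;
        }

        @Override
        public int hashCode() {
            return constantHash ? 1 : Integer.hashCode(id);
        }
    }

    // Inserts n distinct keys and returns how many equals() calls that required.
    static long equalsCallsForInsert(int n, boolean constantHash) {
        equalsCalls = 0;
        Map<Key, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new Key(i, constantHash), i);
        }
        return equalsCalls;
    }

    // Colliding keys are slow, but lookups must still return the right values.
    static boolean lookupsStillCorrect(int n) {
        Map<Key, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new Key(i, true), i);
        }
        for (int i = 0; i < n; i++) {
            Integer value = map.get(new Key(i, true));
            if (value == null || value != i) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("equals() calls, distinct hashes: " + equalsCallsForInsert(1000, false));
        System.out.println("equals() calls, constant hash:   " + equalsCallsForInsert(1000, true));
        System.out.println("lookups still correct:           " + lookupsStillCorrect(200));
    }
}
```

With distinct hash codes the map rarely needs equals() at all during insertion, while the constant-hash variant has to compare against keys already sitting in the same bucket. The map stays correct in both cases; only the amount of work changes.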

Note: When trying to create a unique hash for unique objects, make sure the algorithm stays simple. If your hash generation algorithm is too heavy, you are likely to see poor performance for operations on your hashed collection, because the hashCode method is called for most hash collection operations.

+6

Overriding hashCode() can actually improve performance if the correct data structure is used in the right place.

For example, a correct implementation of hashCode() in your class can turn an O(N) lookup into a nearly O(1) lookup in a HashMap, as long as you don't perform overly complicated operations in the hashCode() method.

The hashCode() method is called every time a hash-based data structure has to deal with your object, so a heavy hashCode() method (which it should not be) will slow those operations down.

+3

It all depends on how you implement hashCode. If you are doing a lot of expensive deep operations, then it might, in which case you should consider caching the computed hash code (as String does). But a decent implementation, such as one built with HashCodeBuilder, will not be a big problem. With a good hashCode value, you can search much faster in data structures like HashMap and HashSet, and if you override equals, you need to override hashCode anyway.
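As a rough illustration of the caching idea, here is a sketch of an immutable class that computes its hash code lazily and remembers it. The Document class and its computations counter are hypothetical and exist only for this example; the "0 means not computed yet" convention is an assumption of the sketch, so a value whose real hash is 0 would simply be recomputed on each call.

```java
import java.util.Arrays;

class CachedHashDemo {
    // Hypothetical immutable value type whose hash could be expensive to compute.
    static final class Document {
        private final String[] words;
        private int hash;       // 0 means "not computed yet" in this sketch
        int computations = 0;   // only here to show how often the real work runs

        Document(String... words) {
            this.words = words.clone(); // defensive copy keeps the object immutable
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Document && Arrays.equals(words, ((Document) o).words);
        }

        @Override
        public int hashCode() {
            int h = hash;
            if (h == 0) {
                computations++;
                h = Arrays.hashCode(words); // the potentially expensive part
                hash = h;
            }
            return h;
        }
    }

    public static void main(String[] args) {
        Document doc = new Document("hash", "code", "cache");
        doc.hashCode();
        doc.hashCode();
        System.out.println("hash computed " + doc.computations + " time(s)");
    }
}
```

Caching like this is only safe because the fields that feed the hash never change after construction.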

+3

In Java, hashCode() is a virtual method in any case, so there is no performance loss merely because it is overridden and the overriding method is used.

The real difference may lie in the implementation of the method. By default, hashCode() works like this (source):

As far as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java™ programming language.)

Thus, as long as your implementation is as simple as this, there will be no performance loss. However, if you perform complex computations based on many fields and call many other functions, you will notice a loss of performance, but only because your hashCode() does more work.

There is also the problem of inefficient hashCode() implementations. For example, if your hashCode() just returns 1, then using a HashMap or HashSet will be significantly slower than with a proper implementation. There is a great question on SO that covers implementing hashCode() and equals(): What issues should be considered when overriding equals and hashCode in Java?

One more note: remember that whenever you implement hashCode(), you should also implement equals(). Moreover, do this with care, because an incorrect hashCode() can break equality-based lookups in various collections.
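To make the equals()/hashCode() pairing concrete, here is a small sketch with two hypothetical classes: BrokenPoint overrides only equals(), while Point overrides both. Objects.hash from the JDK is used here purely for brevity; it is one reasonable choice, not the only one.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class EqualsHashCodeDemo {
    // Hypothetical class that overrides equals() but forgets hashCode().
    static final class BrokenPoint {
        final int x, y;

        BrokenPoint(int x, int y) { this.x = x; this.y = y; }

        @Override
        public boolean equals(Object o) {
            return o instanceof BrokenPoint
                    && ((BrokenPoint) o).x == x && ((BrokenPoint) o).y == y;
        }
        // hashCode() is deliberately NOT overridden: equal points get
        // identity-based hash codes, which will usually differ.
    }

    // Hypothetical class that keeps equals() and hashCode() consistent.
    static final class Point {
        final int x, y;

        Point(int x, int y) { this.x = x; this.y = y; }

        @Override
        public boolean equals(Object o) {
            return o instanceof Point && ((Point) o).x == x && ((Point) o).y == y;
        }

        @Override
        public int hashCode() {
            return Objects.hash(x, y); // built from the same fields equals() compares
        }
    }

    public static void main(String[] args) {
        Set<Point> points = new HashSet<>();
        points.add(new Point(1, 2));
        System.out.println("Point found:       " + points.contains(new Point(1, 2)));

        Set<BrokenPoint> broken = new HashSet<>();
        broken.add(new BrokenPoint(1, 2));
        // Typically false: the equal object is looked up in the wrong bucket.
        System.out.println("BrokenPoint found: " + broken.contains(new BrokenPoint(1, 2)));
    }
}
```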

+3

Overriding hashCode() in a class does not by itself cause any performance problems. However, when an instance of such a class is inserted into a HashMap, a HashSet, or an equivalent data structure, hashCode() and, if necessary, equals() are called to identify the correct bucket for the element. The same applies to retrieval, search, and deletion.

As pointed out by others, performance depends entirely on how hashCode() is implemented. If the default identity-based equality is sufficient for the class, it is not necessary to override equals() and hashCode(), but if equals() is overridden, hashCode() must be overridden as well.

+2

As all the previous answers say, the hash code is used for hashing in collections, or it can be used as a quick negative check in equals (different hash codes mean the objects cannot be equal). So yes, it can slow down your application. Obviously, there are more use cases.

First of all, I would say that the approach (including whether to override it at all) depends on the type of objects you are talking about.

  • The default implementation of the hash code is as fast as it gets, since it is based on the identity of each instance. This is sufficient for many cases.
  • It is not good enough if you want to use a HashSet and have it refuse to store two objects that are the same. The crux is the word "same".

“Same” may mean “same instance”. “Same” can mean an object with the same (base) identifier, when your object is an entity, or “same” can mean an object with all properties equal, when it is a value object. The last case is the one most likely to affect performance.

One of the properties can itself be an object that evaluates hashCode() on demand, so calling hashCode() on the root object may end up evaluating the hash codes of an entire object tree every time.

So what would I recommend? You need to identify and clarify what you want to do. Do you really need to distinguish between different instances of objects, is the identifier what matters, or is it a value object?

It also depends on immutability. You can calculate the hash code once, when the object is constructed, from all the constructor properties (which have only getters), and return that stored value whenever hashCode() is called. Another option is to recompute the hash code whenever any property changes. You need to decide whether the values are mostly read or mostly written.

The last thing I would say is to override the hashCode() method only when you know that you need it, and when you know what you are doing.

+1

Overriding the hashCode() method will not by itself degrade application performance. It will improve performance if the right data structure is used in the right place.

For example, a correct implementation of hashCode() in your class can turn an O(N) search into a nearly O(1) lookup in a HashMap, provided you are not performing too complex an operation in the hashCode() method.

0

The main purpose of the hashCode method is to allow an object to be a key in a hash map or a member of a hash set. In this case, the object must also implement the equals(Object) method, which must be consistent with the hashCode implementation:

 If a.equals(b) then a.hashCode() == b.hashCode() 

If hashCode() is called twice on the same object, it must return the same result, provided that the object has not been modified in between.

hashCode in terms of performance

  • In terms of performance, the main goal of implementing the hashCode method is to minimize the number of objects sharing the same hash code.
  • All hash-based JDK collections store their values in an array.
  • The hash code is used to calculate the starting position of the search in this array. After that, equals is used to compare the given value with the values stored in the internal array. Thus, if all values have different hash codes, the probability of collisions is minimized.
  • On the other hand, if all values have the same hash code, the hash map (or set) degrades into a list, where each operation is O(n) and inserting n elements costs O(n²) overall.
  • Starting with Java 8, collisions do not hurt performance as much as in earlier versions: once a bucket grows past a threshold, its linked list is replaced by a balanced binary tree, which gives O(log n) performance in the worst case compared to O(n) for the linked list.
  • Never write a hashCode method that returns a constant.
  • The distribution of the results of String.hashCode is almost perfect, so you can sometimes replace strings with their hash codes.
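How a hash code becomes a starting position in the array can be sketched as follows. This is a simplified illustration modeled on the bit-spreading and masking used by OpenJDK's HashMap (Java 8 and later); the class and method names are made up for this example, and other collections or JDK versions may do it differently.

```java
class BucketIndexDemo {
    // Folds the high 16 bits into the low 16 bits so that they also
    // influence the bucket choice in small tables.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Maps a key to a slot; tableLength is assumed to be a power of two,
    // which is what makes the bit mask equivalent to a modulo.
    static int bucketIndex(Object key, int tableLength) {
        int h = (key == null) ? 0 : spread(key.hashCode());
        return h & (tableLength - 1);
    }

    public static void main(String[] args) {
        // "a" has hash code 97 and Integer 17 has hash code 17.
        // Different hash codes, yet both map to slot 1 of a 16-slot table.
        System.out.println(bucketIndex("a", 16));
        System.out.println(bucketIndex(17, 16));
    }
}
```

The example also shows that distinct hash codes only reduce collisions; two keys can still share a bucket when the table is small, which is why equals() is still needed.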

The next step is to check how many identifiers with non-unique hash codes you still have. If there are too many, improve the hashCode method or widen the range of hash code values it can produce. Ideally, all your identifiers will have unique hash codes.
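One simple way to run such a check is to compare the number of distinct identifiers with the number of distinct hash codes they produce. The helper below is a hypothetical sketch, not part of any library.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

class HashCollisionCounter {
    // Returns how many distinct items "lose" their own hash code because
    // another distinct item already produced the same value.
    static int countCollisions(Collection<?> items) {
        Set<Object> distinctItems = new HashSet<>(items);
        Set<Integer> distinctHashes = new HashSet<>();
        for (Object item : distinctItems) {
            distinctHashes.add(item.hashCode());
        }
        return distinctItems.size() - distinctHashes.size();
    }

    public static void main(String[] args) {
        // "Aa" and "BB" are a well-known pair of strings with the same hash code.
        System.out.println(countCollisions(Arrays.asList("Aa", "BB", "C")));
        System.out.println(countCollisions(Arrays.asList("a", "b", "c")));
    }
}
```

Running it over a representative sample of your real identifiers gives a quick indication of whether the hashCode method needs more work.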

0