Why does a higher load factor in HashMap increase the cost of the search?

From the HashMap JavaDoc:

As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the lookup cost (reflected in most of the operations of the HashMap class, including get and put).

If we have a higher value, why does it increase the cost of the search?

+7
5 answers

The load factor of a hash table is defined as

n / s, the ratio of the number of stored entries n to the size s of the table's array of buckets.

Hash table performance is maintained when the number of collisions is low. When the load factor is high, the probability of collisions increases.
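
As a toy illustration of that formula (a minimal sketch, not HashMap's real internals; the class and variable names are made up for the example), the load factor is just the entry count divided by the bucket count, and every extra entry that lands in an already-occupied bucket lengthens the chain a lookup has to scan:

    import java.util.ArrayList;
    import java.util.List;

    public class LoadFactorDemo {
        public static void main(String[] args) {
            int s = 16;                                        // number of buckets
            List<List<String>> buckets = new ArrayList<>();
            for (int i = 0; i < s; i++) {
                buckets.add(new ArrayList<>());
            }

            String[] keys = {"a", "b", "c", "d", "e", "f",
                             "g", "h", "i", "j", "k", "l"};    // n = 12 entries
            for (String key : keys) {
                int index = Math.floorMod(key.hashCode(), s);  // choose a bucket
                buckets.get(index).add(key);                   // collisions chain up here
            }

            double loadFactor = (double) keys.length / s;      // n / s
            System.out.println("load factor = " + loadFactor); // 12 / 16 = 0.75
        }
    }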

+6

This is due to the way the hash table is implemented under the hood: it uses hash codes, and since no hash code algorithm is perfect, some collisions are unavoidable. Increasing the load factor increases the likelihood of collisions and therefore reduces search performance.
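
A quick way to see this effect (a hypothetical key class written only for illustration; its constant hashCode() is deliberately bad):

    import java.util.HashMap;
    import java.util.Map;

    class BadKey {
        private final int id;
        BadKey(int id) { this.id = id; }

        @Override
        public int hashCode() { return 42; }   // every key collides on purpose

        @Override
        public boolean equals(Object o) {
            return (o instanceof BadKey) && ((BadKey) o).id == this.id;
        }
    }

    public class CollisionDemo {
        public static void main(String[] args) {
            Map<BadKey, Integer> map = new HashMap<>();
            for (int i = 0; i < 1000; i++) {
                map.put(new BadKey(i), i);     // all 1000 entries share one bucket
            }
            // get() still works, but it has to compare keys inside that single
            // bucket (a linked list, or a tree in newer JDKs) instead of jumping
            // straight to the right entry.
            System.out.println(map.get(new BadKey(500)));
        }
    }

The same kind of degradation happens more gradually when a good hashCode() is combined with a very high load factor: the buckets simply get fuller.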

+2

Here we must first understand what capacity and load factor mean:

capacity: the number of buckets in the hash table at any given time.

load factor: a measure of how full the hash table is allowed to get before its capacity is automatically increased.

So the higher the load factor, the fuller the hash table is allowed to get before its capacity is increased.

  • With the best possible hashCode() implementation, each bucket holds at most one value, and the cost of a search is minimal.
  • In the worst case, all values end up in the same bucket, and the cost of a search is maximal.
  • On average, it depends on the hashCode() implementation, but another factor at play is the load factor: the fuller the collection, the greater the chance of a collision, so a higher load factor increases the cost of a search in any non-ideal scenario (a rough estimate is sketched below).
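
For the average case, a back-of-the-envelope estimate (the 1 + α/2 figure is the textbook result for separate chaining under a uniform-hashing assumption, not something HashMap guarantees) shows how the expected work per successful lookup grows with the load factor α:

    public class AverageCostSketch {
        // Expected entries examined in a successful lookup with separate
        // chaining, assuming uniform hashing: roughly 1 + alpha / 2.
        static double expectedProbes(double alpha) {
            return 1 + alpha / 2;
        }

        public static void main(String[] args) {
            System.out.println(expectedProbes(0.75)); // ~1.4 comparisons per get()
            System.out.println(expectedProbes(4.0));  // ~3.0 comparisons per get()
        }
    }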
+2

The default load factor is 0.75.

If you declare the load factor as 1, the restructuring (rehashing) process only happens when the number of elements exceeds the capacity of the HashMap.
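
For example (a minimal sketch using the standard HashMap(int initialCapacity, float loadFactor) constructor; the stated resize points assume the usual threshold = capacity * loadFactor behaviour):

    import java.util.HashMap;
    import java.util.Map;

    public class LoadFactorOne {
        public static void main(String[] args) {
            // Defaults: capacity 16, load factor 0.75
            // -> the table is resized once more than 12 entries are present.
            Map<String, Integer> defaults = new HashMap<>();

            // Explicit load factor 1.0 with capacity 16
            // -> the table is resized only once more than 16 entries are present.
            Map<String, Integer> packed = new HashMap<>(16, 1.0f);

            // Fewer resizes and less wasted space, but the buckets run fuller,
            // so individual get()/put() calls tend to do more key comparisons.
            defaults.put("a", 1);
            packed.put("a", 1);
        }
    }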
0

The load factor of 0.75 can be interpreted using the formula n / s, the ratio of the number of stored entries n to the size s of the table's array of buckets:

Suppose you have 75 values to store in a hash table and 100 empty buckets to hold them. The probability of collisions stays low, and the load factor is 0.75.

Now suppose you have 75 values to store but only 10 empty buckets (load factor 7.5). Collisions are guaranteed, and whatever collision-resolution method is used will increase the search time.

In the other direction, suppose you have 75 entries and 1000 empty buckets (load factor 0.075). This leaves many empty buckets, which is a big waste of space.

Therefore, the rule of thumb is: as the load factor increases, your search time increases, and as it approaches 0, more and more storage space is wasted.

In other words, it is a space-time trade-off.
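
The three scenarios, written out as n / s (trivial arithmetic, shown only to make the numbers concrete):

    public class LoadFactorScenarios {
        public static void main(String[] args) {
            System.out.println(75 / 100.0);   // 0.75  -> few collisions, little wasted space
            System.out.println(75 / 10.0);    // 7.5   -> guaranteed collisions, slower lookups
            System.out.println(75 / 1000.0);  // 0.075 -> fast lookups, lots of empty buckets
        }
    }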

0
