In HashMap, why is the threshold value (the next size value at which to resize) load factor * capacity, and not the size or capacity of the map?

In HashMap, why is the threshold value (the next size value at which to resize) equal to capacity * load factor? Why can it not be equal to the size or capacity of the map?

For example, the default initial capacity = 16 and load factor = 0.75, and therefore threshold = (capacity * load factor) = (16 * 0.75) = 12.

The map is resized when the 13th element is added. Why is this so? Why did the authors of HashMap decide to keep the threshold at capacity * load factor (which is 12) rather than equal to the capacity (which is 16)?

Why not keep the threshold equal to the capacity, so that rehashing is performed only when the HashMap is full?
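
To make the arithmetic concrete, here is a small sketch (my own illustration, not HashMap's internal code) of the default threshold calculation:

    public class ThresholdDemo {
        public static void main(String[] args) {
            int capacity = 16;        // HashMap's default initial capacity
            float loadFactor = 0.75f; // HashMap's default load factor

            // threshold = capacity * load factor = 12
            int threshold = (int) (capacity * loadFactor);
            System.out.println("threshold = " + threshold);

            // So with default settings the table is resized (doubled to 32)
            // when the 13th entry is added, not when a 17th entry would "fill" it.
            System.out.println("resize happens on insert #" + (threshold + 1));
        }
    }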

+8
java collections
4 answers

Javadoc, Javadoc, Javadoc. That is the first place to look. The HashMap Javadoc says:

As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the lookup cost (reflected in most of the operations of the HashMap class, including get and put). The expected number of entries in the map and its load factor should be taken into account when setting its initial capacity, so as to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operations will ever occur.

As hash table theory goes, if your map is full you are doing something very, very wrong. By that point you are probably at around O(sqrt(N)) per lookup with random data, which is bad. You never want your hash table to become full. But a very sparse map wastes too much space (as you noticed), and iterating over it takes too long. Hence, for most use cases there should be a load factor that is less than 1.

Note: "Empty space" is proportional to the size of the card and inversely proportional to the load factor. However, search time has a more complex expected performance feature. This means that the same load factor will not work for different hash cards of different sizes, as this will mean different trade-offs between the scales.


A general overview of the trade-offs can be found in Knuth, The Art of Computer Programming, Vol. 3.
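
To make the last sentence of the quoted Javadoc concrete, here is a small sketch (expectedEntries is just an example value) of pre-sizing a HashMap so that no rehash ever occurs:

    import java.util.HashMap;
    import java.util.Map;

    public class PresizedMap {
        public static void main(String[] args) {
            int expectedEntries = 1000; // assumption: you know roughly how many entries you will store
            float loadFactor = 0.75f;

            // Per the Javadoc: if the initial capacity is greater than
            // expectedEntries / loadFactor, no rehash will ever occur
            // while those entries are added.
            int initialCapacity = (int) Math.ceil(expectedEntries / loadFactor) + 1;

            Map<Integer, String> map = new HashMap<>(initialCapacity, loadFactor);
            for (int i = 0; i < expectedEntries; i++) {
                map.put(i, "value-" + i); // no resizing during this loop
            }
            System.out.println("size = " + map.size());
        }
    }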

+6

From a theoretical standpoint, the likelihood of avoiding collisions with an almost-full hash table is very low, so the hash table is resized to preserve the desired O(1) lookup property: fewer collisions mean more direct access to entries and less searching within buckets.
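
As a toy illustration of why collisions hurt (BadKey is a made-up key type whose hashCode deliberately always collides):

    import java.util.HashMap;
    import java.util.Map;

    public class CollisionDemo {
        // Hypothetical key whose hashCode is constant, so every entry
        // lands in the same bucket no matter how large the table grows.
        static final class BadKey {
            final int id;
            BadKey(int id) { this.id = id; }
            @Override public int hashCode() { return 42; }
            @Override public boolean equals(Object o) {
                return o instanceof BadKey && ((BadKey) o).id == id;
            }
        }

        public static void main(String[] args) {
            Map<BadKey, Integer> colliding = new HashMap<>();
            for (int i = 0; i < 10_000; i++) {
                colliding.put(new BadKey(i), i);
            }
            // Every get() must search within the single overfull bucket,
            // comparing keys with equals(); with well-spread hash codes the
            // lookup would land almost directly on the right entry.
            System.out.println(colliding.get(new BadKey(9_999)));
        }
    }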

+4

The HashMap implementation lets you set the load factor. This design decision gives the user of the class some measure of control over the conditions under which the underlying data structure is resized.

The default load factor of 0.75 was probably chosen as a reasonable balance between memory usage and map performance (as determined by the collision rate and the resizing overhead).

For any given HashMap instance, you can choose a load factor appropriate to your situation. You need to weigh the relative importance of a compact memory footprint, how performance-sensitive your lookups are, and how performance-sensitive your puts are (a put that triggers a rebuild of the map can be very slow).

As an aside, your notion of a "full" HashMap is a bit off. The implementation handles an arbitrary number of collisions just fine (though there is a performance cost to collisions). You could use a HashMap with a load factor of 1 billion, and its capacity would (probably) never grow beyond 16.

There is no problem with setting the load factor to 1.0, which with the default capacity results in a rehash when the 17th element is added. Compared to 0.75, you will use a little less space and do fewer rehashes, but get a few more collisions (and therefore more searching with equals() through a bucket's linked list of entries).
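
As a rough sketch of these trade-offs (the values here are arbitrary), the two-argument constructor is where you pick the load factor:

    import java.util.HashMap;
    import java.util.Map;

    public class LoadFactorChoices {
        public static void main(String[] args) {
            // Lower load factor: more memory, fewer collisions, faster lookups.
            Map<String, String> lookupHeavy = new HashMap<>(16, 0.5f);

            // Load factor 1.0: less memory, fewer resizes, more collisions;
            // with capacity 16 the first resize happens on the 17th insert.
            Map<String, String> memoryTight = new HashMap<>(16, 1.0f);

            lookupHeavy.put("k", "v");
            memoryTight.put("k", "v");
            System.out.println(lookupHeavy.size() + " " + memoryTight.size());
        }
    }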

+1

In Java 8, when the number of entries in a bucket exceeds a threshold, the bucket's contents are converted from a linked list of entries into a balanced tree, which improves worst-case lookups in that bucket from O(n) to O(log n). This is one of the Java 8 features worth remembering.
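
A rough sketch of that behavior (CollidingKey is a made-up key whose hashCode always collides; in the OpenJDK implementation the treeify threshold is 8 entries per bucket, once the table is large enough):

    import java.util.HashMap;
    import java.util.Map;

    public class TreeBinDemo {
        // Hypothetical key with a constant hashCode, so all instances collide.
        // Implementing Comparable lets Java 8's tree bins order the keys.
        static final class CollidingKey implements Comparable<CollidingKey> {
            final int id;
            CollidingKey(int id) { this.id = id; }
            @Override public int hashCode() { return 1; }
            @Override public boolean equals(Object o) {
                return o instanceof CollidingKey && ((CollidingKey) o).id == id;
            }
            @Override public int compareTo(CollidingKey other) {
                return Integer.compare(id, other.id);
            }
        }

        public static void main(String[] args) {
            Map<CollidingKey, Integer> map = new HashMap<>();
            for (int i = 0; i < 100_000; i++) {
                map.put(new CollidingKey(i), i);
            }
            // In Java 8+, this overfull bucket has been converted to a balanced
            // (red-black) tree, so the lookup costs O(log n) comparisons instead
            // of walking a 100,000-node linked list.
            System.out.println(map.get(new CollidingKey(99_999)));
        }
    }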

+1
