What does it mean that a hash table is open in Java?

I read the Java api docs in the Hashtable class and came across a few questions. The document says: " Please note that the hash table is open: in the case of a" hash collision ", several records are stored in one bucket, which must be searched sequentially. " I tried the following code myself

Hashtable<String, Integer> me = new Hashtable<String, Integer>(); me.put("one", new Integer(1)); me.put("two", new Integer(2)); me.put("two", new Integer(3)); System.out.println(me.get("one")); System.out.println(me.get("two")); 

out out was

 1 3 
  • Is that what open means?
  • What happened to Integer 2? garbage collected?
  • Is there a "closed" example?
+6
java hashtable hash-collision
source share
6 answers

No, this is not what is meant by "open."

Note the difference between a key collision and a hash collision.

More than one entry with the same key will not be allowed in the Hashtable (as in your example, you put two entries with the "two" key, the second (3) replaces the first (2), and you were left with only the second in the Hashtable).

Hash collision is when two different keys have the same hash code (as returned by the hashCode () method). Different hash table implementations can relate to this differently, mainly from a low-level implementation point of view. Being “open”, the Hashtable will store a linked list of records whose key hashes have the same value. This can lead in the worst case to O (N) performance for simple operations, which would normally be O (1) in a hash map, where the hashes were mostly different values.

+11
source share

This means that two elements with different keys that have the same hash code fall into the same bucket .

In your case, the two keys are the same, so the second position overwrites the first.

But assuming you have your own class

 class Thingy { private final String name; public Thingy(String name) { this.name = name; } public boolean equals(Object o) { ... } public int hashcode() { //not the worlds best idea return 1; } } 

And created several instances. i.e.

 Thingy a = new Thingy("a"); Thingy b = new Thingy("b"); Thingy c = new Thingy("c"); 

And inserted them into the card. Then one bucket, that is, a bucket containing material with a hash code of 1, will contain a list (chain) of three elements.

 Map<Thingy, Thingy> map = new HashMap<Thingy, Thingy>(); map.put(a, a); map.put(b, b); map.put(c, c); 

Thus, retrieving an item using any Thingy key will search for the hash code O (1), followed by a linear search for O (n) in the list of items in the bucket with hash code 1.

Also, be careful to make sure that you respect the correct relationship when implementing hashcode and equals. Namely, if two objects are equal, then they must have the same hascode, but not necessarily otherwise, since several keys can receive the same hash code.

Oh, and for complete definitions of open hashing and closed hash tables, see here http://www.c2.com/cgi/wiki?HashTable

+3
source share

This means that the Hashtable uses open hashing (also known as a separate chain) to deal with hash collisions. If two separate keys have the same hash code, both of them will be stored in the same bucket (in the list).

+2
source share

Open means that if two keys are not equal but have the same hash value, then they will be stored in the same bucket. In this case, you can think of each bucket as a linked list, so if many things are stored in the same bucket, search performance will decrease.

Bucket 0: Nothing Found Bucket 1: Point 1
Bucket 2: Item 2 → Item 3
Bucket 3: Nothing Found Bucket 4: Number 4

In this case, if you are looking for a key containing hashes in bucket 2, you need to search O (n) in the list to find the key equal to what you are looking for. If key hashing in Bucket is 0, 1, 3 or 4, you get O (1) search performance.

+2
source share

A hash is a computed function that maps a single object ("one" or "two" in your example) to (in this case) an integer. This means that there can be several values ​​that map to the same integer (an integer has a finite number of valid values, while there can be an infinite number of inputs). In this case, the "equal" should be able to talk about all these two. So your sample code is correct, but there may be another key that has the same hash code (and will be placed in the same bucket as the “two”)

+1
source share

Warning : there are conflicting definitions of “open hashing” in general use:

A quote from http://www.c2.com/cgi/wiki?HashTable is quoted in another answer:

Note: some people use the term “open hashing” to mean what I have here called “closed hash”! the usage here matches that in TheArtOfComputerProgramming and Introduction. Algorithms as recommended if you want to know more about the hash of the table.

For example, the page above defines "open hashing" as follows:

There are two main strategies. open hashing, also called open addressing, says: when you need an entry in the table for a new key / value pair is already taken, find another unused entry somehow and put it there. Closed hashing says: each record in the table is a secondary data structure (usually a linked list, but there are other possibilities) containing actual data, and this data structure can be without restrictions.

Unlike the definition provided by Wikipedia :

In a strategy known as a single chain, direct connection, or just chain, each bucket slot in an array is a pointer to a linked list that contains key-value pairs that are hashed in the same place. Look requires a list scan for a record with a given key. insertion requires adding a new record entry at either end of the list to the hashed slot. To delete, you need to search the list and delete the item. (This technique is also called open hashing or private addressing, which should not be confused with "open addressing" or "private" hashing.)

If the so-called "experts" cannot agree with what the term "open hashing" means, it is best to avoid using it.

0
source share

All Articles