What is the point of the hash table?

I have no experience working with hash tables outside arrays / dictionaries in dynamic languages, so I recently found out that they are internally implemented by creating a key hash and using this to store the value. I don’t understand why the values ​​stored with the key (string, number, whatever) are like a key, not a key, instead of doing a hash and storing it.

+11
hashtable
Feb 01 '10 at 20:49
source share
11 answers

This is the closest duplicate: Why do we use a hash code in a hash table instead of an index?

In short, you can check whether the key is already saved VERY quickly, and just as quickly save the new mapping. Otherwise, you will have to store a sorted list of keys, which is much slower to store and retrieve the mappings.

+8
Feb 01 '10 at 21:00
source share

I don’t understand why the values ​​stored with the key (string, number, all) are not the key

And how do you implement this?

Computers know only numbers. A hash table is a table, i.e. An array, and when we get to it, the array can only be processed through an integral non-negative index. Everything else is a hoax. Dynamic languages ​​that allow the use of string keys - they use trickery.

And one such deception, and often the most elegant one, simply calculates the numeric, reproducible number of the "hash" of the key and uses it as an index.

(There are other considerations, such as key range compression, but this is the main problem.)

+7
Feb 01 2018-10-01T00
source share

what is a hash table ?

It is also known as a hash map, a data structure used to implement an associative array. This is a structure that can map keys to values.

How it works?

The hash table uses a hash function to calculate the index into an array of buckets or slots from which to find the correct value.

See the diagram below for a clear explanation.

enter image description here

<strong> Benefits:

In a well-distributed hash table, the average cost of each search is independent of the number of items stored in the table.

Many hash table schemes also allow arbitrary insertion and deletion of key-value pairs.

In many situations, hash tables are more efficient than search trees or any other table search structure.

Disadvantages:

Hash tables are invalid when the number of records is very small. (However, in some cases, the high cost of computing a hash function can be reduced by storing the hash value with the key.)

Application:

They are widely used in many types of computer software, especially for associative arrays, indexing databases, caches, and collections.

+5
Aug 24 '13 at 10:21
source share

The idea of ​​a hash table is to provide direct access to its elements. That is why it calculates the "hash code" of the key and uses it to store the element embedded in the key itself.

The idea is to have only one hash code for each key. Many times, the hash function that generates the hash code is to split the prime and use its remainder as a hash code.

For example, suppose you have a table with 13 positions and an integer as a key, so you can use the following hash function

f (x) = x% 13

+2
Feb 01 2018-10-01T00
source share

Typically, a hash table point is to store some sparse value - i.e. There are a large number of keys and a small number of things to store. Think of the strings. There are countless possible lines. If you save the variable names used in the program, then there are a relatively small number of possible lines that you actually use, even if you do not know in advance what it is.

+1
Feb 01 '10 at 20:54
source share

I don’t understand why the values ​​stored with the key (string, number, whatever), like, well, the key, instead of having to hash this and save it.

Well, how do you suggest doing this using O (1) search?

The hashtables point basically provides O (1) searches by turning the key into the index of the array, and then returns the contents of the array at that index. To make this possible for any keys you need

  • A way to turn a key into an array index (this is a hash target)
  • A way to deal with collisions (keys that have the same hash code)
  • A way to adjust the size of an array when it is too small (causes too many collisions) or too large (losing space)
+1
Feb 01 '10 at 21:05
source share

In a nutshell: Hashing allows O (1) to query / insert / delete a table. OTOH, a sorted structure (usually implemented as a balanced BST) does the same operations as O (logn).

Why take a hash, you ask? How do you suggest storing the key "as a key"? Ask yourself, if you plan to store pairs (key, values), how fast will your searches / insertions / deletions be performed? Will you run O (n) loop through the whole array / list?

The whole point of having a hash value is that it allows you to convert all keys to a finite set of hash values. This allows us to store keys in slots of the final array (allowing quick operations - instead of searching the entire list, you look only for keys that have the same hash value), although the set of possible keys can be extremely large or infinite (for example, keys can be strings, very large numbers, etc.). With a good hash function, very few keys will have the same hash value, and all operations will be effectively O (1).

This probably doesn't make much sense if you are not familiar with hashing and how hash tables work. The best thing in this case is to consult the relevant chapter of the book with good algorithms / data structures (I recommend CLRS).

+1
Feb 01 '10 at 21:15
source share

In some cases, it is possible that the key is very long or large, which makes it impossible to save copies of these keys. Hashing them first of all allows you to reduce memory consumption, as well as speed up search time.

0
Feb 01 '10 at 20:50
source share

A hash table is used to store a set of values ​​and their keys in (for a while) a constant number of spots. In the simple case, suppose you want to store every integer from 0 to 10000 using the hash function i% 10.

This will make a hash table of 1000 blocks (often an array), each of which has a list of 10 elements. Therefore, if you want to find 1234, he will immediately know what to look for in the table entry for 123, and then start comparing to find the exact match. Of course, this is not much better than just using an array of 10,000 elements, but it's just for demonstration.

Hashtables are very useful when you do not know exactly how many elements you will have, but there will be much less collisions in the hash function than the total number of elements. (Which makes the hash function hash (x) = 0 "is very, very bad.) You might have empty spaces in your table, but ideally most of them will have some data.

0
Feb 01 '10 at 21:00
source share

Also consider speed. If your key is a string and your values ​​are stored in an array, your hash can access any element in a "near" constant time. Compare this to finding a string and its meaning.

0
Feb 01 '10 at 21:01
source share

The main advantage of using a hash to search for items in a table, as opposed to using the original key of a key-value pair (which BTW is usually stored in the table, since the hash is not reversible), is that ..

... it allows you to map the entire namespace [of the original] with a relatively small namespace of the hash values, allowing the hash table to provide O (1) performance for extracting elements.

This O (1) performance gets a little blurry when considering additional time to deal with collisions, etc., but in general, the hash table is very quickly used to store and retrieve elements, unlike a system based solely on [original], which would normally be O (log N), for example, a binary tree (although such a tree is more efficient, spatial)

0
Feb 01 2018-10-01T00
source share



All Articles