Hash table implementation

I am trying to create an efficient lookup table in C.

I have an integer as a key and a variable length char* as a value.

I looked through uthash , but this requires a fixed length char* value. If I make this a large number, then I use too much memory.

 struct my_struct { int key; char value[10]; UT_hash_handle hh; }; 

Does anyone have any pointers? Any insight was greatly appreciated.


Thanks to everyone for the answers. I set off with uthash and defined my own structure for hosting my data.

+8
c hashtable
source share
3 answers

Declare the value field as void *value .

Thus, you can have any data type as a value, but the responsibility for its distribution and release will be delegated to the client code.

+5
source share

First you need to think about your collision strategy:

  • Will you have several hash functions?
  • Or do you have to use containers inside a hash table?

We will choose 1.

Then you need to select a well-distributed hash function. For example, we will choose

 int hash_fun(int key, int try, int max) { return (key + try) % max; } 

If you need something better, maybe look at the middle square method.

Then you have to decide what the hash table is.

 struct hash_table { int max; int number_of_elements; struct my_struct **elements; }; 

Then we need to determine how to insert and extract.

 int hash_insert(struct my_struct *data, struct hash_table *hash_table) { int try, hash; if(hash_table->number_of_elements >= hash_table->max) { return 0; // FULL } for(try = 0; true; try++) { hash = hash_fun(data->key, try, hash_table->max); if(hash_table->elements[hash] == 0) { // empty cell hash_table->elements[hash] = data; hash_table->number_of_elements++; return 1; } } return 0; } struct my_struct *hash_retrieve(int key, struct hash_table *hash_table) { int try, hash; for(try = 0; true; try++) { hash = hash_fun(key, try, hash_table->max); if(hash_table->elements[hash] == 0) { return 0; // Nothing found } if(hash_table->elements[hash]->key == key) { return hash_table->elements[hash]; } } return 0; } 

And the smallest removal method:

 int hash_delete(int key, struct hash_table *hash_table) { int try, hash; for(try = 0; true; try++) { hash = hash_fun(key, try, hash_table->max); if(hash_table->elements[hash] == 0) { return 0; // Nothing found } if(hash_table->elements[hash]->key == key) { hash_table->number_of_elements--; hash_table->elements[hash] = 0; return 1; // Success } } return 0; } 
+15
source share

It really depends on the distribution of your key field. For example, if this unique value is always between 0 and 255 inclusive, simply use key % 256 to select the bucket and you have the perfect hash.

If it is evenly distributed over all possible int values, any function that gives you a uniformly distributed hash value will do (for example, the above key % 256 ), albeit with multiple values ​​in each bucket.

Without knowing the distribution, it is a little difficult to talk about efficient hashes.

+5
source share

All Articles