Dictionary data structure that uses a field of the value as the key

Data records often need to be indexed by a unique key. It usually looks something like this (I use C# because it is the language I like most, but the question is not specific to it):

    public class NamedRecord
    {
        public readonly string UniqueImmutableName;
        ...
    }

    public class UsesUsualDict
    {
        Dictionary<string, NamedRecord> myDict = new Dictionary<string, NamedRecord>();

        void AddRecord(NamedRecord _NewRecord)
        {
            myDict[_NewRecord.UniqueImmutableName] = _NewRecord;
        }

        NamedRecord GetRecord(string _Key)
        {
            return myDict[_Key];
        }
    }

However, this seems a bit redundant: the keys in this dictionary should always equal NamedRecord.UniqueImmutableName, but the developer ends up maintaining that relationship by hand. In addition, I simply do not like this kind of data duplication.

Sometimes I see the opposite solution: the data record does not have UniqueImmutableName as a member at all. For example, in his textbook on the D language, Andrei Alexandrescu uses a data structure that holds word statistics for a character in a play:

    struct PersonaData
    {
        uint totalWordsSpoken;
        uint[string] wordCount;
    }

But the name of the character is not in it at all. It is available only as the key of the dictionary in which these structures are stored. Outside that context, the data structure is almost useless.

I want to do something like this:

    public class UsesLambdaDict
    {
        // An explicitly typed lambda parameter must be parenthesized in C#.
        LambdaDictionary<string, NamedRecord> myDict =
            new LambdaDictionary<string, NamedRecord>((NamedRecord _Record) => _Record.UniqueImmutableName);

        void AddRecord(NamedRecord _NewRecord)
        {
            myDict.Add(_NewRecord);
        }

        NamedRecord GetRecord(string _Key)
        {
            return myDict[_Key];
        }
    }

It seems to me that this way of handling such records is better, because the connection between the UniqueImmutableName member of NamedRecord and the dictionaries that store it is established at compile time. The drawback I see is that there is no way to guarantee that this lambda will be a pure function, at least in C#. I don't know D that well, but it seems that even the pure keyword cannot guarantee this.
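
For concreteness, here is a minimal sketch of what such a LambdaDictionary could look like, assuming it simply wraps a regular Dictionary together with a key-selector delegate. LambdaDictionary is a hypothetical type, not part of .NET, and the set of members shown is only an assumption about what would be needed:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: LambdaDictionary is not a .NET type.
// It wraps a Dictionary and derives every key from the value itself.
public class LambdaDictionary<TKey, TValue>
{
    private readonly Dictionary<TKey, TValue> inner = new Dictionary<TKey, TValue>();
    private readonly Func<TValue, TKey> keySelector;

    public LambdaDictionary(Func<TValue, TKey> keySelector)
    {
        if (keySelector == null) throw new ArgumentNullException(nameof(keySelector));
        this.keySelector = keySelector;
    }

    // The caller never supplies a key, so key and value cannot get out of sync
    // (as long as the selector is pure and the key field is immutable).
    public void Add(TValue value)
    {
        inner.Add(keySelector(value), value);
    }

    public TValue this[TKey key]
    {
        get { return inner[key]; }
    }

    public bool TryGetValue(TKey key, out TValue value)
    {
        return inner.TryGetValue(key, out value);
    }

    public int Count
    {
        get { return inner.Count; }
    }
}
```

Note that Add in this sketch throws on a duplicate key, whereas the indexer assignment in the first example silently overwrites the existing record. The wrapper still stores the key internally, so it addresses the consistency concern, not the duplication.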

So, I have a few questions about this:

  • Is this even a real problem? The drawbacks I see in the first solution are somewhat theoretical; maybe there is nothing wrong with it.
  • What are other possible ways to solve it?
  • What are the other possible disadvantages of the proposed solution?
2 answers

1.) I don't think so.

2.) Use a structure that holds both the key and the value; I don't think lambdas are necessary.

3.) Performance problems (cache misses)


1 - The overhead is not high enough to worry about. Since string is an immutable reference type, you are not storing two strings, only an additional reference to the same string (or to whatever other object serves as the key).
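
A small sketch that checks this claim. The Record class and the KeyReferenceDemo helper are hypothetical names made up for the illustration; the string is built at run time so that interning of literals does not influence the result:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the NamedRecord class from the question.
public sealed class Record
{
    public readonly string Name;
    public Record(string name) { Name = name; }
}

public static class KeyReferenceDemo
{
    // Returns true when the key stored in the dictionary is the very same
    // string object as the one held by the record, i.e. no second copy exists.
    public static bool KeySharesStringWithRecord()
    {
        var record = new Record(new string('x', 3)); // runtime-created, not interned
        var dict = new Dictionary<string, Record>();
        dict[record.Name] = record;

        string storedKey = dict.Keys.First();
        return object.ReferenceEquals(storedKey, record.Name);
    }
}
```

KeySharesStringWithRecord() returns true here: the dictionary entry and the record both point at one string object, so the duplication costs one extra reference per entry.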

Of course, you can write a dictionary that does not store the key and the value separately, but only the value (and then looks up the key inside that object). However, at best you save (PtrSize (4 or 8 bytes) * array size), which is usually pretty small. In addition, you pay for an extra indirection: you have to go to the stored object to find the key value (which may live elsewhere in memory), and that can lead to a cache miss.
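
If the goal is mainly an API where the key is taken from the stored object, .NET already ships something close: KeyedCollection<TKey, TItem> in System.Collections.ObjectModel. The sketch below is not the space-saving dictionary described above (as far as I know, KeyedCollection may keep its own internal lookup table, so treat it as a convenience, not a memory optimization). NamedItem and NamedItemCollection are hypothetical names used only for this example:

```csharp
using System;
using System.Collections.ObjectModel;

// Hypothetical stand-in for the NamedRecord class from the question.
public sealed class NamedItem
{
    public readonly string UniqueImmutableName;
    public NamedItem(string name) { UniqueImmutableName = name; }
}

// The key is extracted from the item itself, so callers never pass it separately.
public sealed class NamedItemCollection : KeyedCollection<string, NamedItem>
{
    protected override string GetKeyForItem(NamedItem item)
    {
        return item.UniqueImmutableName;
    }
}

public static class KeyedCollectionDemo
{
    public static NamedItem AddAndFind(string name)
    {
        var items = new NamedItemCollection();
        items.Add(new NamedItem(name)); // no explicit key argument
        return items[name];             // lookup by the embedded key
    }
}
```

The key must not change while the item is in the collection, which matches the immutable name field in the question.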

If you want to roll your own dictionary to do this, check out this free e-book:

http://www.syncfusion.com/resources/techportal/ebooks/datastructurespart2

It walks through a hash table, so this should be fairly easy. Alternatively, you can simply copy the source code of the project.
