What is the fastest possible option for a read-only, unordered collection of unique strings?

Disclaimer: I realize that the obvious answer to this question is HashSet<string>. It is absurdly fast, it is unordered, and its values are unique.

But I'm curious all the same. HashSet<T> is a mutable class, so it has Add, Remove, etc., and I'm not sure whether the underlying data structure that makes those operations possible gives up some performance on reads. In particular, I'm concerned with Contains.

Basically, I'm wondering which data structures offer the absolute fastest Contains method for objects of type string, whether inside or outside the .NET Framework itself.

I'm interested in all kinds of answers, regardless of their limitations. For example, I can imagine that some structure might be restricted to strings of a certain length, or might be optimized for a particular problem domain (for example, the range of possible input values). If it exists, I want to hear about it.

One last thing: I'm not restricting this to read-only data structures. Obviously, any read-write data structure can be wrapped in a read-only wrapper. The only reason I mentioned "read-only" at all is that I have no requirement for a data structure that supports adding, removing, etc. If it has those features, I won't complain.
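As an illustration of the wrapping idea above, here is a minimal sketch of a read-only wrapper that exposes only Contains. The class name ReadOnlyStringSet and the choice of StringComparer.Ordinal are assumptions made for this example, not something taken from the question.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper: hides Add/Remove and exposes only membership tests.
public sealed class ReadOnlyStringSet
{
    private readonly HashSet<string> _items;

    public ReadOnlyStringSet(IEnumerable<string> items)
    {
        // Copy the input so later changes to the source cannot leak in.
        // StringComparer.Ordinal (assumed here) compares strings by code unit.
        _items = new HashSet<string>(items, StringComparer.Ordinal);
    }

    public int Count => _items.Count;

    public bool Contains(string value) => _items.Contains(value);
}
```

The wrapper adds one method call of indirection and nothing else, so it does not change the cost of the underlying lookup.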


UPDATE

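A Trie* looks like a candidate: HashSet<T>.Contains has to call GetHashCode on some IEqualityComparer<string>, which, as far as I can tell, is O(n)** for strings in .NET. So HashSet<string>.Contains has to examine the whole string whether it returns true or false. With a Trie, only a result of true needs O(n) work; a result of false can be returned sooner.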

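The open question is whether a Trie implementation for .NET can actually beat HashSet<string> at Contains (for example, with an alphabet restricted to "a" through "z").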


** Where "n" is the length of the string.

Tries are well suited to Contains. Looking up a string s in a trie is O(|s|), where |s| is the length of s.
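To make the O(|s|) bound concrete, here is a minimal trie sketch. The class name StringTrie and the per-node Dictionary<char, Node> are assumptions chosen for brevity. It shows the shape of the lookup and makes no claim about beating HashSet<string> in practice.

```csharp
using System.Collections.Generic;

// Hypothetical minimal trie: one node per character, one flag per stored string.
public sealed class StringTrie
{
    private sealed class Node
    {
        public readonly Dictionary<char, Node> Children = new Dictionary<char, Node>();
        public bool IsTerminal; // true if a stored string ends at this node
    }

    private readonly Node _root = new Node();

    public void Add(string s)
    {
        Node node = _root;
        foreach (char c in s)
        {
            if (!node.Children.TryGetValue(c, out Node next))
            {
                next = new Node();
                node.Children.Add(c, next);
            }
            node = next;
        }
        node.IsTerminal = true;
    }

    public bool Contains(string s)
    {
        Node node = _root;
        foreach (char c in s)
        {
            // Early exit: the first missing edge proves s is not stored,
            // so a miss can cost fewer than |s| steps.
            if (!node.Children.TryGetValue(c, out node))
                return false;
        }
        // All |s| characters matched; s is stored only if a string ends here.
        return node.IsTerminal;
    }
}
```

For a small fixed alphabet such as "a" to "z", the dictionary in each node could be swapped for a 26-element array, trading memory for a cheaper step per character.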


A HashSet is hash-based, and a Hashtable lookup is O(1) on average.




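A hash table lookup is O(1), and nothing like O(1/n) is on offer. Two costs are still worth looking at:

  • The hash function. It is O(n) in the length of the string, because String.GetHashCode() reads the whole string.
  • The table itself. Copying it into a fresh set is O(n) in the number of elements (for example, table = new HashSet<string>(table);).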
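The rebuild mentioned in the last bullet can be sketched as follows. The helper name HashSetRebuild.Rebuild is hypothetical, and passing table.Comparer is an assumption made so that the copy keeps the same equality rules as the original set.

```csharp
using System.Collections.Generic;

// Hypothetical helper around the copy constructor of HashSet<string>.
public static class HashSetRebuild
{
    public static HashSet<string> Rebuild(HashSet<string> table)
    {
        // The copy constructor visits every element once, so this is O(n)
        // in the number of elements. Reusing table.Comparer keeps membership
        // semantics (for example, case-insensitivity) unchanged.
        return new HashSet<string>(table, table.Comparer);
    }
}
```

A caller would write table = HashSetRebuild.Rebuild(table); once, after all modifications are done and before the read-heavy phase begins.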

