In an application I'm working on, I need a function like this:
bool IsInList(int iTest) { /* Return whether iTest appears in a set of numbers. */ }
The list of numbers is known when the application loads (though it is not always the same between two instances of the same application) and will not change or be added to for the lifetime of the program. The integers themselves may be large and span a wide range, so a vector<bool> is not efficient. Performance matters because the function sits in a hot spot. I have heard about perfect hashing but could not find any good advice. Any pointers would be helpful. Thank you.
P.S. Ideally the solution would not be a third-party library, because I cannot use them here. Something simple enough to be understood and implemented by hand would be great, if that is possible.
I would suggest using a Bloom filter in combination with a simple std::map.
Unfortunately, the Bloom filter is not part of the standard library, so you have to implement it yourself. However, it turns out to be a fairly simple structure!
A Bloom filter is a data structure specialized for one question: is this element part of a set? It answers that question with an incredibly tight memory requirement, and it is pretty fast.
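As a rough illustration of the combination suggested above, here is a minimal sketch. A Bloom filter can report false positives but never false negatives, so a clear bit proves that a value is absent, and only the "maybe present" case needs an exact lookup. The names `IntSet` and `MixHash`, the defaults of 10 bits per key and 7 hash probes, and the use of `std::set` in place of the `std::map` mentioned above (only keys are needed here) are all assumptions made for this example, not details from the answer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical helper: scrambles x together with a seed (splitmix64-style mixer).
inline std::uint64_t MixHash(std::uint64_t x, std::uint64_t seed) {
    x += seed + 0x9e3779b97f4a7c15ULL;
    x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
    x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
    return x ^ (x >> 31);
}

// Hypothetical class: a Bloom filter in front of an exact ordered set.
class IntSet {
public:
    // bits_per_key and hash_count are tuning assumptions, not measured values.
    explicit IntSet(const std::vector<int>& values,
                    std::size_t bits_per_key = 10,
                    unsigned hash_count = 7)
        : bits_(values.size() * bits_per_key + 64, false),
          hash_count_(hash_count),
          exact_(values.begin(), values.end()) {
        assert(hash_count_ > 0);
        // Set hash_count_ bits for every stored value.
        for (int v : values)
            for (unsigned i = 0; i < hash_count_; ++i)
                bits_[Index(v, i)] = true;
    }

    bool IsInList(int iTest) const {
        // Fast path: any clear bit proves the value was never inserted.
        for (unsigned i = 0; i < hash_count_; ++i)
            if (!bits_[Index(iTest, i)]) return false;
        // All bits set: possibly a false positive, so confirm exactly.
        return exact_.count(iTest) != 0;
    }

private:
    std::size_t Index(int v, unsigned i) const {
        // Converting to uint32_t first keeps negative values well defined.
        return MixHash(static_cast<std::uint32_t>(v), i) % bits_.size();
    }

    std::vector<bool> bits_;   // the Bloom filter bit array
    unsigned hash_count_;      // number of probes per value
    std::set<int> exact_;      // exact fallback for "maybe present" answers
};
```

Construct an `IntSet` once from the list at load time and call `IsInList` in the hot path. The filter only pays off when most queries are for values that are not in the list; if most queries hit, every call still falls through to the exact lookup.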
Bloom Filter -. , . , gcc .
. log (N) . , .
gperf. , . , -.
http://videolectures.net/mit6046jf05_leiserson_lec08/
, 49:38, , . - Dot Product , . - . , -, FAST , SEED . , -.
@54: 30 . . . (!)
std:: map 99,9%. iTest , .
Int - , :
bool hash[UINT_MAX]; // stackoverflow ;)
Wikipedia , , ++.
N N , .. - . , (, 1 20) --, (, 1 4), ( , ) , . - , ( ). () X% , "--"....
, L H . R = H - L + 1. .
H L + 1, , , .
It is possible that a trie, or perhaps a van Emde Boas tree, would be a better bet for building a space-efficient set of integers whose lookup time stays constant with respect to the number of objects in the data structure, assuming that even a std::bitset would be too large.
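To make the trie idea concrete, here is a minimal sketch of a two-level radix trie over 32-bit integers: the high 16 bits select a leaf and the low 16 bits select a bit inside that leaf, so a lookup costs two index operations no matter how many values are stored. The class name `TwoLevelIntSet`, the 16/16 split, and the 32-bit `int` requirement are assumptions made for this example. It is not a van Emde Boas tree.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical class: a fixed-depth (two-level) radix trie of bits.
class TwoLevelIntSet {
    static_assert(sizeof(int) == 4, "this sketch assumes a 32-bit int");
    using Leaf = std::bitset<(1u << 16)>;  // 65536 bits = 8 KiB per leaf

public:
    explicit TwoLevelIntSet(const std::vector<int>& values)
        : top_(std::size_t(1) << 16) {     // 65536 null leaf pointers
        for (int v : values) {
            // Unsigned conversion keeps negative values well defined.
            const std::uint32_t u = static_cast<std::uint32_t>(v);
            std::unique_ptr<Leaf>& leaf = top_[u >> 16];
            if (!leaf) leaf.reset(new Leaf());  // allocate leaves lazily
            leaf->set(u & 0xFFFFu);
        }
    }

    bool IsInList(int iTest) const {
        const std::uint32_t u = static_cast<std::uint32_t>(iTest);
        const std::unique_ptr<Leaf>& leaf = top_[u >> 16];
        // A missing leaf means no stored value shares these high 16 bits.
        return leaf && leaf->test(u & 0xFFFFu);
    }

private:
    std::vector<std::unique_ptr<Leaf>> top_;  // first level, indexed by high bits
};
```

The space cost is the top-level pointer table plus 8 KiB for each distinct high-16-bit prefix that actually occurs, so this works best when the values cluster. If they are scattered evenly over the whole range it can approach the size of a full bitset.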