Quick string search?

I have a line vector and need to check if each element in the vector is present in the given list of 5000 words. Besides the mundane method of two nested loops, is there a faster way to do this in C ++?

+6
source share
4 answers

You should put the list of strings in std :: set . This is a data structure optimized for search. Searching if this element is in the set or not is an operation that is much faster than iterating through all the records.

When you are already using C ++ 11, you can also use std :: unordered_set , which is even faster to search because it is implemented as a hash table.

If it should be for a school / university: Be prepared to explain how these data structures handle faster. When your instructor asks you to explain why you used them, "some guys on the Internet told me" is unlikely to deserve a sticker in the textbook.

+7
source

You can put a list of words in std :: unordered_set . Then for each element in the vector you just need to check if it is in unordered_set in O (1). You would have the expected O (n) complexity (look at the comment to see why this was expected).

+3
source

You can sort the vector, then you can solve it with one “loop” (assuming your dictionary is also sorted), which means that O (n) does not take into account the cost of sorting.

+2
source

So, you have a line vector, with each line containing one or more words, and you have a vector that contains a dictionary, and you have to determine which words in the line vector are also in the dictionary? The line vector is an annoyance since you need to look at every word. I would start by creating a new vector, breaking each line into words and pushing each word into a new vector. Then sort the new vector and run it using the std::unique algorithm to eliminate duplicates. Then sort the dictionary. Then run both ranges with std::set_intersection to write the result.

+2
source

All Articles