There are many strategies for this.
Idea 1
Take the line you are searching and make a copy of each possible substring that starts at some column and runs to the end of the line. Then store each one in an array indexed by the letter it starts with. (If a letter occurs more than once, store only the longer substring, the one starting at its first occurrence.)
So the array looks like this:
a - substr[0] = "astringthatmustbechecked"
b - substr[1] = "bechecked"
c - substr[2] = "checked"
d - substr[3] = "d"
e - substr[4] = "echecked"
f - substr[5] = null // since there is no 'f' in it
... and so forth
Then, for each word in the dictionary, search only the array element indicated by its first letter. This limits the amount of text that has to be searched: you can never find a word starting with, say, “r” anywhere before the first “r” in the line. And some words need no search at all, because their first letter does not occur in the line.
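A minimal Python sketch of Idea 1, under a few assumptions: a dict stands in for the letter-indexed array, and the names build_suffix_index and find_words are made up for illustration.

```python
def build_suffix_index(line):
    """Map each letter to the suffix of `line` starting at its first occurrence."""
    index = {}
    for pos, ch in enumerate(line):
        # setdefault keeps the first (and therefore longest) suffix per letter.
        index.setdefault(ch, line[pos:])
    return index


def find_words(line, dictionary):
    """Return the dictionary words that occur somewhere in `line`."""
    index = build_suffix_index(line)
    found = []
    for word in dictionary:
        if not word:
            continue
        suffix = index.get(word[0])
        # No suffix means the first letter never occurs, so no search is needed.
        if suffix is not None and word in suffix:
            found.append(word)
    return found
```

For example, find_words("astringthatmustbechecked", ["string", "fish"]) only has to look at the suffix stored under "s" and skips "fish" without searching anything.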
Idea 2
Expand on this idea by noting the length of the longest word in the dictionary, and trim from the stored strings any letters that lie farther away than that distance.
So you have this in an array:
a - substr[0] = "astringthatmustbechecked"
But if the longest word in the list is 5 letters, there is no need to store more than:
a - substr[0] = "astri"
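If the letter occurs several times, you must store more letters, enough to cover 5 letters from each occurrence. (Strictly, that applies to "a" as well, since it shows up again in "that".) This one has to keep the whole string, because "e" keeps showing up less than 5 letters apart: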
e - substr[4] = "echecked"
You can refine this further by using, for each letter, the length of the longest word starting with that particular letter when trimming the strings.
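One way to sketch the trimming in Python. This is an interpretation rather than the only possible one: it keeps a window of max_len letters after every occurrence of a letter and merges windows that overlap, so a letter can end up with more than one stored piece. The names trimmed_windows and find_words_trimmed are hypothetical.

```python
def trimmed_windows(line, max_len):
    """Map each letter to the pieces of `line` within max_len of an occurrence."""
    spans = {}
    for pos, ch in enumerate(line):
        end = min(pos + max_len, len(line))
        letter_spans = spans.setdefault(ch, [])
        if letter_spans and pos <= letter_spans[-1][1]:
            # This occurrence falls inside (or right after) the previous
            # window, so extend that window instead of starting a new one.
            letter_spans[-1][1] = end
        else:
            letter_spans.append([pos, end])
    return {ch: [line[s:e] for s, e in v] for ch, v in spans.items()}


def find_words_trimmed(line, dictionary):
    """Search each word only in the trimmed pieces for its first letter."""
    max_len = max((len(w) for w in dictionary), default=0)
    windows = trimmed_windows(line, max_len)
    return [w for w in dictionary
            if w and any(w in piece for piece in windows.get(w[0], []))]
```

With max_len = 5 on the example line, "e" still maps to the single piece "echecked", while "a" gets two short pieces, "astri" and "atmus", because its two occurrences are more than 5 letters apart.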
Idea 3
This has nothing to do with 1 and 2. It's an idea that you could use instead.
You can turn the dictionary into a kind of regular expression stored in a linked data structure. You could also write it out as an actual regular expression and then apply that.
Suppose these are words in a dictionary:
arun bob bill billy body jose
Create a linked structure like this. (It is really a binary tree, represented in a way that lets me explain how to use it.)
a -> r -> u -> n -> *
|
b -> i -> l -> l -> *
|    |              |
|    o -> b -> *    y -> *
|         |
|         d -> y -> *
|
j -> o -> s -> e -> *
The arrows indicate which letter must follow another letter. So "r" must come right after "a", or the match fails.
Lines going down indicate alternatives. You have “a or b or j” as possible first letters, and then “i or o” as possible letters after “b”.
The regular expression looks roughly like: /(arun)|(b(ill(y?)|o(b|dy)))|(jose)/ (although I might have slipped somewhere). This gives the gist of building it as a regular expression.
Once you have built this structure, you apply it to your line, starting from the first column. Try to run the match by checking the alternatives, and if one of them matches, move forward and try the letter after the arrow and its alternatives. If you reach a star (asterisk), you have a match. If you run out of alternatives, including after backtracking, you move on to the next column.
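A rough Python sketch of this matching, assuming nested dicts stand in for the linked structure (in effect a trie). The names END, build_trie, matches_at and find_all are invented for the example, and the words are assumed to contain only letters, so "*" is free to serve as the end marker.

```python
END = "*"  # marks the end of a dictionary word, like the star in the diagram


def build_trie(words):
    """Build nested dicts: each key is a letter, each value the next options."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def matches_at(trie, line, start):
    """Return every dictionary word that begins at column `start` of `line`."""
    found = []
    node = trie
    for pos in range(start, len(line)):
        node = node.get(line[pos])
        if node is None:
            break  # none of the alternatives fits this letter
        if END in node:
            found.append(line[start:pos + 1])
    return found


def find_all(trie, line):
    """Try every starting column in turn and collect (column, word) pairs."""
    return [(col, word)
            for col in range(len(line))
            for word in matches_at(trie, line, col)]
```

Because each set of alternatives is a dict keyed by letter, picking the right alternative is a single lookup, so this version never needs to backtrack within a column; it just moves to the next column when the walk stops.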
This is a lot of work, but sometimes it can be convenient.
Side note. I built one of these some time ago by writing a program that generated code which ran the algorithm directly, instead of having code walk the binary tree data structure.
Think of each set of vertical-bar options as a switch statement on a particular character column, and each arrow as a level of nesting. If there is only one option, you do not need a full switch, just an if.
It was fast character matching and really useful for some reason that eludes me today.
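To give a feel for the side note, here is a hand-written guess at what such generated code might look like for the six sample words. It is only an illustration, not the original generator's output: Python if/elif chains stand in for switch statements, and char_at and match_word are hypothetical names.

```python
def char_at(line, i):
    """Return the character at column i, or "" past the end of the line."""
    return line[i] if i < len(line) else ""


def match_word(line, i=0):
    """Return the longest of arun/bob/bill/billy/body/jose starting at column i."""
    c = char_at(line, i)
    if c == "a":
        # Only one option after "a", so a plain if is enough.
        if line[i + 1:i + 4] == "run":
            return "arun"
    elif c == "b":
        c = char_at(line, i + 1)  # nested "switch" on the next column
        if c == "i":
            if line[i + 2:i + 4] == "ll":
                return "billy" if char_at(line, i + 4) == "y" else "bill"
        elif c == "o":
            c = char_at(line, i + 2)
            if c == "b":
                return "bob"
            if c == "d" and char_at(line, i + 3) == "y":
                return "body"
    elif c == "j":
        if line[i + 1:i + 4] == "ose":
            return "jose"
    return None
```

Calling match_word at each column of a line in turn gives the same column-by-column scan as before, with the dictionary baked into the control flow instead of being stored as data.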