[EDIT: This post is just a complete explanation of the DP algorithm mentioned by Niklas B. in the comments on other posts.]
Testing a specific word in linear time
In a word that can be formed from the names of chemical elements, at least one of the following must be true:
- The last letter is the one-letter name of the chemical element. And the prefix consisting of all letters, with the exception of the last one, is in itself a word that can be formed from the names of chemical elements.
- The last two letters are the two-letter name of the chemical element. The prefix consisting of all letters, with the exception of the last two, is itself a word that can be formed from the names of chemical elements.
This indicates a good DP algorithm, where for a given word w of length k, we calculate for all 1 <= i <= k the largest number of letters (or chemical symbols, the question is ambiguous regarding exactly what we are trying to maximize!) From which we can construct a length prefix -i from w. Let me call it f (i) and say that f (i) = -1 if the prefix of length-iw cannot be created from the names of chemical elements at all.
Let X = 1 to maximize the names of the elements or 2 for the maximum number of letters.
Recursion for DP can be written as:
- one (i) = if (w [i] is the 1-letter name of the element) {f (i-1) + 1} else {-1}
- two (i) = if ("w [i-1] w [i]" is the 2-letter name of the element) {f (i-2) + X} else {-1}
- f (i) = -1 if i <0
- f (0) = 0
- f (i) = max (one (i), two (i)) if i> 0
Then f (k) will be what we want to know: the maximum number (letters / elements) from which w can be formed, or -1 if this is not possible.
The maximum value () means that if only one of the methods for processing the end of the prefix works, this method will be selected (since the other method will have a value of -1, which will always be less compared), and if none of the methods works, we will correctly get f (i) = -1.
Assuming constant testing of whether single characters or pairs of characters are real names of chemical elements (using, for example, arrays or hash tables), we can calculate whether a given word can be represented as a sequence of names of chemical elements in time and space in proportion to its length .
Given all the words
If we maximize the number of letters, then it probably makes sense to process the dictionary in decreasing order of length, since for the first time we come across a word for which f (k) is not -1, we know this should be the longest, and we can stop.
This can be adapted to maximize the number of element names. Although we cannot stop right away because it may be a shorter word, nevertheless it can be formed from more element names (in particular, using more single-character), we still get useful information whenever we find a new one a word with more elements than the previous one: we can update the cutoff threshold with this number of elements and stop as soon as we see a word that has fewer letters than this threshold, since a word with a length of -m can never be composed of more than m element names.
There is another approach that may turn out to be faster: first sorting the dictionary in alphabetical order, we can avoid rearranging f (i) in the prefix that is shared with the previous word. For example. if carton immediately follows cart in the dictionary, then we only need to calculate f (i) in the on part.