I have an array of words that I need to find and replace using a regular expression, and sometimes this array contains thousands of words. I tested and found that factoring out common prefixes is much faster than listing the words individually. That is, ^(where|why)$ runs slower than ^wh(ere|y)$. Obviously the difference is not noticeable in such a short example, but it becomes significant when there are thousands of alternatives and the subject string is long.
So I'm looking for a way to do this automatically, for example to convert string[] { "what", "why", "where", "when", "which" } to wh(at|y|e(re|n)|i(ch))
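To make the requirement concrete: whatever does the conversion must produce a pattern that matches exactly the same strings as the flat alternation. A minimal sketch of that check, written in Python purely for illustration (the helper name same_matches and the extra candidate strings are made up for this example):

```python
import re

words = ["what", "why", "where", "when", "which"]

# Flat alternation: every word listed separately.
flat = re.compile("^(?:" + "|".join(map(re.escape, words)) + ")$")
# Hand-factored form of the same word list.
factored = re.compile(r"^wh(at|y|e(re|n)|i(ch))$")

def same_matches(candidates):
    """Return True if both patterns agree on every candidate string."""
    return all(bool(flat.match(s)) == bool(factored.match(s)) for s in candidates)

# The original words plus a few near misses that neither pattern should accept.
candidates = words + ["who", "whe", "whichever", "wh", ""]
print(same_matches(candidates))  # prints True
```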
Is there an established algorithm that does this? If not, how would you go about it? It seems that it needs to be done recursively, but I can't work out how. I have a method I wrote that works to a limited extent, but it is inelegant, 60 lines long, and uses several nested foreach loops, which will make it a nightmare to maintain. I'm sure there is a much better way; if someone could point me in the right direction, it would be much appreciated.
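For reference, the direction that looks most natural here is a prefix tree (trie): insert every word, then walk the tree recursively and emit a group wherever it branches. Below is a minimal sketch of that idea in Python. It is not the 60-line method mentioned above, and the names build_trie, trie_to_pattern and words_to_regex are made up for illustration. It assumes a regex engine that supports non-capturing groups (?:...) and emits those instead of plain groups.

```python
import re

def build_trie(words):
    """Build a nested-dict trie; the key "" marks the end of a word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = {}  # end-of-word marker
    return root

def trie_to_pattern(node):
    """Recursively turn a trie node into a regex fragment."""
    ends_here = "" in node
    branches = [
        re.escape(ch) + trie_to_pattern(child)
        for ch, child in sorted(node.items())
        if ch != ""
    ]
    if not branches:
        return ""  # leaf: nothing more to match
    if len(branches) == 1 and not ends_here:
        return branches[0]  # single path, no group needed
    group = "(?:" + "|".join(branches) + ")"
    # If a word also ends at this node, the remaining suffixes are optional.
    return group + "?" if ends_here else group

def words_to_regex(words):
    """Anchor the factored pattern so it matches whole words only."""
    return "^" + trie_to_pattern(build_trie(words)) + "$"

print(words_to_regex(["what", "why", "where", "when", "which"]))
# prints ^wh(?:at|e(?:n|re)|ich|y)$
```

Branches come out in sorted order, so the result differs cosmetically from the hand-written wh(at|y|e(re|n)|i(ch)) but accepts the same five words. This sketch factors shared prefixes only; shared suffixes are left as they are.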
mikel