How does the Failure function used in the KMP algorithm work?

Question

How does the Failure function used in the KMP algorithm work?

I tried my best to read most of the literature on this subject and still have not understood anything about how the failure function used in the KMP algorithm is built. I mainly mean the http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=stringSearching tutorial, which most people find great. However, I still do not understand this. I would be grateful if you could take the pain on yourself to give me a simpler and more understandable explanation.

+4

string algorithm knuth-morris-pratt

hytriutucx Jul 01 '12 at 9:31

source share

2 answers

http://www.oneous.com/Tutorial-Content.php?id=24

U can use the learning resources on this website to understand the KMP algorithm and rejection function. Also try to take the code and do a few runs on it for an example line manually. However, the best way to understand its operation would be to encode it on some variants of the basic algorithm. I suggest starting with NHAY and PERIOD at SPOJ.

-2

redoc Jul 03 '12 at 9:59

source share

Dominik gleich · Accepted Answer · 2012-07-01T13:14:48+0000

The crash function actually tells us this: if you match X characters of a string, which is the longest suffix of such a string, which is also a search string prefix.

You ask how it is built, the approach is quite simple.

If you add a new character at the end of the line, then you build f [x], and if it matches the character at position f [x-1], then f [x] is just f [x-1] + 1.

In other cases, when it does not match, you try to find smaller and smaller suffixes and check if they match.

For example, you have the word "accadaccac" for which you create a crash function, and you just added the letter 'c' . Say you are building a failure function for the last letter, the letter 'c' .

First you check the failure function of the previous letter, its failure function is 4, because you can match the suffix "acca" with the prefix "acca" , and now you add the letter 'c' , it does not match the letter 'd' prefix prefix "acca"
So, you return to the last good suffix. You are now looking for the suffix "acca" , which is also the prefix "accadaccac" , but less than "acca". The answer to this question is f [length ("acca") - 1] or f [3], which f [3] = 1, since the suffix of length 1 (the letter letter 'a' ) is also the prefix of the search string.
And now you can try if 'c' matches the character at position 1 and voila matches, so now you know f [9] = f [f [8] -1] +1 = 2.

Hope this helps you. Good luck! :)

How does the Failure function used in the KMP algorithm work?

More articles: