I am interested in implementing the Rabin-Karp algorithm for finding substrings, as stated in the wiki: http://en.wikipedia.org/wiki/Rabin-Karp_string_search_algorithm . Not for homework, but for personal interests. I implemented the Rabin-Karp algorithm (shown below) and it works. However, the performance is really, really bad !!! I understand that my hash function is basic. However, it seems that a simple call to strstr () always surpasses my rabin_karp () function. I can understand why - a hash function does more work than a simple char -by-char compares each cycle. What am I missing here? Should the Rabin-Karp algorithm be faster than calling strstr ()? When is the best time to use the Rabin-Karp algorithm? Hence my personal interests. Have I really implemented the algorithm?
size_t hash(char* str, size_t i) { size_t h = 0; size_t magic_exp = 1; // if (str != NULL) { while (i-- != 0) { magic_exp *= 101; h += magic_exp + *str; ++str; } } return h; } char* rabin_karp(char* s, char* find) { char* p = NULL; if (s != NULL && find != NULL) { size_t n = strlen(s); size_t m = strlen(find); if (n > m) { size_t hfind = hash(find, m); char* end = s + (n - m + 1); for (char* i = s; i < end; ++i) { size_t hs = hash(i, m); if (hs == hfind) { if (strncmp(i, find, m) == 0) { p = i; break; } } } } } return p; }
c ++ performance string algorithm rabin-karp
Pinkduck
source share