Algorithms for fast approximate string matching

Question

Given a source string s and n strings of the same length, I need a fast algorithm that returns those strings that differ from s in at most k characters, compared position by position (that is, strings within Hamming distance k of s).

What is a fast algorithm for this?

PS: I should stress that this is an academic question. I want to find the most efficient algorithm, if possible.

I also left out one very important piece of information. The n strings of equal length form a dictionary against which many source strings s will be queried. So there seems to be room for some kind of preprocessing step to make this more efficient.

string algorithm

Qiang Li May 03 '13 at 4:34

5 answers

Kylem · Answer 1 · 2013-05-03T04:41:54+0000

My gut instinct is simply to iterate over each of the n strings, keeping a count of the characters that differ from s, but I don't claim that this is the most efficient solution. However, this will be O(n), so unless this is a known performance problem or an academic question, I would go with that.
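A minimal sketch of that linear scan, assuming Python and the hypothetical helper names `within_hamming` and `linear_scan` (neither comes from the answer). The O(n) figure counts string comparisons; each comparison itself touches up to m characters for strings of length m, and the early exit stops as soon as the budget k is exceeded.

```python
from typing import Iterable, List


def within_hamming(s: str, candidate: str, k: int) -> bool:
    """Return True if candidate differs from s in at most k positions."""
    mismatches = 0
    for a, b in zip(s, candidate):
        if a != b:
            mismatches += 1
            if mismatches > k:
                return False  # early exit: the budget k is already exceeded
    return True


def linear_scan(s: str, lines: Iterable[str], k: int) -> List[str]:
    """Compare every line with s and keep those within k mismatches."""
    return [line for line in lines if within_hamming(s, line, k)]


if __name__ == "__main__":
    dictionary = ["kathrin", "karolin", "kerstin", "carolin"]
    print(linear_scan("karolin", dictionary, 1))
```

No preprocessing is involved here, so this is mainly useful as a baseline to measure any indexed approach against.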

MBo · Answer 2 · 2013-05-03T06:21:23+0000

Sedgewick, in his book "Algorithms", writes that a ternary search tree makes it possible "to find all words within a given Hamming distance of a query word." An article in Dr. Dobb's
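To make the idea concrete, here is a rough sketch of a ternary search tree with a near-neighbor search, assuming Python and non-empty words of equal length. The names `Node`, `insert`, `near_search` and `words_within` are illustrative and not taken from the book; the search simply carries a mismatch budget down the tree and prunes a branch once the budget is spent.

```python
from typing import List, Optional


class Node:
    """One ternary search tree node: a character plus lo / eq / hi children."""

    def __init__(self, ch: str) -> None:
        self.ch = ch
        self.lo: Optional["Node"] = None
        self.eq: Optional["Node"] = None
        self.hi: Optional["Node"] = None
        self.word: Optional[str] = None  # set on the node that ends a stored word


def insert(root: Optional[Node], word: str) -> Node:
    """Insert a non-empty word and return the (possibly new) root."""
    if root is None:
        root = Node(word[0])
    node, i = root, 0
    while True:
        ch = word[i]
        if ch < node.ch:
            if node.lo is None:
                node.lo = Node(ch)
            node = node.lo
        elif ch > node.ch:
            if node.hi is None:
                node.hi = Node(ch)
            node = node.hi
        else:
            i += 1
            if i == len(word):
                node.word = word
                return root
            if node.eq is None:
                node.eq = Node(word[i])
            node = node.eq


def near_search(node: Optional[Node], query: str, i: int,
                budget: int, out: List[str]) -> None:
    """Collect stored words that mismatch query[i:] in at most budget positions."""
    if node is None or budget < 0 or i >= len(query):
        return
    ch = query[i]
    # lo and hi hold alternative characters for the same position i.
    # With budget left, any of them may be tried; otherwise only the exact path.
    if budget > 0 or ch < node.ch:
        near_search(node.lo, query, i, budget, out)
    if budget > 0 or ch > node.ch:
        near_search(node.hi, query, i, budget, out)
    remaining = budget - (0 if ch == node.ch else 1)
    if remaining >= 0:
        if i == len(query) - 1:
            if node.word is not None:
                out.append(node.word)
        else:
            near_search(node.eq, query, i + 1, remaining, out)


def words_within(root: Optional[Node], query: str, k: int) -> List[str]:
    """Return all stored words within Hamming distance k of query."""
    out: List[str] = []
    near_search(root, query, 0, k, out)
    return out


if __name__ == "__main__":
    root = None
    for w in ["the", "tee", "toe", "cat", "tie"]:
        root = insert(root, w)
    print(sorted(words_within(root, "the", 1)))
```

The tree is built once for the dictionary and reused for every query string, which matches the preprocessing scenario in the question. For small k, large parts of the tree are never visited.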

Zim-Zam O'Pootertoot · Answer 3 · 2013-05-03T05:07:43+0000

[...] O(n) [...] O(nm), where m [...].

[...] pairs (p, c), where p is the position and c is the character [...] (for example, "the" gives {(0, 't'), "the"}, {(1, 'h'), "the"}, {(2, 'e'), "the"}). [...] (for example, "the" [...] 2 and "tee" [...] 1). [...] K.

[...] K [...]. [...] K is 5 and N is 8 [...] 4-8 [...] 5 [...]. [...] 6 [...] 3.

[...] NoSql [...].

[...] (p, c) [...] (for example, (5, 't') as "5t" and (12, 'x') as "12x").
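One way to read the (p, c) scheme is as an inverted index from (position, character) keys to the dictionary words that contain them. The sketch below assumes Python, uses the "5t" style string keys from the example, and introduces a hypothetical `PositionIndex` class; it counts matching positions per word and keeps the words whose match count is at least len(s) - k.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List


class PositionIndex:
    """Inverted index from "<position><character>" keys to word ids."""

    def __init__(self, words: Iterable[str]) -> None:
        self.words: List[str] = list(words)
        self.postings: Dict[str, List[int]] = defaultdict(list)
        for word_id, word in enumerate(self.words):
            for p, c in enumerate(word):
                # (5, 't') is stored under "5t"; the last character is always c
                self.postings[f"{p}{c}"].append(word_id)

    def query(self, s: str, k: int) -> List[str]:
        """Return the words that differ from s in at most k positions."""
        need = len(s) - k  # minimum number of matching positions
        if need <= 0:
            return list(self.words)  # every word of this length qualifies
        matches: Counter = Counter()
        for p, c in enumerate(s):
            for word_id in self.postings.get(f"{p}{c}", ()):
                matches[word_id] += 1
        return [self.words[i] for i in sorted(matches) if matches[i] >= need]


if __name__ == "__main__":
    index = PositionIndex(["the", "tee", "toe", "cat"])
    print(index.query("tie", 1))
```

The postings map could equally live in an external key-value store, since each lookup is a plain string key. Query cost depends on how long the posting lists are, so this tends to pay off when the alphabet is large relative to the dictionary.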

Kyle Strand · Answer 4 · 2013-05-03T04:44:36+0000

[...] i [...] false [...] i == k [...] true [...] k-i [...].

[...]

glh · Answer 5 · 2013-05-04T06:07:50+0000

[...] :P [...] n [...].

[...] n [...] n' [...] s [...] s'.

[...] s' [...] n' [...] s' [...] n' [...] s' [...] n' [...] k [...] n [...].

As a further consideration, additional preprocessing could be performed on each pair of adjacent lines in n to record the total number of characters in which they differ. This could be used when comparing the lines in n with s: if a line differs enough from its neighbor, maybe there is no need to compare it at all?
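One possible way to make the neighbor idea precise is the triangle inequality, which Hamming distance satisfies: if a line's neighbor is known to be at distance between lo and hi from s, and the two lines differ in g positions, then the line is at distance at least max(lo - g, g - hi) from s. The sketch below assumes Python and the hypothetical names `adjacent_gaps` and `query_with_pruning`; it skips a line whenever that lower bound already exceeds k.

```python
from typing import List


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def adjacent_gaps(lines: List[str]) -> List[int]:
    """Preprocessing: gaps[i] is the distance between lines[i] and lines[i + 1]."""
    return [hamming(a, b) for a, b in zip(lines, lines[1:])]


def query_with_pruning(s: str, lines: List[str], gaps: List[int], k: int) -> List[str]:
    """Return lines within distance k of s, skipping provably distant ones."""
    result: List[str] = []
    lo = hi = 0  # bounds on the distance from s to the previous line
    for i, line in enumerate(lines):
        if i > 0:
            g = gaps[i - 1]
            new_lo = max(lo - g, g - hi, 0)  # triangle inequality, lower bound
            new_hi = min(hi + g, len(s))     # triangle inequality, upper bound
            if new_lo > k:
                lo, hi = new_lo, new_hi      # too far from s: skip the comparison
                continue
        d = hamming(s, line)
        lo = hi = d  # exact distance is now known
        if d <= k:
            result.append(line)
    return result


if __name__ == "__main__":
    lines = ["aaaa", "aaab", "zzzz", "zzzy", "aaaa"]
    gaps = adjacent_gaps(lines)
    print(query_with_pruning("aaaa", lines, gaps, 1))
```

How much this saves depends on the order of the dictionary: the bound is only tight when adjacent lines are either very similar or very different, so sorting or clustering the lines first is probably needed for it to help in practice.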
