String Transposition Algorithm

Suppose two lines are given:

String s1= "MARTHA" String s2= "MARHTA" 

here we exchange the positions of T and H. I am interested in writing code that calculates how many changes need to be converted from one line to another.

+5
source share
3 answers

Assuming that the distance only takes into account swaps, here is an idea based on permutations that runs in linear time.

The first step of the algorithm is to ensure that the two lines are truly equivalent in their character content. This can be done in linear time using a hash table (or a fixed array that spans the entire alphabet). If this is not the case, then s2 cannot be considered a permutation of s1, and the value of "swap count" does not matter.

The second step counts the minimum number of swaps needed to convert s2 to s1. This can be done by checking the permutation p corresponding to the transformation from s1 to s2. For example, if s1 = "abcde" and s2 = "badce", then p = (2,1,4,3,5), which means that position 1 contains element # 2, position 2 contains element # 1, etc. d. This permutation can be broken down into linear permutation cycles . Cycles in the example: (2,1) (4,3) and (5). The minimum number of swaps is the total number of swaps required for each cycle. Cycle k requires k-1 swaps to β€œfix”. Therefore, the number of swaps is NC, where N is the line length and C is the number of cycles. In our example, the result is 2 (swap 1.2, and then 3.4).

Now there are two problems, and I think I'm too tired to solve them right now :)

1) My decision assumes that the symbol does not repeat, which is not always the case. For correct calculation of the amount of swap, some adjustment is necessary.

2) My formula # MinSwaps = NC needs proof ... I did not find it on the Internet.

+3
source

There are several edit distance algorithms; this Wikipeida link has links to several.

+6
source

Your problem is not so simple, since before counting the swaps you need to make sure that each swap reduces the "distance" (by equality) between these two lines. Then you are actually looking for an account, but you should look for the smallest amount (or, at least, I suppose), otherwise there are endless ways of exchanging a string to get another.

You must first check which characters are already set, and then for each character that does not look if there is a pair that you can swap to reduce the spacing between the lines. Then iterate to complete the process.

If you do not want to do this efficiently, but simply count the number of swaps, use a bit array in which you have 1 for each well-placed character and 0 otherwise. You will end when each bit is 1 .

0
source

All Articles