Assuming that the distance only takes into account swaps, here is an idea based on permutations that runs in linear time.
The first step of the algorithm is to ensure that the two lines are truly equivalent in their character content. This can be done in linear time using a hash table (or a fixed array that spans the entire alphabet). If this is not the case, then s2 cannot be considered a permutation of s1, and the value of "swap count" does not matter.
The second step counts the minimum number of swaps needed to convert s2 to s1. This can be done by checking the permutation p corresponding to the transformation from s1 to s2. For example, if s1 = "abcde" and s2 = "badce", then p = (2,1,4,3,5), which means that position 1 contains element # 2, position 2 contains element # 1, etc. d. This permutation can be broken down into linear permutation cycles . Cycles in the example: (2,1) (4,3) and (5). The minimum number of swaps is the total number of swaps required for each cycle. Cycle k requires k-1 swaps to βfixβ. Therefore, the number of swaps is NC, where N is the line length and C is the number of cycles. In our example, the result is 2 (swap 1.2, and then 3.4).
Now there are two problems, and I think I'm too tired to solve them right now :)
1) My decision assumes that the symbol does not repeat, which is not always the case. For correct calculation of the amount of swap, some adjustment is necessary.
2) My formula # MinSwaps = NC needs proof ... I did not find it on the Internet.
source share