Change the distance between two regular expressions

I came across this question in an interview:

Given the two regular expressions, calculate the editing distance between them. The edit distance is defined as the smallest edit distance between any two lines generated by two regular expressions, respectively.

Formally, we look for d(L1,L2) = min { d(x,y) | x from L1, y from L2 } d(L1,L2) = min { d(x,y) | x from L1, y from L2 } , where L1 and L2 are the languages โ€‹โ€‹generated by two regular expressions.

I could not solve it during the interviews. Even now I do not know how to solve this. Any ideas?

I think this is the same as http://www.spoj.com/problems/AMR10B/

+7
string algorithm regex dynamic-programming
source share
1 answer

There are finite state machines, which are two languages. Let them say that the first language has states S [1], S [2], ..., S [N1] and transitions c: S [i] โ†’ S [j] (which means that state i goes into state j under the input symbol c) and T [1], T [2], ... T [N2] for the second language (with its own set of transitions).

Now you can build weighted multigraphite with nodes that are state pairs and edges between pairs (S [i1], T [i2]) โ†’ (S [j1], T [j2]), if any of these four cases:

  • Here c: S [i1] โ†’ S [j1] and i2 = j2. It has a weight of 1
  • There c: T [i2] โ†’ T [j2] and i1 = j1. It has a weight of 1
  • Here c: S [i1] โ†’ S [j1] and c: T [i2] โ†’ T [j2]. It has a weight of 0
  • There c: S [i1] โ†’ S [j1] and d: T [i2] โ†’ T [j2]. It has a weight of 1

Then, finding the smallest weight path from a pair of start states to any pair of receiving states gives the minimum editing distance.

+3
source share

All Articles