Python "regex" module: fuzziness value

I use the "fuzzy matching" functionality of the Regex module.

How can I get a "fuzziness" value from a match that indicates how much the pattern differs from the string, like the edit distance in Levenshtein?

I thought I could get the value from the Match object, but it is not there. The official documentation does not say anything about this.

For example:

import regex
a = regex.match('(?:foo){e}', 'for')

a.captures() tells me that the word "for" matches, but I would like to know the value of fuzziness, which should be 1 in this case.

Is there any way to achieve this?

2 answers
>>> import difflib
>>> matcher = difflib.SequenceMatcher(None, 'foo', 'for')
>>> sum(size for start, end, size in matcher.get_matching_blocks())
2
>>> max(map(len, ('foo', 'for'))) - _
1
>>>
>>> matcher = difflib.SequenceMatcher(None, 'foo', 'food')
>>> sum(size for start, end, size in matcher.get_matching_blocks())
3
>>> max(map(len, ('foo', 'food'))) - _
1

http://docs.python.org/2/library/difflib.html#difflib.SequenceMatcher.get_matching_blocks
http://docs.python.org/2/library/difflib.html#difflib.SequenceMatcher.get_opcodes

a = regex.match('(?:foo){e}', 'for')
a.fuzzy_counts

This returns a tuple (x, y, z), where:

x = number of substitutions

y = number of insertions

z = number of deletions
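To get a single fuzziness number out of that tuple, one option is simply to add the three counts. The helper below is a minimal sketch; the name total_errors is hypothetical and not part of the regex module, and it only assumes the (substitutions, insertions, deletions) order described above.

```python
def total_errors(fuzzy_counts):
    """Collapse a (substitutions, insertions, deletions) tuple into one number."""
    substitutions, insertions, deletions = fuzzy_counts
    return substitutions + insertions + deletions

# Usage with the third-party regex module (assumed to be installed):
#   import regex
#   a = regex.match('(?:foo){e}', 'for')
#   total_errors(a.fuzzy_counts)
```

The helper takes the tuple rather than the Match object, so it can be tried out without regex installed.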

However, these counts are not always reliable: the fuzziness reported for a regex match might not equal the true Levenshtein distance in some cases.
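If the true edit distance is what you need, it can be computed separately and compared with what regex reports. The following is a minimal pure-Python sketch of the standard dynamic-programming Levenshtein distance; the function name levenshtein is my own, and this is an illustration rather than anything the regex module provides.

```python
def levenshtein(a, b):
    """Return the edit distance between strings a and b.

    Uses the classic dynamic-programming recurrence, keeping only
    the previous row of the table in memory.
    """
    # Distance from the empty prefix of a to each prefix of b.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]

print(levenshtein('foo', 'for'))   # 1
print(levenshtein('foo', 'food'))  # 1
```

Comparing this value with the sum of fuzzy_counts for the same pair of strings shows whether a given match was found with the minimum number of edits.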

Fuzzy match with Python regex module: number of substitutions does not meet expectations
