Python - how to find all intersections of two lines?

How do I find all the intersections (also called the longest regular substrings) of two lines and their positions in both lines?

For example, if S1="never"and S2="forever", then the given intersection should be ["ever"], and its position [(1,3)]. If S1="address"and S2="oddness", then the given intersections ["dd","ess"], and their positions [(1,1),(4,4)].

The shortest solution without including any library is preferred. But any correct solution is also welcome.

+5
source share
6 answers

, , - . , Python difflib , , . , Python, difflib , .

In [31]: import difflib

In [32]: difflib.SequenceMatcher(None, "never", "forever").get_matching_blocks()
Out[32]: [Match(a=1, b=3, size=4), Match(a=5, b=7, size=0)]


In [33]: difflib.SequenceMatcher(None, "address", "oddness").get_matching_blocks()
Out[33]: [Match(a=1, b=1, size=2), Match(a=4, b=4, size=3), Match(a=7, b=7, size=0)]

Match, ( ).

+12

O (n + m), n m - .

:

function LCSubstr(S[1..m], T[1..n])
    L := array(1..m, 1..n)
    z := 0
    ret := {}
    for i := 1..m
        for j := 1..n
            if S[i] = T[j]
                if i = 1 or j = 1
                    L[i,j] := 1
                else
                    L[i,j] := L[i-1,j-1] + 1
                if L[i,j] > z
                    z := L[i,j]
                    ret := {}
                if L[i,j] = z
                    ret := ret ∪ {S[i-z+1..z]}
    return ret

. Longest_common_substring_problem wikipedia.

+7

:

import itertools

def longest_common_substring(s1, s2):
   set1 = set(s1[begin:end] for (begin, end) in
              itertools.combinations(range(len(s1)+1), 2))
   set2 = set(s2[begin:end] for (begin, end) in
              itertools.combinations(range(len(s2)+1), 2))
   common = set1.intersection(set2)
   maximal = [com for com in common
              if sum((s.find(com) for s in common)) == -1 * (len(common)-1)]
   return [(s, s1.index(s), s2.index(s)) for s in maximal]

:

>>> longest_common_substring('address', 'oddness')
[('dd', 1, 1), ('ess', 4, 4)]
>>> longest_common_substring('never', 'forever')
[('ever', 1, 3)]
>>> longest_common_substring('call', 'wall')
[('all', 1, 1)]
>>> longest_common_substring('abcd1234', '1234abcd')
[('abcd', 0, 4), ('1234', 4, 0)]
+5

!

difflib - diff:

>>> import difflib
>>> list(difflib.ndiff("never","forever"))
['- n', '+ f', '+ o', '+ r', '  e', '  v', '  e', '  r']
>>> diffs = list(difflib.ndiff("never","forever"))
>>> for d in diffs:
...   print {' ': '  ', '-':'', '+':'    '}[d[0]]+d[1:]
...
 n
     f
     o
     r
   e
   v
   e
   r
+4

I assume that you want the substrings to match if they have the same absolute position in their respective strings. For example, "abcd" and "bcde" will not match, although both contain "bcd".

a = "address"
b = "oddness"

#matches[x] is True if a[x] == b[x]
matches = map(lambda x: x[0] == x[1], zip(list(a), list(b)))

positions = filter(lambda x: matches[x], range(len(a)))
substrings = filter(lambda x: x.find("_") == -1 and x != "","".join(map(lambda x: ["_", a[x]][matches[x]], range(len(a)))).split("_"))

position = [1, 2, 4, 5, 6]

substrings = ['dd', 'ess']

If you only need substrings, you can grind it into one line:

filter(lambda x: x.find("_") == -1 and x != "","".join(map(lambda x: ["_", a[x]][map(lambda x: x[0] == x[1], zip(list(a), list(b)))[x]], range(len(a)))).split("_"))
+1
source
def  IntersectStrings( first,  second):
x = list(first)
#print x
y = list(second)
lst1= []
lst2= []
for i in x:
    if i in y:
        lst1.append(i)
lst2 = sorted(lst1) + []
   # This  above step is an optional if it is required to be sorted      alphabetically use this or else remove it
return ''.join(lst2)

print IntersectStrings('hello','mello' )
0
source

All Articles