Aggregation Algorithm for Similar Sequences

Say you have a list of similar sequences, for example

aaaa abaaa xaaaay ... 

You want to discover a common pattern that covers all of these sequences, for example

 x? ab? aaay? 

where the operator ? indicates that the preceding item is optional.
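To make the notation concrete, here is a small check that treats the example pattern as a regular expression (with the spaces removed) and confirms it covers the three sample sequences. The function name `matches_all` is just an illustrative helper, not part of the question.

```python
import re

def matches_all(pattern, sequences):
    """Return True if every sequence is matched in full by the pattern."""
    compiled = re.compile(pattern)
    return all(compiled.fullmatch(s) is not None for s in sequences)

sequences = ["aaaa", "abaaa", "xaaaay"]
# "x? ab? aaay?" from the question, written as a regex without spaces.
print(matches_all(r"x?ab?aaay?", sequences))
```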

Which algorithm would you use?

3 answers

Look at the sequence alignment algorithms used in bioinformatics.

More specifically, since you have a list, look at multiple sequence alignment. The Viterbi algorithm (as used with profile hidden Markov models) should do.
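As a rough illustration of the alignment idea, the sketch below folds the sequences into one pattern by aligning each new sequence against the pattern built so far. Note that this is a greedy, pairwise merge based on the longest common subsequence, not a true multiple sequence alignment and not the Viterbi algorithm; the result can depend on the order of the input. The names `merge` and `common_pattern` are hypothetical.

```python
def merge(pattern, seq):
    """Align pattern, a list of (item, optional) pairs, with seq using an LCS
    table; items present on only one side become optional."""
    n, m = len(pattern), len(seq)
    # lcs[i][j] = length of the LCS of pattern[i:] and seq[j:]
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if pattern[i][0] == seq[j]:
                lcs[i][j] = lcs[i + 1][j + 1] + 1
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])
    merged, i, j = [], 0, 0
    while i < n and j < m:
        if pattern[i][0] == seq[j]:
            merged.append(pattern[i])  # shared item keeps its current flag
            i += 1
            j += 1
        elif lcs[i + 1][j] >= lcs[i][j + 1]:
            merged.append((pattern[i][0], True))  # missing from seq
            i += 1
        else:
            merged.append((seq[j], True))  # missing from pattern
            j += 1
    merged.extend((item, True) for item, _ in pattern[i:])
    merged.extend((item, True) for item in seq[j:])
    return merged

def common_pattern(sequences):
    """Fold all sequences into a single pattern string."""
    pattern = [(item, False) for item in sequences[0]]
    for seq in sequences[1:]:
        pattern = merge(pattern, seq)
    return " ".join(item + ("?" if optional else "") for item, optional in pattern)

print(common_pattern(["aaaa", "abaaa", "xaaaay"]))  # prints: x? a b? a a a y?
```

A proper multiple sequence alignment would consider all sequences jointly instead of one at a time, which matters when the sequences differ more than in this small example.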


I think that if you convert your list to a suffix tree, there is a very simple recursive solution, but I'm not sure about its asymptotic complexity.


You might want to check out the Smith-Waterman algorithm, which is used to perform local sequence alignments.
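For reference, a minimal sketch of the Smith-Waterman scoring recurrence for two sequences follows. The scoring values (match 2, mismatch -1, linear gap -1) are arbitrary assumptions chosen for illustration, and the function only returns the best local alignment score, not the alignment itself.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score of a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0]  # first column is always 0
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Scores never drop below 0, which is what makes the alignment local.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best

print(smith_waterman_score("aaaa", "abaaa"))
```

Since Smith-Waterman is pairwise, it would have to be combined with some merging strategy to handle a whole list of sequences.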

