I wrote a function to return a generator containing each unique combination of substrings, a given length, containing more than n elements from the primary string.
By way of illustration:
if I have "abcdefghi" and a probe two in length, and a threshold of 4 elements in the list, which I would like to get:
['ab', 'cd', 'ef', 'gh'] ['ab', 'de', 'fg', 'hi'] ['bc', 'de', 'fg', 'hi']
My first attempt at this problem was to return a list of lists. This ended with an overflow of computer memory. As a raw secondary solution, I created a generator that does something similar. The problem is that I created a nested generator that calls itself. When I run this function, it seems that it just loops on the inner loop without calling itself again. I thought the generator would precede as far away from the recursive hole as necessary until it hits the yield statement. Does it make sense what is going on?
def get_next_probe(self, current_probe_list, probes, unit_length): if isinstance(current_probe_list, list): last_probe=current_probe_list[-1] available_probes = [candidate for candidate in probes if candidate.start>last_probe.end] else: available_probes = [candidate for candidate in probes if candidate.start<unit_length] if available_probes: max_position=min([probe.end for probe in available_probes]) available_probes2=[probe for probe in available_probes if max_position+1>probe.start] for new_last_probe in available_probes2: new_list=list(current_probe_list) new_list.append(new_last_probe) self.get_next_probe(new_list, probes, unit_length) else: if len(current_probe_list)>=self.num_units: yield current_probe_list
If the output is changed for printing, this works fine! I would be grateful for any help I could receive. I understand that this is not an optimal implementation of this type of search problem, it seems that returning a list of found items from the last call to get_next_probe and filtering this list for elements that do not overlap new_last_probe.end would be much more effective ... but it was much easier for me write. Any input to the algorithm will still be appreciated.
Thanks!
python generator recursion bioinformatics
user1348971
source share