Building trees with the BioPython Phylo module

I am trying to build a tree with BioPython's Phylo module.
This image shows what I have so far: [image]

Each name consists of a four-digit number followed by a second number, which is the number of times that sequence occurs. For example, the node 1578 - 22 should represent 22 sequences.
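The label format described above could be split into a name and a count along these lines. This is only a sketch: `parse_label` is a hypothetical helper, and it assumes every label ends in a hyphen followed by an integer count (it raises `ValueError` otherwise).

```python
def parse_label(label):
    """Split a label such as '1578-22' into (name, count).

    Assumes the count is the integer after the last hyphen.
    """
    # rpartition splits on the LAST hyphen, so names containing
    # hyphens would still keep everything before the count.
    name, _, count = label.strip().rpartition('-')
    return name.strip(), int(count)


print(parse_label("1578-22"))    # ('1578', 22)
print(parse_label("1578 - 22"))  # ('1578', 22)
```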

Aligned sequence file: file
File with the distances used to build the tree: file

I now know how to resize the nodes. Each node has a different size, so I build an array of the different values:

    fh = open(MEDIA_ROOT + "groupsnp.txt")
    list_size = {}
    for line in fh:
        if '>' in line:
            labels = line.split('>')
            label = labels[-1]
            label = label.split()
            num = line.split('-')
            size = num[-1]
            size = size.split()
            for lab in label:
                for number in size:
                    list_size[lab] = int(number)
    a = array(list_size.values())

But the order of the array is arbitrary, and I would like to assign the correct size to the correct node. I tried this:

    for elem in list_size.keys():
        if labels == elem:
            Phylo.draw_graphviz(tree_xml, prog="neato", node_size=a)

but nothing is drawn when I use the if statement.

Anyway, can this be done?

I would be very grateful!

Thanks everyone

python numpy graphviz biopython
1 answer

I finally got this working. The basic premise is that you use the labels/nodelist to create node_sizes, so the sizes line up with the right nodes. I am sure I am missing some important parameters to make the tree look 100% right, but the node sizes appear to be displayed correctly.
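The premise above, stripped of the drawing code, can be sketched in plain Python. The names `sizes_for_nodes`, `scale`, and `default`, and the sample labels, are made up for illustration. The point is only that the size list is built by walking the node list, so position i in one always matches position i in the other.

```python
def sizes_for_nodes(node_names, counts, scale=50, default=0):
    """Return one size per node, in the same order as node_names.

    counts maps a node name to its sequence count; nodes without an
    entry get `default`. The scale factor is an arbitrary choice.
    """
    return [counts.get(name, default) * scale for name in node_names]


# Made-up example data: the dict order does not matter, because the
# result follows the order of node_names.
counts = {"1578-22": 22, "1602-3": 3, "1490-7": 7}
node_names = ["1490-7", "1578-22", "1602-3"]

node_sizes = sizes_for_nodes(node_names, counts)
print(node_sizes)  # [350, 1100, 150]
```

The two lists would then be passed together, as `nodelist` and `node_size`, to the networkx drawing call so that both follow the same order.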

    # basically a stripped down rewrite of Phylo.draw_graphviz
    import networkx, pylab
    from Bio import Phylo


    # taken from draw_graphviz
    def get_label_mapping(G, selection):
        for node in G.nodes():
            if (selection is None) or (node in selection):
                try:
                    label = str(node)
                    if label not in (None, node.__class__.__name__):
                        yield (node, label)
                except (LookupError, AttributeError, ValueError):
                    pass


    kwargs = {}
    tree = Phylo.read('tree.dnd', 'newick')
    G = Phylo.to_networkx(tree)
    Gi = networkx.convert_node_labels_to_integers(G, discard_old_labels=False)

    node_sizes = []
    labels = dict(get_label_mapping(G, None))
    kwargs['nodelist'] = labels.keys()

    # create our node sizes based on our labels because the labels are used for the node_list
    # this way they should be correct
    for label in labels.keys():
        if str(label) != "Clade":
            num = label.name.split('-')
            # the times 50 is just a guess on what would look best
            size = int(num[-1]) * 50
            node_sizes.append(size)

    kwargs['node_size'] = node_sizes
    posi = networkx.pygraphviz_layout(Gi, 'neato', args='')
    posn = dict((n, posi[Gi.node_labels[n]]) for n in G)

    networkx.draw(G, posn, labels=labels, node_color='#c0deff', **kwargs)
    pylab.show()

Resulting tree: [image]
