Perform simple math on regex output? (Python)

Is it possible to do simple math on the output of Python regular expressions?

I have a large file where I need to divide the numbers following ")" by 100. For example, I would convert the following line containing )75 and )2:

 ((words:0.23)75:0.55(morewords:0.1)2:0.55); 

to )0.75 and )0.02:

 ((words:0.23)0.75:0.55(morewords:0.1)0.02:0.55); 

My first thought was to use re.sub with the search expression "\)\d+", but I don't know how to divide the integer following the parenthesis by 100, or whether that is even possible with re.

Any thoughts on how to solve this? Thank you for your help!

2 answers

You can do this by providing a function as a replacement:

    import re

    s = "((words:0.23)75:0.55(morewords:0.1)2:0.55);"
    # The lambda receives each match object; group 1 holds the digits after ")"
    s = re.sub(r"\)(\d+)", lambda m: ")" + str(float(m.group(1)) / 100), s)
    print(s)
    # ((words:0.23)0.75:0.55(morewords:0.1)0.02:0.55);

By the way, if you want to do this using the Biopython Newick tree parser, it would look like this:

    from Bio import Phylo
    # assuming you want to read from a string rather than a file
    from io import StringIO

    tree = Phylo.read(StringIO(s), "newick")
    for c in tree.get_nonterminals():
        # internal nodes without a support value have confidence None
        if c.confidence is not None:
            c.confidence = c.confidence / 100
    print(tree.format("newick"))

(While this particular operation takes more lines than the regular expression version, other tree operations can be greatly simplified.)


The replacement argument for re.sub may be a function. Write a function that takes the match object, converts the matched text to a number, divides it by 100, and then returns the string form of the result.
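
A minimal sketch of that approach, assuming Python 3 (where / is true division). The name divide_by_100 is only illustrative and does not come from the original answers:

```python
import re

def divide_by_100(match):
    # group(1) is the run of digits captured after the ")"
    value = int(match.group(1)) / 100
    # put the ")" back, since it is part of the matched text being replaced
    return ")" + str(value)

line = "((words:0.23)75:0.55(morewords:0.1)2:0.55);"
result = re.sub(r"\)(\d+)", divide_by_100, line)
print(result)
# ((words:0.23)0.75:0.55(morewords:0.1)0.02:0.55);
```

A named function does the same thing as a lambda passed to re.sub, but is probably easier to extend if the conversion later needs rounding or a fixed number of decimal places.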

