I use re.findall() to extract some version numbers from an HTML file:
>>> import re
>>> text = "<table><td><a href=\"url\">Test0.2.1.zip</a></td><td>Test0.2.1</td></table> Test0.2.1"
>>> re.findall("Test([\.0-9]*)", text)
['0.2.1.', '0.2.1', '0.2.1']
However, I would like to get only the matches that do not end with a period. The file extension will not always be .zip, so I cannot just hard-code .zip into the regex.
I want to end up with:
['0.2.1', '0.2.1']
Can anyone suggest a better regex to use? :)
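For reference, here is a minimal sketch of two patterns that might fit, depending on how the goal is read. It assumes a version is always runs of digits separated by single periods; the names VERSION, all_versions and only_bare are made up for illustration. The first pattern keeps every match but never captures a trailing period. The second also skips any match that is directly followed by a period or digit (such as the file name), which gives the two-item list above.

```python
import re

text = ('<table><td><a href="url">Test0.2.1.zip</a></td>'
        '<td>Test0.2.1</td></table> Test0.2.1')

# Assumption: a version is digit runs joined by single periods,
# so the captured group can never end with ".".
VERSION = r"\d+(?:\.\d+)*"

# Reading 1: keep every occurrence, minus any trailing period.
all_versions = re.findall(r"Test(" + VERSION + r")", text)

# Reading 2: drop occurrences directly followed by "." or a digit.
# The digit in the lookahead stops the regex from backtracking
# into a shorter partial match such as "1" out of "10.2".
only_bare = re.findall(r"Test(" + VERSION + r")(?![.\d])", text)

print(all_versions)  # ['0.2.1', '0.2.1', '0.2.1']
print(only_bare)     # ['0.2.1', '0.2.1']
```

One caveat with the second pattern: a version at the end of a sentence ("Test0.2.1.") would also be skipped, because the lookahead cannot tell a sentence-ending period from the start of a file extension.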
python regex findall
Ashy