Python regex for numbers and dots

Question

I use re.findall() to extract some version numbers from an HTML file:

 >>> import re
 >>> text = "<table><td><a href=\"url\">Test0.2.1.zip</a></td><td>Test0.2.1</td></table> Test0.2.1"
 >>> re.findall("Test([\.0-9]*)", text)
 ['0.2.1.', '0.2.1', '0.2.1']

but I would like to get only those that do not end with a period. The file extension may not always be .zip, so I cannot just hard-code .zip into the regex.

I want to end up with:

 ['0.2.1', '0.2.1']

Can anyone suggest a better regex to use? :)

python regex findall

Ashy Dec 10 '08 at 15:33

1 answer

Tomalak · Accepted Answer · 2008-12-10T15:36:07+0000

 re.findall(r"Test([0-9.]*[0-9]+)", text)

or, a little shorter:

 re.findall(r"Test([\d.]*\d+)", text)
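To see what these patterns return, here is a small sketch run against the sample text from the question. Both patterns require the captured run of digits and dots to end in a digit, so the trailing dot before "zip" is dropped, but all three occurrences still match. The third pattern, named `strict` here, is not part of the answer: it is a hypothetical variant that adds a negative lookahead, assuming the goal is to skip any version that is directly followed by another dot or digit.

```python
import re

# Sample text copied from the question.
text = ('<table><td><a href="url">Test0.2.1.zip</a></td>'
        '<td>Test0.2.1</td></table> Test0.2.1')

# The two patterns from the answer: a run of digits/dots that ends in a digit.
explicit = re.findall(r"Test([0-9.]*[0-9]+)", text)
shorthand = re.findall(r"Test([\d.]*\d+)", text)

# Hypothetical variant (an assumption, not from the answer): the negative
# lookahead rejects a version that is immediately followed by a dot or digit.
strict = re.findall(r"Test([\d.]*\d)(?![\d.])", text)

print(explicit)   # ['0.2.1', '0.2.1', '0.2.1']
print(shorthand)  # ['0.2.1', '0.2.1', '0.2.1']
print(strict)     # ['0.2.1', '0.2.1']
```

Which one fits depends on whether the file name match should be trimmed or excluded altogether.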

By the way, you do not need to escape the dot inside a character class, where it is always a literal dot:

 [\.0-9] // matches: 0 1 2 3 4 5 6 7 8 9 .
 [.0-9]  // matches: 0 1 2 3 4 5 6 7 8 9 .
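A quick way to check this in Python is to run both character classes over the same input. This is only a sketch; the `sample` string is made up for illustration and deliberately contains a literal backslash to show that neither class matches it.

```python
import re

# Hypothetical input: digits, a dot, and one literal backslash (raw string).
sample = r"1.2\3"

# Escaped and unescaped dot inside a character class.
escaped = re.findall(r"[\.0-9]", sample)
unescaped = re.findall(r"[.0-9]", sample)

print(escaped)    # ['1', '.', '2', '3']
print(unescaped)  # ['1', '.', '2', '3']
```

Outside a character class the escape still matters, since a bare dot there matches any character except a newline.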
