I need a regular expression that returns the text contained between double quotes, where that text starts with a specified prefix and ends with a specific file extension (e.g. .txt). I use urllib2 to fetch the HTML page (the HTML is pretty simple).
Basically, if I have something like
<tr> <td valign="top"><img src="/icons/unknown.gif" alt="[ ]"></td> <td><a href="Client-8.txt">new_Client-8.txt</a></td> <td align="right">27-Jun-2012 18:02 </td> </tr>
it should return just
Client-8.txt
The value I want is always enclosed in double quotes. I know that the file name starts with "Client-" and that the file extension is ".txt".
I have been playing with re.search(regex, string), where the input string is the HTML of the page, but I am bad at regular expressions.
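To make the goal concrete, here is a minimal sketch of the kind of call I have in mind, assuming Python's standard `re` module. The pattern itself and the names `html`, `pattern`, and `find_client_files` are illustrative assumptions; the sample string is just the row shown above.

```python
import re

# Sample input: the table row from the question (assumed representative of the page).
html = ('<tr> <td valign="top"><img src="/icons/unknown.gif" alt="[ ]"></td> '
        '<td><a href="Client-8.txt">new_Client-8.txt</a></td> '
        '<td align="right">27-Jun-2012 18:02 </td> </tr>')

# A double-quoted value that starts with "Client-" and ends with ".txt".
# [^"]* keeps the match inside a single pair of quotes.
pattern = re.compile(r'"(Client-[^"]*\.txt)"')

def find_client_files(page):
    """Return every quoted Client-*.txt name found in the page text."""
    return pattern.findall(page)

# re.search-style usage: take only the first match, if there is one.
match = pattern.search(html)
first = match.group(1) if match else None
```

With this pattern the link text new_Client-8.txt is skipped, because it is not enclosed in double quotes and does not start with "Client-" right after a quote.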
Thanks!