You might want to use this as a starting point:
    Python 2.6.7 (r267:88850, Jun 13 2011, 22:03:32)
    [GCC 4.6.1 20110608 (prerelease)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import urllib2, re
    >>> from BeautifulSoup import BeautifulSoup
    >>> urllib2.urlopen('http://www.immi.gov.au/skilled/general-skilled-migration/estimated-allocation-times.htm')
    <addinfourl at 139158380 whose fp = <socket._fileobject object at 0x84aa2ac>>
    >>> html = _.read()
    >>> soup = BeautifulSoup(html)
    >>> soup.find(text = re.compile('\\bsubclass 885\\b')).parent.parent.find('td', text = re.compile(' [0-9]{4}$'))
    u'15 May 2011'
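The session above is Python 2 with BeautifulSoup 3. If you want to try the same idea (find the row that mentions the subclass, then pick the cell in that row that ends in a four-digit year) without installing anything, here is a rough Python 3 sketch using only the standard library. The sample markup is made up for illustration and the real page may well be laid out differently; `RowCollector` and `find_date_for` are just names I picked, not part of any library.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample markup (an assumption); the real page's structure may differ.
SAMPLE_HTML = """
<table>
  <tr><td>Skilled Independent (subclass 885)</td><td>15 May 2011</td></tr>
  <tr><td>Skilled Sponsored (subclass 886)</td><td>1 April 2011</td></tr>
</table>
"""


class RowCollector(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being parsed
        self._cell = None  # text fragments of the cell currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def find_date_for(html, label_pattern):
    """Return the first cell ending in a 4-digit year from the first row
    that has a cell matching label_pattern, or None if there is no such row."""
    parser = RowCollector()
    parser.feed(html)
    parser.close()
    label = re.compile(label_pattern)
    year_at_end = re.compile(r" [0-9]{4}$")  # same pattern as in the session above
    for row in parser.rows:
        if any(label.search(cell) for cell in row):
            for cell in row:
                if year_at_end.search(cell):
                    return cell
    return None


print(find_date_for(SAMPLE_HTML, r"\bsubclass 885\b"))  # prints: 15 May 2011
```

To run it against the live page you would fetch the HTML first (for example with `urllib.request.urlopen(url).read().decode()`) and pass that string in place of `SAMPLE_HTML`, assuming the page is still available and still uses a table.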
jcomeau_ictx