Python parsing / extracting table data

    <html>
    <table border="1px">
    <tr>
    <td>yes</td>
    <td>no</td>
    </tr>
    </table>
    </html>

Is there a way to get the contents of the table cells (yes, no) without using BeautifulSoup?

I'm a Python beginner, so any help or direction would be very useful.

Thanks.

1 answer

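You can use the HTMLParser module from the Python standard library (in Python 3 it is available as html.parser). Subclass HTMLParser, set a flag when a td tag opens, and print any text seen while the flag is set.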

    >>> import HTMLParser
    >>> data = '''
    ... <html>
    ... <table border="1px">
    ... <tr>
    ... <td>yes</td>
    ... <td>no</td>
    ... </tr>
    ... </table>
    ... </html>
    ... '''
    >>> class TableParser(HTMLParser.HTMLParser):
    ...     def __init__(self):
    ...         HTMLParser.HTMLParser.__init__(self)
    ...         self.in_td = False
    ...
    ...     def handle_starttag(self, tag, attrs):
    ...         if tag == 'td':
    ...             self.in_td = True
    ...
    ...     def handle_data(self, data):
    ...         if self.in_td:
    ...             print data
    ...
    ...     def handle_endtag(self, tag):
    ...         self.in_td = False
    ...
    >>> p = TableParser()
    >>> p.feed(data)
    yes
    no
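The session above is written for Python 2. For Python 3, a rough equivalent might look like the sketch below. It collects the cell text into a list instead of printing it. The names TableCellParser, extract_cells, and html_doc are made up for this illustration and are not part of the standard library. It also assumes the td cells are not nested inside each other.

```python
from html.parser import HTMLParser


class TableCellParser(HTMLParser):
    """Collect the text of every <td> cell (hypothetical helper class)."""

    def __init__(self):
        super().__init__()
        self.in_td = False  # True while we are between <td> and </td>
        self.cells = []     # one string per cell, in document order

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_td = True
            self.cells.append("")  # start a new, empty cell

    def handle_data(self, data):
        if self.in_td:
            self.cells[-1] += data  # append text to the current cell

    def handle_endtag(self, tag):
        # Only </td> closes the cell, so inline tags such as <b> inside
        # a cell do not cut its text short.
        if tag == "td":
            self.in_td = False


def extract_cells(html):
    """Return the stripped text of each <td> in html (hypothetical helper)."""
    parser = TableCellParser()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return [cell.strip() for cell in parser.cells]


html_doc = """
<html>
<table border="1px">
<tr> <td>yes</td> <td>no</td> </tr>
</table>
</html>
"""

print(extract_cells(html_doc))  # prints ['yes', 'no']
```

Returning a list rather than printing makes the result easier to reuse, for example to write the cells to a file or compare them in a test.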