I wrote a small HTML parser in Python using lxml. It has been very useful, but I have run into a problem.
I have the following code:
tags = doc.xpath('//table//tr/td[@align="right"]/b')
for tag in tags:
    print(tag.text.strip())
It works great. But if there is a <br> tag in the <b> element, for example:
<b> first-half <br> second-half </b>
this code only prints first-half, the text that comes before the <br>.
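To make the behavior concrete, here is a minimal reproduction. It is only a sketch: the `snippet` string is a made-up fragment that mirrors the markup above, not my real page, and it assumes lxml is installed. As far as I can tell, lxml keeps the text that follows a child element on that child's `.tail`, not on the parent's `.text`.

```python
from lxml import html

# Hypothetical fragment mirroring the markup in the question (not the real page).
snippet = (
    '<table><tr><td align="right">'
    '<b> first-half <br> second-half </b>'
    '</td></tr></table>'
)
doc = html.fromstring(snippet)

b = doc.xpath('//table//tr/td[@align="right"]/b')[0]
br = b.find('br')

# .text holds only the text before the first child element of <b>.
first_part = b.text.strip()

# The text after <br> is stored as the tail of the <br> element itself.
second_part = br.tail.strip()

print(first_part)
print(second_part)
```

So the second half is not lost; it just is not part of `b.text`, which is why the loop above never prints it.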
How can I get all the text in <b> even if there is a <br> tag?
Thanks.