I am trying to parse the text inside <blockquote> elements. When I call soup.blockquote.get_text(), I get the result I want, but only for the first <blockquote> in the HTML file. How do I find the next and subsequent <blockquote> elements in the file? Maybe I'm just tired and can't find it in the documentation.
Example HTML file:
<html>
<head>header </head>
<blockquote>I can get this text </blockquote>
<p>eiaoiefj</p>
<blockquote>trying to capture this next </blockquote>
<p></p><strong>do not capture this</strong>
<blockquote> capture this too but separately after "capture this next" </blockquote>
</html>
Simple Python code:
from bs4 import BeautifulSoup

html_doc = open("example.html")
# Naming the parser explicitly avoids a warning and keeps results consistent
soup = BeautifulSoup(html_doc, "html.parser")
print(soup.blockquote.get_text())
# how to get the next blockquote???
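For reference, here is a minimal sketch of one way this might be done, assuming the goal is to read every <blockquote> separately and in document order. It uses find_all("blockquote") to collect all matching tags, and find_next("blockquote") to step from one tag to the one after it. The HTML is inlined as a string so the snippet runs without example.html; the names quotes and second are only illustrative.

```python
from bs4 import BeautifulSoup

# Inlined copy of the example HTML so the sketch does not depend on example.html
html_doc = """<html> <head>header </head>
<blockquote>I can get this text </blockquote>
<p>eiaoiefj</p>
<blockquote>trying to capture this next </blockquote>
<p></p><strong>do not capture this</strong>
<blockquote> capture this too but separately after "capture this next" </blockquote>
</html>"""

soup = BeautifulSoup(html_doc, "html.parser")

# find_all returns every <blockquote> tag in document order
quotes = [bq.get_text().strip() for bq in soup.find_all("blockquote")]
for text in quotes:
    print(text)

# Alternatively, step from the first <blockquote> to the one that follows it
second = soup.blockquote.find_next("blockquote")
print(second.get_text().strip())
```

soup.blockquote is shorthand for the first match only, which is why it never moves past the first element. Looping over find_all gives each block as its own string, so the <p> and <strong> content in between is skipped.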