I have been given a URL and I want to extract the contents of the <BODY> tag from the page at that URL. I am using Python 3. I stumbled upon sgmllib, but it is not available for Python 3.
Can someone help me with this? Can I use HTMLParser for this?
Here is what I tried:
import urllib.request
f = urllib.request.urlopen("URL")
s = f.read()
from html.parser import HTMLParser
class MyHTMLParser(HTMLParser):
    def handle_data(self, data):
        print("Encountered some data:", data)
parser = MyHTMLParser()
parser.feed(s)
This gives me an error: TypeError: Unable to convert the 'bytes' object to str implicitly
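The error most likely comes from the fact that f.read() returns bytes, while HTMLParser.feed() expects a str, so the bytes need to be decoded first. Below is a minimal sketch of that idea, not a definitive solution. The names BodyTextParser and extract_body_text are hypothetical, the UTF-8 default encoding is an assumption, and a hard-coded bytes sample stands in for the downloaded page so the example runs without network access.

```python
from html.parser import HTMLParser


class BodyTextParser(HTMLParser):
    """Collects the text found between <body> and </body> (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <BODY> arrives as "body".
        if tag == "body":
            self.in_body = True

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False

    def handle_data(self, data):
        # Keep only non-blank text that appears inside the body.
        if self.in_body and data.strip():
            self.chunks.append(data.strip())


def extract_body_text(raw, encoding="utf-8"):
    """Decode raw bytes to str, then parse. The encoding default is an assumption."""
    text = raw.decode(encoding, errors="replace")
    parser = BodyTextParser()
    parser.feed(text)
    parser.close()
    return parser.chunks


# Stand-in for the bytes returned by urllib.request.urlopen(...).read()
sample = b"<html><head><title>T</title></head><body><p>Hello</p> <p>World</p></body></html>"
print(extract_body_text(sample))
```

With a real page, the bytes from f.read() would be passed in place of sample. If the server declares a charset, f.headers.get_content_charset() can supply the encoding instead of assuming UTF-8.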