Parsing HTML with Beautiful Soup: return the text of a specific tag

I can get the full markup of an HTML tag from a Python script run in a Unix shell, as follows:

#!/usr/bin/python3

# import the module
from bs4 import BeautifulSoup

# define your object
soup = BeautifulSoup(open("test.html"), "html.parser")

# get the tag
print(soup(itemprop="name"))

where itemprop="name" uniquely identifies the desired tag.

The output is similar to:

[<span itemprop="name">
                    Blabla &amp; Bloblo</span>]

Now I would like to return only the text, Blabla & Bloblo.

My attempt was:

print(soup(itemprop="name").getText())

but I get the error AttributeError: 'ResultSet' object has no attribute 'getText'

The same method has worked for me in other contexts, such as:

print(soup.find('span').getText())

So what am I doing wrong?

1 answer

Calling the soup object as if it were a function returns a list of results, exactly as if you had called soup.find_all(). See the documentation:

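Because find_all() is the most popular method in the Beautiful Soup search API, you can use a shortcut for it. If you treat the BeautifulSoup object or a Tag object as though it were a function, then it's the same as calling find_all() on that object.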
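
A quick way to check this behaviour is sketched below. It assumes an inline HTML string instead of test.html and the built-in "html.parser" backend, neither of which comes from the question; the names called and found are only illustrative.

```python
from bs4 import BeautifulSoup

# Inline stand-in for test.html (an assumption, so the snippet runs on its own).
html = '<div><span itemprop="name">\n    Blabla &amp; Bloblo</span></div>'
soup = BeautifulSoup(html, "html.parser")

called = soup(itemprop="name")          # shortcut form used in the question
found = soup.find_all(itemprop="name")  # explicit form

print(called == found)              # both forms return the same matches
print(len(called))                  # number of matching tags
print(hasattr(called, "get_text"))  # the result list has no get_text()
```

The result is a list of Tag objects, so text methods such as get_text() have to be called on an individual element rather than on the list itself.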

Use soup.find() instead, which returns only the first match:

soup.find(itemprop="name").get_text()

Or take the first element of the result list:

soup(itemprop="name")[0].get_text()
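
Since the tag in the question contains a newline and leading spaces, it may also help to strip the whitespace and to guard against a missing tag. The following is a minimal sketch under the same assumptions as the question's markup, with an inline HTML string and the "html.parser" backend standing in for test.html; the variable names are illustrative.

```python
from bs4 import BeautifulSoup

# Inline stand-in for test.html (an assumption, so the snippet runs on its own).
html = '<span itemprop="name">\n                    Blabla &amp; Bloblo</span>'
soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching Tag, or None when nothing matches.
tag = soup.find(itemprop="name")
name = tag.get_text(strip=True) if tag is not None else None
print(name)  # Blabla & Bloblo

# If several tags may match, extract the text of each one.
names = [t.get_text(strip=True) for t in soup.find_all(itemprop="name")]
print(names)
```

Passing strip=True removes the surrounding whitespace, and the &amp; entity is already decoded to & by the parser.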