How to get the internal text value of an HTML tag using BeautifulSoup bs4?

Question

When using BeautifulSoup (bs4), how do I get the text from an HTML tag? When I run this line:

oname = soup.find("title")

I get the title tag as follows:

<title>page name</title>

Now I want only the inner text, page name, without the tags. How do I do that?

+4

python html beautifulsoup

kibaya Jan 14 '15 at 1:19

1 answer

Padraic Cunningham · Accepted Answer · 2015-01-14T01:22:08+0000

Use .text to get the text from a tag:

oname = soup.find("title")
oname.text

Or simply use soup.title.text:

In [4]: from bs4 import BeautifulSoup
In [5]: import requests
In [6]: r = requests.get("http://stackoverflow.com/questions/27934387/how-to-retrieve-information-inside-a-tag-with-python/27934403#27934387")
In [7]: BeautifulSoup(r.content).title.text
Out[7]: u'html - How to Retrieve information inside a tag with python - Stack Overflow'
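For an offline comparison of the ways to read a tag's text, here is a minimal sketch. The inline HTML string and the variable names are made up for illustration, and the parser is passed explicitly as "html.parser" (an assumption; the session above lets bs4 choose one).

```python
from bs4 import BeautifulSoup

# Hypothetical sample document; note the padding spaces inside <title>.
html = (
    "<html><head><title> page name </title></head>"
    "<body><p>Hello <b>world</b></p></body></html>"
)
soup = BeautifulSoup(html, "html.parser")

title_tag = soup.find("title")

# .text returns the tag's text exactly as it appears, whitespace included.
raw_text = title_tag.text

# get_text(strip=True) trims surrounding whitespace from each text piece.
clean_text = title_tag.get_text(strip=True)

# .text also concatenates the text of nested tags.
nested_text = soup.p.get_text()

# find() returns None when no tag matches, so guard before reading text.
missing_tag = soup.find("h1")
heading_text = missing_tag.get_text(strip=True) if missing_tag is not None else ""

print(repr(raw_text), repr(clean_text), repr(nested_text), repr(heading_text))
```

The None check matters in practice: calling .text on the result of a find() that matched nothing raises an AttributeError.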

To open a file using the text as its name, use it like any other string:

with open(oname.text, 'w') as f:
    pass  # write to f here
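Page titles often contain characters that are not valid in file names (such as "/" or "?"), so it may be worth cleaning the text first. The following is only a sketch under that assumption: title_to_filename is a hypothetical helper, not part of bs4, and the replacement rule is one possible choice.

```python
import re
import tempfile
from pathlib import Path


def title_to_filename(title, extension=".txt"):
    """Hypothetical helper: turn tag text into a conservative file name."""
    # Collapse every run of characters other than letters, digits, '_', '.'
    # and '-' into a single underscore, then trim underscores from both ends.
    cleaned = re.sub(r"[^\w.-]+", "_", title.strip()).strip("_")
    # Fall back to a fixed name if nothing usable is left.
    return (cleaned or "untitled") + extension


filename = title_to_filename("a/b: c?")

# Write into a temporary directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / filename
    with open(path, "w", encoding="utf-8") as f:
        f.write("example content\n")
    written = path.read_text(encoding="utf-8")

print(filename)
```

With a real page you would pass oname.text (or soup.title.text) to the helper instead of the literal string.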
