Beautiful Soup: getting the first child
div.children returns an iterator.
    for div in nsoup.find_all(class_='cities'):
        for childdiv in div.find_all('div'):
            print(childdiv.string)  # London, York

The AttributeError was raised because non-tag nodes, such as '\n', are in .children. Just use a proper child selector to find the specific div.
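To make the point about non-tag nodes concrete, here is a minimal sketch. It assumes markup with line breaks between the tags (the `html` string below is illustrative, modelled on the question's data) and uses the built-in `html.parser` so no extra parser is needed.

```python
from bs4 import BeautifulSoup, Tag

# Illustrative markup: the newlines between tags become text nodes.
html = """<div class="cities">
<div id="3232"> London </div>
<div id="131"> York </div>
</div>"""

soup = BeautifulSoup(html, "html.parser")
cities = soup.find(class_="cities")

# .children yields every direct child node, including whitespace strings.
child_types = [type(child).__name__ for child in cities.children]
print(child_types)

# Keeping only Tag instances makes attribute access such as child["id"] safe.
tag_children = [child for child in cities.children if isinstance(child, Tag)]
ids = [child["id"] for child in tag_children]
print(ids)
```

The whitespace strings are NavigableString objects, which have no tag attributes, so indexing them with `['id']` is what fails.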
(Edit) I cannot reproduce your exceptions. This is what I did:
    In [137]: print foo.prettify()
    <div class="cities">
     <div id="3232">
      London
     </div>
     <div id="131">
      York
     </div>
    </div>

    In [138]: for div in foo.find_all(class_ = 'cities'):
       .....:     for childdiv in div.find_all('div'):
       .....:         print childdiv.string
       .....:
    London
    York

    In [139]: for div in foo.find_all(class_ = 'cities'):
       .....:     for childdiv in div.find_all('div'):
       .....:         print childdiv.string, childdiv['id']
       .....:
    London 3232
    York 131

In modern versions of bs4 (certainly bs4 4.7.1+) you have access to the :first-child CSS pseudo-class selector. Nice and descriptive.
    from bs4 import BeautifulSoup as bs

    html = '''
    <div class="cities">
        <div id="3232"> London </div>
        <div id="131"> York </div>
    </div>
    '''
    soup = bs(html, 'lxml')  # or 'html.parser'
    first_children = [i.text for i in soup.select('.cities div:first-child')]
    print(first_children)

The currently accepted answer gets all the cities, whereas the question asked only for the first.
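When a single element is wanted rather than a list, `select_one` can be combined with the same pseudo-class. This is a small sketch under the same assumed markup. The `>` child combinator is an addition here that restricts the match to direct children of `.cities`.

```python
from bs4 import BeautifulSoup

# Illustrative markup, same shape as the example above.
html = """
<div class="cities">
  <div id="3232"> London </div>
  <div id="131"> York </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# select_one returns the first matching Tag, or None if nothing matches.
first_city = soup.select_one(".cities > div:first-child")

if first_city is not None:
    first_city_name = first_city.get_text(strip=True)
    first_city_id = first_city["id"]
else:
    first_city_name = first_city_id = None

print(first_city_name, first_city_id)
```

Because `:first-child` only considers element siblings, whitespace between the tags does not affect the result.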
If you only need the first child, you can take advantage of .children returning an iterator rather than a list. Remember that an iterator generates its items on the fly, and since we only need its first element, we never need to generate the other city elements (thus saving time).
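    for div in nsoup.find_all(class_='cities'):
        # next() pulls only the first item from the iterator; None is the default if it is empty.
        # Note: if whitespace precedes the first tag, first_child is that text node, not the tag.
        first_child = next(div.children, None)
        if first_child is not None:
            print(first_child.string.strip())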
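If the markup contains whitespace between tags, the first item of `.children` is a text node rather than a `div`. One possible variation, sketched here under that assumption, keeps the lazy behaviour but skips non-tag nodes with a generator expression. The `html` string and the `first_names` list are illustrative names, not part of the original answers.

```python
from bs4 import BeautifulSoup, Tag

# Illustrative markup with whitespace before the first child tag.
html = """
<div class="cities">
  <div id="3232"> London </div>
  <div id="131"> York </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

first_names = []
for div in soup.find_all(class_="cities"):
    # The generator is consumed lazily: iteration stops at the first Tag found.
    first_tag = next(
        (child for child in div.children if isinstance(child, Tag)),
        None,
    )
    if first_tag is not None:
        first_names.append(first_tag.get_text(strip=True))

print(first_names)
```

The generator expression still stops as soon as a tag is found, so the remaining children are never visited.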