
Beautiful Soup: getting the first child

How can I get the first child?

    <div class="cities">
      <div id="3232"> London </div>
      <div id="131"> York </div>
    </div>

How can I get London?

    for div in nsoup.find_all(class_='cities'):
        print(div.children.contents)

AttributeError: 'listiterator' object has no attribute 'contents'

3 answers

div.children returns an iterator.

    for div in nsoup.find_all(class_='cities'):
        for childdiv in div.find_all('div'):
            print(childdiv.string)  # London, York

The AttributeError was raised because .contents was called on the iterator itself, which has no such attribute. Note also that .children contains non-tag nodes, such as '\n'. Just use the right child selector to find the specific div.
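To illustrate the point about non-tag nodes, here is a minimal sketch that lists the direct children of the cities div and keeps only the real tags. It assumes the question's HTML and the built-in 'html.parser'; the variable names (cities, child_tags, first_city) are made up for the example.

```python
from bs4 import BeautifulSoup, Tag

html = '<div class="cities"> <div id="3232"> London </div> <div id="131"> York </div> </div>'
soup = BeautifulSoup(html, "html.parser")
cities = soup.find(class_="cities")

# .children yields every direct child node, including whitespace-only strings.
children = list(cities.children)

# Keep only real tags, dropping the text nodes between them.
child_tags = [child for child in children if isinstance(child, Tag)]

# The first remaining tag is the first child element.
first_city = child_tags[0].get_text(strip=True)
print(first_city)
```

Filtering on Tag is one way to get past the whitespace strings; find_all('div') as in the answer above avoids them as well.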

(more edits) I cannot reproduce your exception. Here is what I did:

    In [137]: print foo.prettify()
    <div class="cities">
     <div id="3232">
      London
     </div>
     <div id="131">
      York
     </div>
    </div>

    In [138]: for div in foo.find_all(class_ = 'cities'):
       .....:     for childdiv in div.find_all('div'):
       .....:         print childdiv.string
       .....:
    London
    York

    In [139]: for div in foo.find_all(class_ = 'cities'):
       .....:     for childdiv in div.find_all('div'):
       .....:         print childdiv.string, childdiv['id']
       .....:
    London 3232
    York 131

In modern versions of bs4 (certainly bs4 4.7.1+) you have access to the :first-child CSS pseudo-selector. Nice and descriptive.

    from bs4 import BeautifulSoup as bs

    html = '''
    <div class="cities">
      <div id="3232"> London </div>
      <div id="131"> York </div>
    </div>
    '''
    soup = bs(html, 'lxml')  # or 'html.parser'
    first_children = [i.text for i in soup.select('.cities div:first-child')]
    print(first_children)
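If only a single element is wanted rather than a list, select_one with the same pseudo-selector may be a little more direct. The following is a sketch under the same assumptions (the question's HTML, bs4 4.7+ so that CSS selectors are backed by soupsieve); the `>` child combinator and the variable names are additions for the example, not part of the answer above.

```python
from bs4 import BeautifulSoup

html = '''
<div class="cities">
  <div id="3232"> London </div>
  <div id="131"> York </div>
</div>
'''
soup = BeautifulSoup(html, "html.parser")

# select_one returns the first match, or None when nothing matches.
# '>' restricts the match to direct children of the .cities element.
first_city_tag = soup.select_one(".cities > div:first-child")

first_city = first_city_tag.get_text(strip=True) if first_city_tag is not None else None
print(first_city)
```

CSS selectors only consider element nodes, so the whitespace between the divs does not get in the way here.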

The currently accepted answer gets all the cities, whereas the question asked only for the first.

If you only need the first child, you can take advantage of .children returning an iterator rather than a list. An iterator hands out its items one at a time, and since we only need the first one, we can stop there without visiting the remaining cities.

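    from bs4 import Tag

    for div in nsoup.find_all(class_='cities'):
        # .children also yields whitespace strings, so skip anything that is not a Tag
        first_child = next((child for child in div.children if isinstance(child, Tag)), None)
        if first_child is not None:
            print(first_child.string.strip())  # London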
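A possible alternative, sketched here with the question's HTML, is to let find do the work: passing True as the name matches any tag, and recursive=False limits the search to direct children. The first_cities list is only there to collect the results for the example.

```python
from bs4 import BeautifulSoup

html = '<div class="cities"> <div id="3232"> London </div> <div id="131"> York </div> </div>'
soup = BeautifulSoup(html, "html.parser")

first_cities = []
for div in soup.find_all(class_="cities"):
    # find(True, recursive=False) returns the first direct child that is a tag,
    # skipping text nodes, or None if the element has no child tags.
    first_tag = div.find(True, recursive=False)
    if first_tag is not None:
        first_cities.append(first_tag.get_text(strip=True))

print(first_cities)
```

Like next() on .children, find stops at the first match instead of collecting every child.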