Get text using the BeautifulSoup CSS selector

HTML example

<h2 id="name"> ABC <span class="numbers">123</span> <span class="lower">abc</span> </h2> 

I can get numbers with something like:

 soup.select('#name > span.numbers')[0].text 

How can I get the ABC text using BeautifulSoup and the select function?

And what about this case?

 <div id="name"> <div id="numbers">123</div> ABC </div> 
1 answer

In the first case, get the previous sibling:

 soup.select_one('#name > span.numbers').previous_sibling 

In the second case, get the next sibling:

 soup.select_one('#name > #numbers').next_sibling 

Note that in the second snippet numbers is an id value rather than a class, and the tag is a div instead of a span, so I adjusted the CSS selector accordingly.
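
As a minimal sketch of the two sibling lookups above, here they are run against the two snippets from the question. The variable names and the choice of the built-in html.parser are my assumptions, not something from the question. Both siblings are text nodes that include the surrounding whitespace, so the sketch strips them.

```python
from bs4 import BeautifulSoup

# The two snippets from the question.
FIRST_HTML = '<h2 id="name"> ABC <span class="numbers">123</span> <span class="lower">abc</span> </h2>'
SECOND_HTML = '<div id="name"> <div id="numbers">123</div> ABC </div>'

# Case 1: the text sits right before the span, so take the previous sibling.
first_soup = BeautifulSoup(FIRST_HTML, "html.parser")
first_text = first_soup.select_one("#name > span.numbers").previous_sibling.strip()

# Case 2: the text sits right after the inner div, so take the next sibling.
second_soup = BeautifulSoup(SECOND_HTML, "html.parser")
second_text = second_soup.select_one("#name > #numbers").next_sibling.strip()

print(first_text)   # ABC
print(second_text)  # ABC
```

If the markup changes so that another tag sits between the text and the selected element, the sibling will be that tag instead of the text, so this only holds for layouts like the two shown.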


To cover both cases, you can go to the parent tag and find the non-empty node text in non-recursive mode:

 parent = soup.select_one('#name > .numbers,#numbers').parent
 print(parent.find(text=lambda text: text and text.strip(), recursive=False).strip())

Note the change in the selector: it now matches either the class numbers or the id numbers.

That said, I suspect this universal solution will not be reliable enough, since I do not know what your real source data looks like.
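
For what it is worth, the parent-based approach could be wrapped up roughly like this. This is only a sketch: direct_text is a hypothetical helper name, html.parser is an assumed parser choice, and I use string=, which is the newer name for the text= argument (Beautiful Soup 4.4.0 and later). It returns None instead of raising when nothing matches.

```python
from bs4 import BeautifulSoup


def direct_text(html):
    """Return the first non-blank text node directly under the #name element.

    Hypothetical helper for illustration; returns None if nothing matches.
    """
    soup = BeautifulSoup(html, "html.parser")
    # Match either a child of #name with class "numbers" or any element with id "numbers".
    marker = soup.select_one("#name > .numbers, #numbers")
    if marker is None:
        return None
    # recursive=False limits the search to direct children of the parent,
    # so text nested inside the child tags is ignored.
    node = marker.parent.find(string=lambda s: s and s.strip(), recursive=False)
    return node.strip() if node is not None else None


FIRST_HTML = '<h2 id="name"> ABC <span class="numbers">123</span> <span class="lower">abc</span> </h2>'
SECOND_HTML = '<div id="name"> <div id="numbers">123</div> ABC </div>'

print(direct_text(FIRST_HTML))   # ABC
print(direct_text(SECOND_HTML))  # ABC
```

It still picks only the first non-blank direct text node, so a parent with several separate text fragments would need a different rule.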
