find_all with camelCase tag names in BeautifulSoup 4

I am trying to scrape an XML file with BeautifulSoup 4.4.0. The file has tag names in camelCase, and find_all does not seem to find them. Code example:

    from bs4 import BeautifulSoup

    # Lowercase tag name: found as expected
    xml = """
    <hello>
    world
    </hello>
    """
    soup = BeautifulSoup(xml, "lxml")
    for x in soup.find_all("hello"):
        print(x)

    # camelCase tag name: nothing is printed
    xml2 = """
    <helloWorld>
    :-)
    </helloWorld>
    """
    soup = BeautifulSoup(xml2, "lxml")
    for x in soup.find_all("helloWorld"):
        print(x)

The output I get is:

    $ python soup_test.py
    <hello>
    world
    </hello>

What is the correct way to search for camelCase or uppercase tag names?

1 answer

For any case-sensitive parsing with BeautifulSoup, you need to parse in "xml" mode. The "lxml" mode you are using is an HTML parser, and HTML tag names are case-insensitive, so the parser converts them to lowercase: <helloWorld> is stored as helloworld, and find_all("helloWorld") matches nothing. In your case, pass "xml" instead of "lxml":

    from bs4 import BeautifulSoup

    xml2 = """
    <helloWorld>
    :-)
    </helloWorld>
    """
    # "xml" selects the XML parser, which preserves tag-name case
    soup = BeautifulSoup(xml2, "xml")
    for x in soup.find_all("helloWorld"):
        print(x)
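
To see why the HTML-mode lookup fails, here is a minimal sketch that uses only the standard library rather than BeautifulSoup itself. The names DOC and TagCollector are made up for this illustration. Python's html.parser.HTMLParser (which BeautifulSoup's "html.parser" mode is built on) reports tag names in lowercase, while an XML parser such as xml.etree.ElementTree keeps them exactly as written:

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Same camelCase document as in the question
DOC = "<helloWorld> :-) </helloWorld>"


class TagCollector(HTMLParser):
    """Hypothetical helper: records every start-tag name the HTML parser reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes the tag name already converted to lowercase
        self.tags.append(tag)


# Parse as HTML: tag names are case-folded
collector = TagCollector()
collector.feed(DOC)
collector.close()
html_tags = collector.tags

# Parse as XML: tag names keep their original case
xml_tags = [ET.fromstring(DOC).tag]

print(html_tags)  # ['helloworld']
print(xml_tags)   # ['helloWorld']
```

This suggests a workaround if you have to stay with an HTML parser for some reason: search for the lowercased name, find_all("helloworld"). For real XML input, the "xml" mode shown above is the better choice.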