I send a GET request to the CareerBuilder API:
import requests

url = "http://api.careerbuilder.com/v1/jobsearch"
payload = {'DeveloperKey': 'MY_DEVLOPER_KEY', 'JobTitle': 'Biologist'}
r = requests.get(url, params=payload)
xml = r.text
It returns XML that looks like this. However, I am having trouble parsing it.
Using lxml:
>>> from lxml import etree
>>> print etree.fromstring(xml)
Traceback (most recent call last):
  File "<pyshell#4>", line 1, in <module>
    print etree.fromstring(xml)
  File "lxml.etree.pyx", line 2992, in lxml.etree.fromstring (src\lxml\lxml.etree.c:62311)
  File "parser.pxi", line 1585, in lxml.etree._parseMemoryDocument (src\lxml\lxml.etree.c:91625)
ValueError: Unicode strings with encoding declaration are not supported.
or ElementTree:
Traceback (most recent call last):
  File "<pyshell#3>", line 1, in <module>
    print ET.fromstring(xml)
  File "C:\Python27\lib\xml\etree\ElementTree.py", line 1301, in XML
    parser.feed(text)
  File "C:\Python27\lib\xml\etree\ElementTree.py", line 1641, in feed
    self._parser.Parse(data, 0)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 3717: ordinal not in range(128)
So, although the XML file starts with
<?xml version="1.0" encoding="UTF-8"?>
I get the impression that it contains characters that are not allowed. How do I parse this file using lxml or ElementTree?
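One detail that may matter here: both errors complain about Unicode, and `r.text` is a decoded Unicode string, whereas `r.content` holds the raw response bytes. Below is a minimal sketch, assuming the parser is handed bytes instead of the decoded text. The sample document is a made-up stand-in for the API response (the `Results` and `JobTitle` element names are hypothetical), and it uses the standard-library ElementTree; `lxml.etree.fromstring` also accepts bytes.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the response body (not real CareerBuilder data).
# It contains a non-breaking space (U+00A0), like the character in the traceback.
text = (
    u'<?xml version="1.0" encoding="UTF-8"?>'
    u'<Results><JobTitle>Field\u00a0Biologist</JobTitle></Results>'
)

# With requests, r.content would already be bytes; here the bytes are
# produced by encoding the text so the example runs without a network call.
raw = text.encode("utf-8")

# Parsing bytes lets the parser apply the encoding declared in the XML header.
root = ET.fromstring(raw)
title = root.find("JobTitle").text
print(repr(title))
```

With the real request, this would presumably correspond to passing `r.content` rather than `r.text` to `fromstring`.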