I am adapting the following code (written with the help of this question), which takes an XML file and its DTD and converts them to another format. For this problem, only the loading section matters:
    from lxml import etree

    xmldoc = open(filename)
    parser = etree.XMLParser(dtd_validation=True, load_dtd=True)
    tree = etree.parse(xmldoc, parser)
This worked fine while it read from the file system, but I am now converting it to run under a web framework, where the two files are uploaded via a form.
Loading the XML file on its own works fine:
    tree = etree.parse(StringIO(data['xml_file']))
But since the DTD is referenced at the top of the XML file, the following statements do not work:
    parser = etree.XMLParser(dtd_validation=True, load_dtd=True)
    tree = etree.parse(StringIO(data['xml_file']), parser)
Following this question, I tried:
    etree.DTD(StringIO(data['dtd_file']))
    tree = etree.parse(StringIO(data['xml_file']))
While the first line does not raise an error, the second fails on the Unicode character entities that the DTD is supposed to define (and does define in the file-system version):
    XMLSyntaxError: Entity 'eacute' not defined, line 4495, column 46
How can I load this DTD so that the parser uses it?
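One direction that might apply here, sketched under assumptions rather than tested against the real files: lxml lets a custom etree.Resolver be registered on a parser, so the DTD reference at the top of the XML could be answered from the uploaded string instead of the file system. The class name UploadedDTDResolver, the file name doc.dtd, and the tiny inline documents are hypothetical stand-ins for data['dtd_file'] and data['xml_file'], and the example assumes Python 3 text strings.

```python
from io import StringIO

from lxml import etree


class UploadedDTDResolver(etree.Resolver):
    """Hypothetical resolver that serves an in-memory DTD to the parser."""

    def __init__(self, dtd_text):
        super().__init__()
        self.dtd_text = dtd_text

    def resolve(self, system_url, public_id, context):
        # Simplification: answer every external lookup with the uploaded DTD.
        # A real resolver would probably check system_url first.
        return self.resolve_string(self.dtd_text, context)


# Stand-ins for the uploaded form fields (assumed to be text).
dtd_text = '<!ELEMENT doc (#PCDATA)>\n<!ENTITY eacute "&#233;">\n'
xml_text = '<!DOCTYPE doc SYSTEM "doc.dtd">\n<doc>caf&eacute;</doc>'

parser = etree.XMLParser(dtd_validation=True, load_dtd=True)
parser.resolvers.add(UploadedDTDResolver(dtd_text))

tree = etree.parse(StringIO(xml_text), parser)
root_text = tree.getroot().text
print(root_text)
```

With the real upload, dtd_text and xml_text would be replaced by data['dtd_file'] and data['xml_file']. Whether those values arrive as text or bytes depends on the framework, so some decoding may be needed first.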