Subclass ElementTree parser to save comments

Question

I am trying to use ElementTree to parse XML files. Since the parser does not retain comments by default, I am using the following code from http://bugs.python.org/issue8277:

import xml.etree.ElementTree as etree

class CommentedTreeBuilder(etree.TreeBuilder):
    """A TreeBuilder subclass that retains comments."""

    def comment(self, data):
        self.start(etree.Comment, {})
        self.data(data)
        self.end(etree.Comment)

parser = etree.XMLParser(target = CommentedTreeBuilder())

The above is in documents.py. I test it with:

import os
import sys
import unittest
import xml.etree.ElementTree as etree

import documents

class TestDocument(unittest.TestCase):

    def setUp(self):
        filename = os.path.join(sys.path[0], "data", "facilities.xml")
        self.doc = etree.parse(filename, parser = documents.parser)

    def testClass(self):
        print("Class is {0}.".format(self.doc.__class__.__name__))
        # The remaining tests are commented out.

if __name__ == '__main__':
    unittest.main()

This barfs with:

Traceback (most recent call last):
File "/home/goncalo/documents/games/ja2/modding/mods/xml-overhaul/src/scripts/../tests/test_documents.py", line 24, in setUp
    self.doc = etree.parse(filename, parser = documents.parser)
File "/usr/lib/python3.3/xml/etree/ElementTree.py", line 1242, in parse
    tree.parse(source, parser)
File "/usr/lib/python3.3/xml/etree/ElementTree.py", line 1726, in parse
    parser.feed(data)
IndexError: pop from empty stack

What am I doing wrong? By the way, the XML in the file is valid (as verified by an independent program) and is encoded in UTF-8.

Note(s):

I am using Python 3.3 on Kubuntu 13.04, in case that matters. I make sure to use "python3" (and not just "python") to run the test scripts.

Edit: here is a sample of the XML file used; it is very small (let's see if I can get the formatting right):

<?xml version="1.0" encoding="utf-8"?>
<!-- changes to facilities.xml by G. Rodrigues: ar overhaul.-->
<SECTORFACILITIES>
    <!-- Drassen -->
    <!-- Small airport -->
    <FACILITY>
        <SectorGrid>B13</SectorGrid>
        <FacilityType>4</FacilityType>
        <ubHidden>0</ubHidden>
    </FACILITY>
</SECTORFACILITIES>

python xml

G. Rodrigues 13 . '13 19:46

Lukas Graf · Accepted Answer · 2013-12-13T21:06:07+0000

With your sample XML, the code works for me in Python 2.7 but fails in Python 3.3 with the traceback you posted.

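The problem seems to be the very first comment, the one that comes after the XML declaration and before the root element. Python 2.7 accepts it (although that comment is not retained), whereas the parser in Python 3.3 chokes on it.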

See Python issue #17901: in Python 3.4, which includes the fix for that issue, I get ParseError: multiple elements on top level instead of pop from empty stack.

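That makes sense: if comments are to be retained in the tree, they have to be handled as nodes. An XML document has only one node at the top level of the tree, so there cannot be a comment before the first "real" element (at least not when the parser is forced to retain comments).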
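As an aside on the same point: Python 3.8 and later (newer than the versions discussed here, so this assumes a newer interpreter than the one in the question) let `TreeBuilder` keep comments itself through the `insert_comments=True` argument. The documentation says comments are inserted only when they appear within the root element, which is in line with the reasoning above. A minimal sketch, with the sample file inlined as bytes:

```python
import xml.etree.ElementTree as etree

# Inline copy of the sample from the question (bytes, so the encoding
# declaration in the first line is handled by the parser).
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<!-- changes to facilities.xml by G. Rodrigues: ar overhaul.-->
<SECTORFACILITIES>
    <!-- Drassen -->
    <!-- Small airport -->
    <FACILITY>
        <SectorGrid>B13</SectorGrid>
        <FacilityType>4</FacilityType>
        <ubHidden>0</ubHidden>
    </FACILITY>
</SECTORFACILITIES>
"""

# Assumption: Python 3.8+. insert_comments does not exist in 2.7, 3.3 or 3.4.
parser = etree.XMLParser(target=etree.TreeBuilder(insert_comments=True))
root = etree.fromstring(SAMPLE, parser=parser)

# Comment nodes use etree.Comment as their tag and hold the text in .text.
inner_comments = [node.text for node in root.iter() if node.tag is etree.Comment]
print(root.tag)
print(inner_comments)
```

The comment in front of the root element is skipped by the builder, so only the two comments inside `SECTORFACILITIES` end up in the tree.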

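So, unfortunately, I think this is your only option: remove the comments outside the root node from your XML documents, either in the original files or by stripping them before parsing.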
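The "strip them before parsing" route could look roughly like the sketch below. The helper name `strip_outer_comments` is hypothetical, and the approach rests on assumptions that hold for the sample file but not for every document: no byte order mark, no DOCTYPE, and nothing other than whitespace and comments after the root element. It is a preprocessing sketch, not a general XML prolog parser.

```python
import re
import xml.etree.ElementTree as etree

_WS = re.compile(rb"\s*")
_COMMENT = re.compile(rb"<!--.*?-->", re.DOTALL)
_PI = re.compile(rb"<\?.*?\?>", re.DOTALL)  # XML declaration or processing instruction


class CommentedTreeBuilder(etree.TreeBuilder):
    """The TreeBuilder subclass from the question; retains comments."""

    def comment(self, data):
        self.start(etree.Comment, {})
        self.data(data)
        self.end(etree.Comment)


def strip_outer_comments(data):
    """Drop comments before and after the root element (hypothetical helper)."""
    head = []
    pos = 0
    while True:
        pos = _WS.match(data, pos).end()      # skip whitespace in the prolog
        comment = _COMMENT.match(data, pos)
        if comment:                           # comment before the root: drop it
            pos = comment.end()
            continue
        pi = _PI.match(data, pos)
        if pi:                                # declaration / PI: keep it
            head.append(pi.group())
            pos = pi.end()
            continue
        break                                 # anything else: the root starts here
    body = data[pos:].rstrip()
    while body.endswith(b"-->"):              # comments trailing the root: drop them
        body = body[: body.rfind(b"<!--")].rstrip()
    return b"\n".join(head + [body])


SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<!-- changes to facilities.xml by G. Rodrigues: ar overhaul.-->
<SECTORFACILITIES>
    <!-- Drassen -->
    <!-- Small airport -->
    <FACILITY>
        <SectorGrid>B13</SectorGrid>
    </FACILITY>
</SECTORFACILITIES>
"""

cleaned = strip_outer_comments(SAMPLE)
parser = etree.XMLParser(target=CommentedTreeBuilder())
root = etree.fromstring(cleaned, parser=parser)
kept = [node.text for node in root if node.tag is etree.Comment]
print(root.tag, kept)
```

For a file on disk, the same helper would be applied to the bytes read from the file before handing them to `etree.fromstring`. The comments inside the root are left untouched, so the subclass from the question still retains them.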
