XML comments at the beginning of the document

my PYTHON xml parser fails if there is a comment at the beginning of the xml file, for example:

<?xml version="1.0" encoding="utf-8"?> <!-- Script version: "1"--> <!-- Date: "07052010"--> <component name="abc"> <pp> .... </pp> </component> 

Is it wrong to post such a comment?

EDIT:

it’s good that it does not throw an error, but the DOM module will fail and will not recognize the child nodes:

 import xml.dom.minidom as dom sub_tree = dom.parse('xyz.xml') for component in sub_tree.firstChild.childNodes: print(component) 

I can not connect child nodes; sub_tree.firstChild.childNodes returns an empty list, but if I delete these 2 comments, I can scroll through the list and read them as usual!

EDIT:

Guys, this simple example works and is enough to understand this. run your python shell and execute this little code above. As soon as it returns nothing and after deleting the comments the node message will appear!

+4
source share
4 answers

If you do this:

 import xml.dom.minidom as dom sub_tree = dom.parse('xyz.xml') print sub_tree.children 

You will see what the problem is:

 >>> print sub_tree.childNodes [<DOM Comment node " Script ve...">, <DOM Comment node " Date: "07...">, <DOM Element: component at 0x7fecf88c>] 

firstChild will obviously take the first child, which is a comment and has no children of its own. You can iterate over children and skip all comment nodes.

Or you can drop the DOM model and use the ElementTree , which is much better to work with. :)

+1
source

It is legal; from XML 1.0 Link :

2.5 Comments

[Definition: comments may appear anywhere in the document outside of another markup; in addition, they may appear in a document type declaration in places permitted by the grammar. They are not part of the document character data; an XML processor MAY, but not necessarily, an application to extract text Comments. To ensure compatibility, the string β€œ-” (double hyphen) MUST NOT be found in the comments.] The entity link parameter MUST NOT be recognized in the comments.

+1
source

To get better answers, show us (a) a small, full Python script, and (b) a small, full XML document that together shows unexpected behavior.

Have you considered using ElementTree?

+1
source

It should be legitimate if the XML declaration is the first line.

0
source

Source: https://habr.com/ru/post/1312542/


All Articles