Manually creating a NodeList node for all document nodes

Question


I am currently building a NodeList of all document nodes (in document order) manually. The XPath expression that yields this NodeList is:

 //. | //@* | //namespace::*

My first attempt at walking the DOM manually and collecting the nodes ( NodeSet is a primitive implementation of NodeList that delegates to a List ):

    private static void walkRecursive(Node cur, NodeSet nodes) {
        nodes.add(cur);

        // Attributes (including xmlns declarations) come right after their owner.
        if (cur.hasAttributes()) {
            NamedNodeMap attrs = cur.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                Node child = attrs.item(i);
                walkRecursive(child, nodes);
            }
        }

        // Only descend into documents and elements.
        int type = cur.getNodeType();
        if (type == Node.ELEMENT_NODE || type == Node.DOCUMENT_NODE) {
            NodeList children = cur.getChildNodes();
            if (children == null) return;
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                walkRecursive(child, nodes);
            }
        }
    }

I start the recursion by calling walkRecursive(doc, nodes) , where doc is an org.w3c.dom.Document and nodes is an (initially empty) NodeSet .

I tested this using this primitive XML document:

    <?xml version="1.0"?>
    <myns:root xmlns:myns="http://www.my.ns/#">
      <myns:element/>
    </myns:root>
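For reference, here is a self-contained sketch of the same kind of walk applied to this document. It is not the code from the question: it collects into a plain java.util.List instead of the NodeSet class, and the names ManualWalk, SAMPLE, walk and collectNames are made up for illustration. It also assumes the document is laid out with line breaks between the elements, which is what produces the whitespace text nodes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

class ManualWalk {
    // Sample document; the line breaks (an assumption) become whitespace text nodes.
    static final String SAMPLE =
        "<?xml version=\"1.0\"?>\n"
        + "<myns:root xmlns:myns=\"http://www.my.ns/#\">\n"
        + "  <myns:element/>\n"
        + "</myns:root>";

    // Adds cur, then its attributes, then recurses into its children.
    static void walk(Node cur, List<Node> out) {
        out.add(cur);
        NamedNodeMap attrs = cur.getAttributes(); // null unless cur is an element
        if (attrs != null) {
            // NamedNodeMap makes no ordering promise; good enough for a sketch.
            for (int i = 0; i < attrs.getLength(); i++) {
                out.add(attrs.item(i));
            }
        }
        for (Node c = cur.getFirstChild(); c != null; c = c.getNextSibling()) {
            walk(c, out);
        }
    }

    // Parses the XML namespace-aware and returns the node names in walk order.
    static List<String> collectNames(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<Node> nodes = new ArrayList<>();
            walk(doc, nodes);
            List<String> names = new ArrayList<>();
            for (Node n : nodes) {
                names.add(n.getNodeName());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (String name : collectNames(SAMPLE)) {
            System.out.println(name);
        }
    }
}
```

The walk only sees what is physically present in the DOM tree, so nothing in it can produce a node for a prefix that was never declared in the document.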

If, for example, I canonicalize both the manually created NodeSet and the NodeList produced by the XPath expression above and compare the results byte for byte, they are equal, so this seems to work fine.

But if I iterate over the two NodeLists and print debugging information ( typeString just generates a string representation of the node type)

    for (int i = 0; i < nodes.getLength(); i++) {
        Node child = nodes.item(i);
        System.out.println("Type: " + typeString(child.getNodeType())
                + " Name:" + child.getNodeName()
                + " Local name: " + child.getLocalName()
                + " NS: " + child.getNamespaceURI());
    }

then I get this output for the NodeList generated by XPath:

    Type: DocumentNode Name:#document Local name: null NS: null
    Type: Element Name:myns:root Local name: root NS: http://www.my.ns/#
    Type: Attribute Name:xmlns:myns Local name: myns NS: http://www.w3.org/2000/xmlns/
    Type: Attribute Name:xmlns:xml Local name: xml NS: http://www.w3.org/2000/xmlns/
    Type: Text Name:#text Local name: null NS: null
    Type: Element Name:myns:element Local name: element NS: http://www.my.ns/#
    Type: Text Name:#text Local name: null NS: null

and this for the manually created NodeList :

    Type: DocumentNode Name:#document Local name: null NS: null
    Type: Element Name:myns:root Local name: root NS: http://www.my.ns/#
    Type: Attribute Name:xmlns:myns Local name: myns NS: http://www.w3.org/2000/xmlns/
    Type: Text Name:#text Local name: null NS: null
    Type: Element Name:myns:element Local name: element NS: http://www.my.ns/#
    Type: Text Name:#text Local name: null NS: null

So, as you can see, in the first example, the NodeList additionally contains a Node for the XML namespace:

 Type: Attribute Name:xmlns:xml Local name: xml NS: http://www.w3.org/2000/xmlns/

Now my questions are:

a) If I interpret xml-names11 correctly, I do not need an xmlns:xml declaration:

The prefix xml is by definition bound to the namespace name http://www.w3.org/XML/1998/namespace . It MAY, but need not, be declared, and MUST NOT be undeclared or bound to any other namespace name. Other prefixes MUST NOT be bound to this namespace name, and it MUST NOT be declared as the default namespace.

Am I right? (at least c) hints in that direction)

b) But then why does the XPath evaluation add it anyway? Shouldn't it include only what was there in the first place, instead of automatically adding things?

c) This could cause problems with XML canonicalization, although it shouldn't, since xml namespace declarations are supposed to be omitted during canonicalization. Does anyone know of (Java) implementations that get this wrong?

Edit:

Here is the code I used to evaluate the XPath expression whose result contains the "xml" namespace node:

    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    dbf.setNamespaceAware(true);
    dbf.setValidating(false);
    InputStream in = ...;
    try {
        Document doc = dbf.newDocumentBuilder().parse(in);
        XPathFactory fac = XPathFactory.newInstance();
        XPath xp = fac.newXPath();
        XPathExpression exp = xp.compile("//. | //@* | //namespace::*");
        NodeList nodes = (NodeList) exp.evaluate(doc, XPathConstants.NODESET);
    } finally {
        in.close();
    }
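A self-contained variant of this snippet may make it easier to reproduce the observation. The sketch below reads the sample document from a string instead of an InputStream and returns the node names; XPathNodeNames, SAMPLE and evaluate are illustrative names, not part of any library. Whether, and how often, an xmlns:xml entry shows up in the result depends on the XPath implementation that XPathFactory.newInstance() picks, so treat the printed list as something to inspect rather than as a guaranteed result.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class XPathNodeNames {
    // Sample document, laid out with line breaks (an assumption).
    static final String SAMPLE =
        "<?xml version=\"1.0\"?>\n"
        + "<myns:root xmlns:myns=\"http://www.my.ns/#\">\n"
        + "  <myns:element/>\n"
        + "</myns:root>";

    // Evaluates the expression against the parsed document and
    // returns the names of all nodes in the resulting node set.
    static List<String> evaluate(String xml, String expression) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            dbf.setValidating(false);
            Document doc = dbf.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getNodeName());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Same expression as above: all nodes, attributes and namespace nodes.
        for (String name : evaluate(SAMPLE, "//. | //@* | //namespace::*")) {
            System.out.println(name);
        }
    }
}
```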


java dom xml xpath canonicalization

emboss Aug 9 '11 at 1:55


1 answer

forty-two · Accepted Answer · 2011-08-24T21:45:47+0000

Since you can write

    <myns:root xml:space="preserve" xmlns:myns="http://www.my.ns/#">
      <myns:element/>
    </myns:root>

without declaring the "xml" prefix, the binding must be implicit. It is therefore correct for the //namespace::* location step to include a namespace node for this implicit declaration.
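The implicit binding can be checked with the standard DOM API. In this minimal sketch (ImplicitXmlPrefix, SAMPLE, xmlSpaceNamespace and declaresXmlPrefix are illustrative names), a namespace-aware parser resolves xml:space to the XML namespace even though the root element carries no xmlns:xml attribute.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Element;

class ImplicitXmlPrefix {
    // The document from the answer: xml:space is used, xmlns:xml is never declared.
    static final String SAMPLE =
        "<myns:root xml:space=\"preserve\" xmlns:myns=\"http://www.my.ns/#\">"
        + "<myns:element/>"
        + "</myns:root>";

    // Parses namespace-aware and returns the document element.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            return dbf.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Namespace URI the parser assigned to the xml:space attribute, or null if absent.
    static String xmlSpaceNamespace(String xml) {
        Attr space = parseRoot(xml).getAttributeNodeNS(XMLConstants.XML_NS_URI, "space");
        return space == null ? null : space.getNamespaceURI();
    }

    // True only if the root element physically carries an xmlns:xml attribute.
    static boolean declaresXmlPrefix(String xml) {
        return parseRoot(xml).hasAttribute("xmlns:xml");
    }

    public static void main(String[] args) {
        System.out.println("xml:space namespace: " + xmlSpaceNamespace(SAMPLE));
        System.out.println("xmlns:xml declared:  " + declaresXmlPrefix(SAMPLE));
    }
}
```

A walk over the DOM attributes therefore never encounters the xml binding, while an XPath engine that models namespace nodes can still report it.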

So,

a) you are wrong, you need it (well, depending on the purpose of your code)

b) see above

c) no, but I have seen other namespace corner cases where things went wrong (e.g. "Problem with converting org.dom4j.Document to org.w3c.dom.Document and XML Signature")
