What is xml normalization?

Question

What is xml normalization?

Possible duplicate:
What does the Java Node normalize method do?

What is XML normalization? I found the following in the Javadoc, but I cannot understand it. Can anyone help?

public void normalize()

Puts all Text nodes in the full depth of the sub-tree underneath this Node, including attribute nodes, into a "normal" form where only structure (e.g., elements, comments, processing instructions, CDATA sections, and entity references) separates Text nodes, i.e., there are neither adjacent Text nodes nor empty Text nodes. This can be used to ensure that the DOM view of a document is the same as if it were saved and re-loaded, and is useful when operations (such as XPointer [XPointer] lookups) that depend on a particular document tree structure are to be used. If the parameter "normalize-characters" of the DOMConfiguration object attached to the Node.ownerDocument is true, this method will also fully normalize the characters of the Text nodes. Note: In cases where the document contains CDATASections, the normalize operation alone may not be sufficient, since XPointers do not differentiate between Text nodes and CDATASection nodes. Since: DOM Level 3


java xml terminology normalization

akshay Jul 14 '11 at 14:05


2 answers

Ed Staub · Answer 1 · 2011-07-14T14:17:50+0000

Parsers often return "surprising" text nodes, where text is split across several nodes or, more rarely, where a text node is empty. This is a side effect of parsers being optimized for maximum performance. It can happen at ignorable whitespace, at buffer boundaries, or anywhere else that was simply convenient for the parser.

normalize() gets rid of all these surprises by merging adjacent text nodes and removing empty ones.
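As a rough illustration of the merging and removal described above, the following sketch builds a fragmented element by hand instead of relying on a parser to produce one. The class name NormalizeDemo, the helper buildFragmented, and the element name greeting are made up for this example; only the javax.xml.parsers and org.w3c.dom calls come from the standard API.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class NormalizeDemo {
    // Hypothetical helper: returns a <greeting> element whose content is
    // deliberately split into three Text children: "Hello, ", "" and "world".
    static Element buildFragmented() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element root = doc.createElement("greeting");
        doc.appendChild(root);
        root.appendChild(doc.createTextNode("Hello, "));
        root.appendChild(doc.createTextNode(""));      // empty text node
        root.appendChild(doc.createTextNode("world")); // adjacent text node
        return root;
    }

    public static void main(String[] args) throws Exception {
        Element root = buildFragmented();
        System.out.println("children before: " + root.getChildNodes().getLength());

        // Merge adjacent Text nodes and drop empty ones in the whole subtree.
        root.normalize();

        System.out.println("children after:  " + root.getChildNodes().getLength());
        System.out.println("text: " + root.getFirstChild().getNodeValue());
    }
}
```

After normalize(), code that reads only the first child of the element sees the complete text rather than just the first fragment, which is usually the practical reason to call it right after parsing.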

Michael Borgwardt · Answer 2 · 2011-07-14T14:16:48+0000

The API doc explains this in quite some detail, so I am not sure what else there is to explain. Basically, the method converts the DOM subtree starting at this node into a "standard format" by merging adjacent text nodes, eliminating empty text nodes and, optionally, also normalizing characters that are Unicode composites.
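To give a feel for what normalizing "Unicode composites" can mean, here is a small sketch that uses java.text.Normalizer directly rather than the DOM. This is an assumption made for illustration: support for setting "normalize-characters" to true is optional in DOM Level 3, so a given parser may not perform this step at all. The class name CharNormalizeDemo and the helper compose are invented for the example.

```java
import java.text.Normalizer;

class CharNormalizeDemo {
    // Hypothetical helper: composes base letters and combining marks into
    // precomposed characters where Unicode defines them (form NFC).
    static String compose(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        // 'e' followed by the combining acute accent U+0301: two chars.
        String decomposed = "e\u0301";
        String composed = compose(decomposed);

        System.out.println("length before: " + decomposed.length());
        System.out.println("length after:  " + composed.length());
        // The composed form is the single code point U+00E9.
        System.out.println("equals U+00E9: " + composed.equals("\u00e9"));
    }
}
```

Both strings render as the same accented letter, but they compare as different until one of them is normalized, which is why a DOM implementation may offer this as an optional extra step.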
