Parse Ampersand in XML with Java DOM XML API

I am trying to parse an XML document with a Java DOM (not SAX) API. Whenever the parser encounters an ampersand (&) while parsing node text, it throws an error. I suppose this can be resolved by 1) escaping, 2) encoding, or 3) using another parser.

I am reading an XML document that I have no control over, so I can’t pinpoint where the ampersand appears in the document every time I read it.

The answers I saw to similar questions recommended replacing the entity while parsing the XML, but I'm not sure how I can do this, since the parser fails as soon as it encounters the ampersand.

Any help would be appreciated.

+5
2 answers

As noted, the XML is not well-formed (oops!): every occurrence of & in XML (except the one that begins an entity or character reference) must be encoded as &amp;.
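To see the rule in action, here is a minimal sketch that feeds the standard JAXP DOM parser one string with a bare & and one with &amp;. The class name `AmpersandDemo`, the helper `isWellFormed`, and the sample strings are made up for illustration and are not from the question.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class AmpersandDemo {

    // Hypothetical helper: true if the DOM parser accepts the text, false if it rejects it.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors; setting it avoids the default stderr message.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // A bare '&' is reported as a fatal parse error.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<name>Tom & Jerry</name>"));     // bare ampersand
        System.out.println(isWellFormed("<name>Tom &amp; Jerry</name>")); // escaped ampersand
    }
}
```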

Some solutions (which are basically also described in the post!):

  • Fix the XML (at the source, or in a pre-processing hack step), or;
  • Parse it with a “suitable” tool (for example, a “forgiving” HTML parser)
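The first option (fixing the XML in a pre-processing step) could be sketched roughly as follows: read the document as text, escape every & that does not already start a predefined entity or a numeric character reference, then hand the result to the DOM parser. The class `AmpersandFixer`, its method names, and the regular expression are assumptions made for illustration. The sketch also assumes the document has no CDATA sections and no DTD-declared custom entities, since an & inside those would be escaped incorrectly.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class AmpersandFixer {

    // Matches '&' unless it begins one of the five predefined entities
    // or a decimal/hexadecimal character reference.
    private static final Pattern BARE_AMP =
            Pattern.compile("&(?!(?:amp|lt|gt|quot|apos|#[0-9]+|#x[0-9a-fA-F]+);)");

    // Hypothetical helper: escape bare ampersands in the raw XML text.
    static String escapeBareAmpersands(String rawXml) {
        return BARE_AMP.matcher(rawXml).replaceAll("&amp;");
    }

    // Hypothetical helper: repair the text first, then build the DOM tree.
    static Document parse(String rawXml) {
        try {
            String fixed = escapeBareAmpersands(rawXml);
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(fixed)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("XML is still not well-formed", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<item>Tom & Jerry &lt;3 &amp; more</item>");
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

This is a text-level workaround rather than a real fix: it only handles stray ampersands, and any other well-formedness problem in the input will still make the parser fail.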

"" - . - DOM: , & ( ), " ", & . , XML ...

.

+3

" XML-, ".

, -XML-. , , , XML , -, XML.

XML , XML . , , . XML , .

+2
