JAXB unmarshalling fetches external DTDs when using the default SAX parser?

So, in my current project, I'm using the JAXB RI with the default Java parser from the Sun JRE (which I believe is Xerces) to unmarshal arbitrary XML.

First, I use XJC to compile an XSD of the following form:

    <?xml version="1.0" encoding="utf-8" ?>
    <xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified"
               xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <xs:element name="foobar">
        ...
      </xs:element>
    </xs:schema>

In the "good case" everything works as designed. That is, if I pass in XML that conforms to this schema, JAXB correctly unmarshals it into an object tree.

The problem occurs when I pass in XML with an external DTD reference, for example:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE foobar SYSTEM "http://blahblahblah/foobar.dtd">
    <foobar></foobar>

When unmarshalling something like this, the SAX parser tries to load the remote entity ("http://somehost/foobar.dtd"), despite the fact that this fragment clearly does not conform to the schema that I compiled earlier with XJC.

To get around this behavior, since I know that any valid XML (according to the compiled XSD) will never require loading a remote entity, I have to define my own EntityResolver that blocks the loading of all remote entities. So instead of doing something like:

 MyClass foo = (MyClass) myJAXBContext.createUnmarshaller().unmarshal(myReader); 

I am forced to do this:

    XMLReader myXMLReader = mySAXParser.getXMLReader();
    myXMLReader.setEntityResolver(myCustomEntityResolver);
    SAXSource mySAXSource = new SAXSource(myXMLReader, new InputSource(myReader));
    MyClass foo = (MyClass) myJAXBContext.createUnmarshaller().unmarshal(mySAXSource);
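
For illustration, a resolver of this kind might look like the minimal sketch below. It uses only the SAX classes that ship with the JDK, so no JAXB call appears in it. The class name `BlockingResolverDemo`, the constant `BLOCK_ALL`, and the helper `rootElementName` are hypothetical names made up for this example. The resolver answers every external entity request with an empty stream, so the parser should never open a network connection.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class BlockingResolverDemo {

    // Hypothetical resolver: every external entity (including the external
    // DTD subset) is replaced by an empty stream instead of being fetched.
    static final EntityResolver BLOCK_ALL =
            (publicId, systemId) -> new InputSource(new StringReader(""));

    // Parses the given XML with the blocking resolver installed and returns
    // the local name of the root element.
    static String rootElementName(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setEntityResolver(BLOCK_ALL);

            final String[] root = new String[1];
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    if (root[0] == null) {
                        root[0] = localName; // remember only the first element
                    }
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
            return root[0];
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<!DOCTYPE foobar SYSTEM \"http://blahblahblah/foobar.dtd\">"
                + "<foobar></foobar>";
        System.out.println(rootElementName(xml));
    }
}
```

The same `XMLReader` could then be wrapped in a `SAXSource` and handed to the unmarshaller, as in the snippet above.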

So my question, finally, is:

When unmarshalling with JAXB, shouldn't the loading of remote entities by the SAX parser be short-circuited automatically when the XML in question can be recognized as invalid without loading those remote entities?

Also, doesn't this look like a security issue? Given that JAX-WS relies on JAXB under the hood, it seems that I could pass specially crafted XML to any JAX-WS web service and force the WS host to fetch arbitrary URLs.

I'm a relative newbie to this, so there is probably something I'm missing. Please let me know if so!

1 answer

Well thought out question, it deserves an answer :)

Some notes:

  • The JAXB runtime has no dependency on XML Schema. It uses a SAX parser to generate a stream of SAX events, which it binds onto the object model. This object model can be written by hand or generated from a schema using XJC, but the binding and the runtime are entirely separate from the schema. So you may know that good XML input conforms to the schema at runtime, but JAXB does not.
  • Forcing the download of a remote DTD reference is not a security hole. If there is a real DTD at the other end, the worst case is that the document fails validation. If it is not a real DTD, it will be ignored.
  • DTDs are considered deprecated, so there is no direct support for them in the high-level JAXB API. If you need an EntityResolver, you have to dig down into the SAX API, which you have already done.
  • If your class model was generated from an XML schema, you should consider validating against it at runtime using SchemaFactory and Unmarshaller.setSchema(). This instructs Xerces to validate the SAX events against the schema before passing them to JAXB. It does not stop the DTD from being fetched, but it adds a layer of safety, because you know the data is good.
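
The last point could be sketched roughly as follows. To keep the example runnable on a plain JDK, it stops short of JAXB itself: it builds a `javax.xml.validation.Schema` with `SchemaFactory` and checks documents with a `Validator`. With JAXB on the classpath, the same `Schema` object would be the argument to `Unmarshaller.setSchema()`. The XSD content is a stand-in, since the question elides the real definition of `foobar`, and `SchemaCheckDemo`, `loadSchema`, and `isValid` are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class SchemaCheckDemo {

    // Stand-in schema (assumption): "foobar" is declared as a plain string
    // because the real element definition is not shown in the question.
    static final String XSD =
            "<xs:schema attributeFormDefault=\"unqualified\""
            + " elementFormDefault=\"qualified\""
            + " xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
            + "<xs:element name=\"foobar\" type=\"xs:string\"/>"
            + "</xs:schema>";

    // Compiles the XSD into a reusable Schema object.
    static Schema loadSchema() {
        try {
            SchemaFactory factory =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema could not be compiled", e);
        }
    }

    // Returns true if the document validates against the schema.
    static boolean isValid(Schema schema, String xml) {
        try {
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // validation error or malformed XML
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Schema schema = loadSchema();
        // With JAXB available: unmarshaller.setSchema(schema);
        System.out.println(isValid(schema, "<foobar>hello</foobar>"));
        System.out.println(isValid(schema, "<other/>"));
    }
}
```

A `Schema` is thread-safe and can be compiled once and shared, whereas a `Validator` should be created per use.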