JAXB error interpretation: Invalid byte 1 of 1-byte UTF-8 sequence

We parse an XML document using JAXB and get this error:

[org.xml.sax.SAXParseException: Invalid byte 1 of 1-byte UTF-8 sequence.]
at javax.xml.bind.helpers.AbstractUnmarshallerImpl.createUnmarshalException(AbstractUnmarshallerImpl.java:315)

What exactly does this mean and how can we solve it?

The code we run is:

jaxbContext = JAXBContext.newInstance(Results.class);
Unmarshaller unmarshaller = jaxbContext.createUnmarshaller();
unmarshaller.setSchema(getSchema());
results = (Results) unmarshaller.unmarshal(new FileInputStream(inputFile));

Update

The problem is due to this funny character in the XML file: ¿

Why can this cause such a problem?

Update 2

There are two of these strange characters in the file. They are around the middle of the file. Please note that the file is created based on the data in the database, and these strange characters somehow got into the database.

Update 3

Here is the complete XML snippet:

<Description><![CDATA[Mt. Belvieu ¿ Texas]]></Description>

Update 4

Please note that the file has no <?xml ...?> header.

The hex value of the special character is BF.

+5
3

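So the problem is that JAXB treats XML files without a <?xml ...?> header as UTF-8, while your file uses some other encoding (probably ISO-8859-1 or Windows-1252, if the 0xBF byte is meant to be ¿).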
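To see why a single 0xBF byte trips a UTF-8 parser, here is a minimal sketch using only the JDK. The class and method names are made up for illustration. In UTF-8, 0xBF can only be a continuation byte, so on its own it is malformed; the character ¿ (U+00BF) is written as the two bytes 0xC2 0xBF.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class LoneByteDemo {
    /** Returns true if the bytes are well-formed UTF-8 (strict decoding, no replacement characters). */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] lone = {(byte) 0xBF};                 // the byte found in the file
        byte[] proper = "\u00BF".getBytes(StandardCharsets.UTF_8);

        System.out.println(isValidUtf8(lone));       // false: a continuation byte cannot start a sequence
        System.out.println(isValidUtf8(proper));     // true
        System.out.println(proper.length);           // 2 (0xC2 0xBF)
        // In ISO-8859-1 every byte maps to one character, so 0xBF decodes to U+00BF.
        System.out.println(new String(lone, StandardCharsets.ISO_8859_1).charAt(0) == '\u00BF'); // true
    }
}
```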

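If you can change the producer of the file, add a <?xml ...?> header that states the file's actual encoding, or simply write the file as UTF-8.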
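As a rough illustration of the effect of the header, the sketch below parses the same ISO-8859-1 bytes with and without an encoding declaration. It uses the JDK's built-in DOM parser instead of JAXB so that it runs without extra dependencies; the class and method names are hypothetical, and the sample element is the one from the question.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;

public class DeclaredEncodingDemo {
    static final String BODY =
            "<Description><![CDATA[Mt. Belvieu \u00BF Texas]]></Description>";
    static final String HEADER = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>";

    /** Parses the bytes and returns the root element's text, or null if parsing fails. */
    static String rootText(byte[] xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml))
                    .getDocumentElement().getTextContent();
        } catch (Exception e) {
            return null; // e.g. the "Invalid byte 1 of 1-byte UTF-8 sequence" failure
        }
    }

    public static void main(String[] args) {
        // Both documents are written as ISO-8859-1, so the special character is the single byte 0xBF.
        byte[] withoutHeader = BODY.getBytes(StandardCharsets.ISO_8859_1);
        byte[] withHeader = (HEADER + BODY).getBytes(StandardCharsets.ISO_8859_1);

        System.out.println(rootText(withoutHeader) == null); // true: parser assumes UTF-8 and rejects 0xBF
        System.out.println(rootText(withHeader) != null);    // true: parser honours the declared encoding
    }
}
```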

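If you cannot change the producer, you have to use an InputStreamReader with an explicit encoding, because (unfortunately) JAXB does not let you change its default encoding: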

results = (Results) unmarshaller.unmarshal(
   new InputStreamReader(new FileInputStream(inputFile), "ISO-8859-1")); 

However, this solution is fragile: it fails on input files whose <?xml ...?> header declares a different encoding.
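One way to make a hard-coded encoding a little less brittle is to try strict UTF-8 first and fall back to ISO-8859-1 only when the bytes are not valid UTF-8. This is only a sketch under that assumption: the helper name is invented, the whole file is held in memory, and the fallback charset is a guess that suits the file in the question but not every input.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class FallbackDecode {
    /** Decodes as UTF-8 when the bytes are well-formed UTF-8, otherwise as ISO-8859-1. */
    static String decode(byte[] bytes) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            // Assumption: anything that is not valid UTF-8 is treated as ISO-8859-1.
            return new String(bytes, StandardCharsets.ISO_8859_1);
        }
    }

    public static void main(String[] args) {
        byte[] latin1 = {'A', (byte) 0xBF};          // not valid UTF-8
        byte[] utf8 = {(byte) 0xC2, (byte) 0xBF};    // valid UTF-8 for U+00BF

        System.out.println(decode(latin1).equals("A\u00BF")); // true
        System.out.println(decode(utf8).equals("\u00BF"));    // true
    }
}
```

The decoded text could then be handed to JAXB as a character stream, for example `unmarshaller.unmarshal(new StringReader(FallbackDecode.decode(Files.readAllBytes(inputFile.toPath()))))`. A parser reading from a character stream disregards the encoding named in the <?xml ...?> header, so the decoding done here is what counts.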

+3

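That is probably a "byte order mark" (BOM), a special byte sequence at the start of a UTF file. They are, frankly, a nuisance, and seem especially common when interacting with .net systems.

Try changing your code to use a Reader rather than an InputStream: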
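Whether a file really starts with a BOM is easy to check by looking at its first bytes. The following is a small sketch with a made-up helper name; it covers only the UTF-8 and UTF-16 marks and does not try to distinguish UTF-32.

```java
public class BomCheck {
    /** Returns the encoding implied by a leading BOM, or null if none is recognised. */
    static String detectBom(byte[] head) {
        if (head.length >= 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF) {
            return "UTF-8";
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF) {
            return "UTF-16BE";
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE) {
            return "UTF-16LE";
        }
        return null;
    }

    public static void main(String[] args) {
        byte[] utf8Bom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<'};
        byte[] plain = {'<', 'D', 'e', 's'};

        System.out.println(detectBom(utf8Bom)); // UTF-8
        System.out.println(detectBom(plain));   // null
    }
}
```

Note that a BOM sits at the very start of a file, so a 0xBF byte in the middle of the document would not be explained by it.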

results = (Results) unmarshaller.unmarshal(new FileReader(inputFile));

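A Reader is UTF-aware and may cope better (note that FileReader decodes with the platform default charset). More simply, pass the File directly to the Unmarshaller and let the JAXBContext worry about it: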

results = (Results) unmarshaller.unmarshal(inputFile);
+1

It sounds like your XML is encoded in UTF-16, but the Unmarshaller is not being told about that encoding. With a Marshaller you can set it with marshaller.setProperty(Marshaller.JAXB_ENCODING, "UTF-16");, but since an Unmarshaller is not required to support any properties, I am not sure how to ensure this other than by making sure your XML document has encoding="UTF-16" in its <?xml?> header.
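One option that may help here is org.xml.sax.InputSource, which lets the caller state the encoding of a byte stream through setEncoding, and which Unmarshaller.unmarshal also accepts. The sketch below is an assumption-laden illustration: it uses the JDK's DOM parser in place of JAXB so it runs without extra dependencies, the class and method names are invented, and the sample uses ISO-8859-1 (the encoding suggested by the 0xBF byte in the question) in place of UTF-16.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

public class InputSourceEncodingDemo {
    /** Parses the bytes using the given encoding and returns the root text, or null on failure. */
    static String rootText(byte[] xml, String encoding) {
        try {
            InputSource source = new InputSource(new ByteArrayInputStream(xml));
            source.setEncoding(encoding); // tells the parser how to decode the byte stream
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(source)
                    .getDocumentElement().getTextContent();
        } catch (Exception e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // No <?xml ...?> header; the special character is the single byte 0xBF.
        byte[] xml = "<Description><![CDATA[Mt. Belvieu \u00BF Texas]]></Description>"
                .getBytes(StandardCharsets.ISO_8859_1);

        String text = rootText(xml, "ISO-8859-1");
        System.out.println("Mt. Belvieu \u00BF Texas".equals(text)); // true
    }
}
```

With JAXB the same InputSource would be passed as `unmarshaller.unmarshal(source)`, with "UTF-16" as the encoding if the file really is UTF-16.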

0