No, you cannot use CDATA to embed binary data in an XML file.
In XML 1.0 (XML 1.1 is more permissive, but not in a way that allows raw control characters inside CDATA), the following restrictions apply to CDATA characters:
CData ::= (Char* - (Char* ']]>' Char*))
Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
This means that there are several illegal characters, including:
- XML control characters from 0x00 to 0x1F, except newlines (0x0A), carriage returns (0x0D), and tabs (0x09)
- illegal UTF-8 byte sequences such as 0xFF or the non-canonical (overlong) form 0b1100000x 0b10xxxxxx
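As a rough illustration of the restrictions listed above, here is a minimal Python sketch that tests whether a byte string could sit inside a CDATA section of a UTF-8 encoded XML 1.0 document. The function names `is_xml10_char` and `cdata_safe` are made up for this example; the code point ranges are those of the XML 1.0 Char production.

```python
def is_xml10_char(cp: int) -> bool:
    """Return True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def cdata_safe(data: bytes) -> bool:
    """Return True if data is valid UTF-8 that could appear inside CDATA."""
    try:
        # Python's strict decoder rejects 0xFF and overlong forms such as C0 80.
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    # CDATA content must not contain the terminator or any non-Char code point.
    return "]]>" not in text and all(is_xml10_char(ord(ch)) for ch in text)
```

Arbitrary binary data will almost always fail this check, since any byte from 0x00 to 0x1F (other than tab, LF, and CR) or any broken UTF-8 sequence is enough to reject it.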
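In addition, in ordinary element content outside CDATA:
- "<" is illegal, and ">" must be escaped when it follows "]]"
- "&" usage is restricted (a declared entity reference such as &eacute; is OK, an undeclared one such as &zajdalkdza; is not)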
Thus, CDATA is just a way to avoid escaping "<", ">" and "&" by forbidding only the sequence "]]>" instead. It does not solve the problem of illegal XML, Unicode, and UTF-8 characters, which is the main problem.
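This can be checked with a small sketch using Python's standard `xml.etree.ElementTree` parser (the helper name `parses` is just for this example): markup characters are accepted inside CDATA, while a raw control character is still rejected.

```python
import xml.etree.ElementTree as ET


def parses(xml_text: str) -> bool:
    """Return True if xml_text is accepted by the parser as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# "<", ">" and "&" need no escaping inside a CDATA section.
markup_ok = parses("<a><![CDATA[1 < 2 && 3 > 2]]></a>")

# A raw control character (here U+0001) is not a legal XML 1.0 Char,
# and wrapping it in CDATA does not change that.
control_ok = parses("<a><![CDATA[\x01]]></a>")
```

Here `markup_ok` should come out true and `control_ok` false, which is exactly the gap that makes CDATA unsuitable for binary payloads.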
Solutions:
- Use Base64, which has 33% overhead but is a standard with great support in all programming languages
- Use BaseXML, which has only 20% overhead but still limited implementations
- Do not encode binary data in XML; if possible, transfer it separately
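For the Base64 option, a minimal round trip might look like the sketch below. It uses only the Python standard library; the element name `blob` and the `encoding` attribute are arbitrary choices for this example, not part of any standard.

```python
import base64
import xml.etree.ElementTree as ET

# Sample payload containing every possible byte value (768 bytes in total).
payload = bytes(range(256)) * 3

# Encode to Base64 text, which only uses characters that are legal in XML.
elem = ET.Element("blob", {"encoding": "base64"})
elem.text = base64.b64encode(payload).decode("ascii")
xml_bytes = ET.tostring(elem)

# Parse the document again and decode the text back into the original bytes.
decoded = base64.b64decode(ET.fromstring(xml_bytes).text)

# Size cost of the encoding: 4 output characters for every 3 input bytes.
overhead = len(elem.text) / len(payload) - 1
```

With this payload the encoded text is 1024 characters long for 768 input bytes, which is where the 33% figure comes from.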
KrisWebDev