ColdFusion: Invalid XML control character (hex)

I am trying to create an XML object using <cfxml>. I formatted all the data using XMLFormat(). Some characters are invalid in the XML, such as "»". I added these characters to the XML doctype as follows:

 <!ENTITY raquo "»">

The HTML text is not well formed, but most of it works with my code. Some texts, however, contain control characters, and for those I get the following error:

An invalid XML character was found in the contents of the document element (Unicode: 0x13).

I tried adding the Unicode character to the doctype, and I tried this solution. Neither worked.

3 answers

Here's the actual cfscript code we use to clean our XML. There are two functions: one that cleans the high (non-ASCII) international characters, and one that cleans only the single low ASCII character that was breaking our XML. If you find more offending characters, just extend the filtering rules.

 <cfscript>
 // Replace non-ASCII characters with an ASCII equivalent or a numeric character reference.
 function cleanHighAscii(text){
     var buffer = createObject("java", "java.lang.StringBuffer").init();
     var pattern = createObject("java", "java.util.regex.Pattern").compile(javaCast("string", "[^\x00-\x7F]"));
     var matcher = pattern.Matcher(javaCast("string", text));
     while (matcher.find()){
         var value = matcher.group();
         var asciiValue = asc(value);
         if ((asciiValue == 8220) || (asciiValue == 8221))
             value = """";                  // curly double quotes
         else if ((asciiValue == 8216) || (asciiValue == 8217))
             value = "'";                   // curly single quotes
         else if (asciiValue == 8230)
             value = "...";                 // ellipsis
         else
             value = "&###asciiValue#;";    // numeric character reference, e.g. &#187;
         matcher.AppendReplacement(buffer, javaCast("string", value));
     }
     matcher.AppendTail(buffer);
     return buffer.ToString();
 }

 // Replace the SUB control character (0x1A) with &#26;.
 // Note: XML 1.0 also rejects &#26; as a character reference;
 // use "" as the replacement to drop the character entirely.
 function removeSubAscii(text){
     return rereplaceNoCase(text, "\x1A", "&###26#;", "all");
 }

 function XMLSafe(text){
     text = cleanHighAscii(text);
     text = removeSubAscii(text);
     return text;
 }
 </cfscript>
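For readers who want to run the same loop outside ColdFusion, here is a minimal plain-Java sketch of cleanHighAscii. The class name HighAsciiCleaner is hypothetical, and the use of Matcher.quoteReplacement is an addition not present in the CFML version; it only guards against "$" or "\" ever appearing in a replacement value.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name; a sketch of the cleanHighAscii loop in plain Java.
final class HighAsciiCleaner {
    // Matches any single code point outside the 7-bit ASCII range.
    private static final Pattern NON_ASCII = Pattern.compile("[^\\x00-\\x7F]");

    static String cleanHighAscii(String text) {
        StringBuffer buffer = new StringBuffer();
        Matcher matcher = NON_ASCII.matcher(text);
        while (matcher.find()) {
            int codePoint = matcher.group().codePointAt(0);
            String value;
            if (codePoint == 8220 || codePoint == 8221) {
                value = "\"";                    // curly double quotes
            } else if (codePoint == 8216 || codePoint == 8217) {
                value = "'";                     // curly single quotes
            } else if (codePoint == 8230) {
                value = "...";                   // ellipsis
            } else {
                value = "&#" + codePoint + ";";  // numeric character reference
            }
            // quoteReplacement treats the value literally (an added safeguard).
            matcher.appendReplacement(buffer, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(buffer);
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.println(cleanHighAscii("\u201CHi\u201D \u00BB caf\u00E9\u2026"));
    }
}
```

Like the CFML version, this only rewrites characters above 0x7F; control characters such as 0x13 pass through untouched and still need separate filtering.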

Another possibility for CF10 users is the function encodeForXML():

https://learn.adobe.com/wiki/display/coldfusionen/EncodeForXML

Either use the ESAPI that ships with CF10 directly, or add the ESAPI JARs from the OWASP website (https://www.owasp.org/index.php/ESAPI_Overview) to your older CF version:

 var esapi = createObject("java", "org.owasp.esapi.ESAPI");
 var esapiEncoder = esapi.encoder();
 return esapiEncoder.encodeForXML(text);

Try using &#187; instead of ». For example, this CFML:

 <cfxml variable="x"><?xml version="1.0" encoding="UTF-8"?>
 <!DOCTYPE doc [
     <!ENTITY raquo "&#187;">
 ]>
 <doc>
     Hello, &raquo; !
 </doc>
 </cfxml>
 <cfdump var="#x#">
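To check the entity declaration independently of ColdFusion, the same document can be fed to the JDK's built-in XML parser. This is a minimal sketch under the assumption that the default DocumentBuilderFactory settings are in use; the class and method names (EntityDemo, parseDocText) are hypothetical, and ColdFusion's own parser may be configured differently.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical class name; parses a document that declares &raquo; as &#187;.
final class EntityDemo {
    // Parses the XML string and returns the text content of the root element.
    static String parseDocText(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<!DOCTYPE doc [ <!ENTITY raquo \"&#187;\"> ]>"
                + "<doc>Hello, &raquo; !</doc>";
        // The entity reference is expanded to the character U+00BB.
        System.out.println(parseDocText(xml));
    }
}
```

Because the entity value is a numeric character reference, the declaration stays pure ASCII and does not depend on how the source file is encoded.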

Pass your XML string through this method and it should solve your problem.

It lets only valid XML characters through to the output. If you want to replace the invalid characters with a different character instead of dropping them, you can change the method below to do so.

 public String stripNonValidXMLCharacters(String in) {
     if (in == null || in.isEmpty()) return ""; // Nothing to clean.
     StringBuilder out = new StringBuilder();   // Used to hold the output.
     // Iterate by code point rather than by char, so that characters above
     // U+FFFF (stored as surrogate pairs) are kept instead of being dropped.
     for (int i = 0; i < in.length(); ) {
         int current = in.codePointAt(i);       // The current code point.
         i += Character.charCount(current);
         if ((current == 0x9) || (current == 0xA) || (current == 0xD)
                 || ((current >= 0x20) && (current <= 0xD7FF))
                 || ((current >= 0xE000) && (current <= 0xFFFD))
                 || ((current >= 0x10000) && (current <= 0x10FFFF)))
             out.appendCodePoint(current);
     }
     return out.toString();
 }
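The "replace instead of strip" variant mentioned above could be sketched as follows. The class and method names (XmlCharFilter, isValidXmlChar, replaceInvalidXmlChars) are hypothetical; the ranges are the same ones used in the method above, which correspond to the XML 1.0 Char production.

```java
// Hypothetical class name; a sketch of replacing, rather than dropping, invalid characters.
final class XmlCharFilter {
    // True when the code point is allowed in an XML 1.0 document.
    static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Replaces every invalid code point with the given replacement string.
    // Passing "" as the replacement strips the invalid characters.
    static String replaceInvalidXmlChars(String in, String replacement) {
        if (in == null || in.isEmpty()) return "";
        StringBuilder out = new StringBuilder(in.length());
        for (int i = 0; i < in.length(); ) {
            int cp = in.codePointAt(i);
            i += Character.charCount(cp);
            if (isValidXmlChar(cp)) {
                out.appendCodePoint(cp);
            } else {
                out.append(replacement);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // 0x13 is the control character named in the error message in the question.
        System.out.println(replaceInvalidXmlChars("a\u0013b", "?")); // prints a?b
        System.out.println(replaceInvalidXmlChars("a\u0013b", ""));  // prints ab
    }
}
```

Run the filter on the raw text before it reaches <cfxml>; escaping functions such as XMLFormat() deal with markup characters and are a separate step.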