How to return xml as UTF-8 instead of UTF-16

I am using a routine that serializes <T>. It works, but when the result is loaded in the browser I see a blank page. If I view the page source or open the download in a text editor I can see the XML, but it is in UTF-16, which I think is why the browser displays nothing.

How can I change my serialization routine to return UTF-8 instead of UTF-16?

XML source returned:

    <?xml version="1.0" encoding="utf-16"?>
    <ArrayOfString xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <string>January</string>
      <string>February</string>
      <string>March</string>
      <string>April</string>
      <string>May</string>
      <string>June</string>
      <string>July</string>
      <string>August</string>
      <string>September</string>
      <string>October</string>
      <string>November</string>
      <string>December</string>
      <string />
    </ArrayOfString>

Example of calling the serializer:

    DateTimeFormatInfo dateTimeFormatInfo = new DateTimeFormatInfo();
    var months = dateTimeFormatInfo.MonthNames.ToList();
    string SelectionId = "1234567890";
    return new XmlResult<List<string>>(SelectionId) { Data = months };

Serializer:

    public class XmlResult<T> : ActionResult
    {
        private string filename = DateTime.Now.ToString("ddmmyyyyhhss");

        public T Data { private get; set; }

        public XmlResult(string selectionId = "")
        {
            if (selectionId != "")
            {
                filename = selectionId;
            }
        }

        public override void ExecuteResult(ControllerContext context)
        {
            HttpContextBase httpContextBase = context.HttpContext;
            httpContextBase.Response.Buffer = true;
            httpContextBase.Response.Clear();
            httpContextBase.Response.AddHeader("content-disposition", "attachment; filename=" + filename + ".xml");
            httpContextBase.Response.ContentType = "text/xml";

            using (StringWriter writer = new StringWriter())
            {
                XmlSerializer xml = new XmlSerializer(typeof(T));
                xml.Serialize(writer, Data);
                httpContextBase.Response.Write(writer);
            }
        }
    }
c# xml utf-8 xml-serialization
Sep 08 '14 at 18:30
2 answers

Response encoding

I am not completely familiar with this part of the framework, but according to MSDN you can set the content encoding of the HttpResponse as follows:

 httpContextBase.Response.ContentEncoding = Encoding.UTF8; 

The encoding that XmlSerializer sees

After reading your question again, I see that this is the hard part. The problem is the use of StringWriter. Because .NET strings are always stored as UTF-16, StringWriter reports UTF-16 as its encoding. The XmlSerializer therefore writes the XML declaration as

 <?xml version="1.0" encoding="utf-16"?> 
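This behaviour is easy to reproduce outside of ASP.NET. The following console program is only a minimal sketch; the class name `StringWriterEncodingDemo` and the helper `SerializeWithStringWriter` are made up for illustration and are not part of the question's code. It prints the encoding that a plain StringWriter reports, followed by the XML that the serializer produces for it.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public static class StringWriterEncodingDemo
{
    // Hypothetical helper: serializes any value through a plain StringWriter,
    // the same way the question's ExecuteResult does.
    public static string SerializeWithStringWriter<T>(T data)
    {
        using (var writer = new StringWriter())
        {
            new XmlSerializer(typeof(T)).Serialize(writer, data);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        using (var writer = new StringWriter())
        {
            // StringWriter always reports the encoding of .NET strings.
            Console.WriteLine(writer.Encoding.WebName); // utf-16
        }

        // The declaration in this output takes its encoding from the writer.
        Console.WriteLine(SerializeWithStringWriter("January"));
    }
}
```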

To get around this, you can write to a MemoryStream as follows:

    using (MemoryStream stream = new MemoryStream())
    using (StreamWriter writer = new StreamWriter(stream, Encoding.UTF8))
    {
        XmlSerializer xml = new XmlSerializer(typeof(T));
        xml.Serialize(writer, Data);
        // Flush so that all buffered output reaches the stream before it is copied
        writer.Flush();
        // I am not 100% sure if this can be optimized
        httpContextBase.Response.BinaryWrite(stream.ToArray());
    }

Other approaches

Another edit: I just noticed the SO answer pointed out by jtm001. In short, the solution is to give the XmlSerializer a custom XmlWriter that is configured to use UTF-8 as its encoding.

Athari suggests deriving from StringWriter and advertising the encoding as UTF-8.

As far as I understand, both solutions should work. I think the take-away here is that you will need one kind of boilerplate code or another ...
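For the XmlWriter approach, a minimal sketch could look like the code below. The class name `Utf8XmlWriterSketch` and the method `SerializeToUtf8` are hypothetical names chosen for this example, and the choice to suppress the byte order mark is an assumption rather than something the linked answer requires.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

public static class Utf8XmlWriterSketch
{
    // Hypothetical helper: serializes a value to UTF-8 encoded bytes.
    public static byte[] SerializeToUtf8<T>(T data)
    {
        var settings = new XmlWriterSettings
        {
            // Assumption: UTF-8 without a byte order mark.
            Encoding = new UTF8Encoding(false)
        };

        using (var stream = new MemoryStream())
        {
            // Disposing the XmlWriter flushes everything into the stream.
            using (XmlWriter writer = XmlWriter.Create(stream, settings))
            {
                new XmlSerializer(typeof(T)).Serialize(writer, data);
            }
            return stream.ToArray();
        }
    }

    public static void Main()
    {
        byte[] bytes = SerializeToUtf8(new List<string> { "January", "February" });
        Console.WriteLine(Encoding.UTF8.GetString(bytes));
    }
}
```

In the question's ExecuteResult, the returned bytes could then be passed to httpContextBase.Response.BinaryWrite, as in the MemoryStream example above.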

Sep 08 '14 at 19:36

You can use a StringWriter that forces UTF-8. Here is one way to do this:

    public class Utf8StringWriter : StringWriter
    {
        // Use UTF8 encoding but write no BOM to the wire
        public override Encoding Encoding
        {
            get { return new UTF8Encoding(false); } // in real code I'll cache this encoding.
        }
    }

and then use the Utf8StringWriter in your code:

    using (StringWriter writer = new Utf8StringWriter())
    {
        XmlSerializer xml = new XmlSerializer(typeof(T));
        xml.Serialize(writer, Data);
        httpContextBase.Response.Write(writer);
    }

This answer is inspired by "Serializing an object as XML UTF-8 in .NET".

Sep 08 '14 at 19:46


