XmlWriter Encoding Issues

I have the following code:

    MemoryStream ms = new MemoryStream();
    XmlWriter w = XmlWriter.Create(ms);
    w.WriteStartDocument(true);
    w.WriteStartElement("data");
    w.WriteElementString("child", "myvalue");
    w.WriteEndElement(); // data
    w.Close();
    ms.Close();
    string test = UTF8Encoding.UTF8.GetString(ms.ToArray());

The XML is generated correctly; however, my problem is the first character of the string "test": ï (char #239), which makes it invalid for some XML parsers. Where does this come from? What am I doing wrong?

I know that I can work around the problem simply by skipping the first character, but I would rather find out why it is there than just patch over it.

Thanks!

+6
xml encoding xmlwriter
5 answers

One solution found here: http://www.timvw.be/generating-utf-8-with-systemxmlxmlwriter/

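I was missing this at the top. Passing false to the UTF8Encoding constructor creates a UTF-8 encoding that does not emit a byte order mark (BOM), which is where the stray leading character comes from: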

    XmlWriterSettings xmlWriterSettings = new XmlWriterSettings();
    xmlWriterSettings.Encoding = new UTF8Encoding(false);
    MemoryStream ms = new MemoryStream();
    XmlWriter w = XmlWriter.Create(ms, xmlWriterSettings);

Thanks for the help, everyone!
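
To see the difference between the two encodings, here is a minimal sketch that writes the same document twice and prints the first three bytes of each result. The class name BomDemo and the helper WriteXml are made up for illustration; everything else is the standard System.Xml and System.Text API.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class BomDemo
{
    // Hypothetical helper: writes the document from the question with the
    // given encoding and returns the raw bytes that end up in the stream.
    public static byte[] WriteXml(Encoding encoding)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Encoding = encoding;

        using (MemoryStream ms = new MemoryStream())
        {
            using (XmlWriter w = XmlWriter.Create(ms, settings))
            {
                w.WriteStartDocument(true);
                w.WriteStartElement("data");
                w.WriteElementString("child", "myvalue");
                w.WriteEndElement(); // data
            }
            return ms.ToArray();
        }
    }

    public static void Main()
    {
        // Encoding.UTF8 emits the UTF-8 byte order mark (EF BB BF).
        byte[] withBom = WriteXml(Encoding.UTF8);

        // UTF8Encoding(false) suppresses it, so the output starts with "<?x".
        byte[] withoutBom = WriteXml(new UTF8Encoding(false));

        Console.WriteLine(BitConverter.ToString(withBom, 0, 3));    // EF-BB-BF
        Console.WriteLine(BitConverter.ToString(withoutBom, 0, 3)); // 3C-3F-78
    }
}
```

Byte 0xEF is 239 in decimal, which matches the character number reported in the question.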

+13

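The problem is that the XML your writer creates goes into the stream as bytes with a byte order mark in front, and that mark is still there when you use UTF-8 to convert the bytes to a string. Try writing directly to a string instead (the XML declaration will then say utf-16):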

    StringBuilder sb = new StringBuilder();
    using (StringWriter writer = new StringWriter(sb))
    using (XmlWriter w = XmlWriter.Create(writer))
    {
        w.WriteStartDocument(true);
        w.WriteStartElement("data");
        w.WriteElementString("child", "myvalue");
        w.WriteEndElement(); // data
    }
    string test = sb.ToString();
+2

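You can change the encoding through the settings you pass to XmlWriter.Create (the Settings property of an existing writer is read-only):

    XmlWriter w = XmlWriter.Create(ms, new XmlWriterSettings { Encoding = Encoding.UTF8 });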
0

The other answers are all slightly off if you care about the byte order mark, which editors rely on (for example, Visual Studio uses it to detect UTF-8 encoded XML and highlight the syntax correctly).

Here's the solution:

    string result;
    MemoryStream stream = new MemoryStream();
    XmlWriterSettings settings = new XmlWriterSettings();
    settings.Encoding = Encoding.UTF8;
    settings.Indent = true;
    settings.IndentChars = "\t";
    using (XmlWriter writer = XmlWriter.Create(stream, settings))
    {
        // ... write

        // Make sure you flush or you only get half the text
        writer.Flush();

        // Use a StreamReader to get the byte order correct
        StreamReader reader = new StreamReader(stream, Encoding.UTF8, true);
        stream.Seek(0, SeekOrigin.Begin);
        result = reader.ReadToEnd();
    }

I have two complete snippets here
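
As a rough illustration of why the StreamReader step matters, the sketch below decodes the same BOM-prefixed bytes two ways. The class name BomDecodeDemo, its two helper methods, and the hard-coded sample bytes are assumptions made for this example and are not part of the snippets above.

```csharp
using System;
using System.IO;
using System.Text;

public static class BomDecodeDemo
{
    // Decodes the bytes as-is; a leading BOM survives as the character U+FEFF.
    public static string DecodeWithGetString(byte[] bytes)
    {
        return Encoding.UTF8.GetString(bytes);
    }

    // Lets StreamReader detect the BOM and drop it from the decoded text.
    public static string DecodeWithReader(byte[] bytes)
    {
        using (StreamReader reader = new StreamReader(new MemoryStream(bytes), Encoding.UTF8, true))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // UTF-8 BOM (EF BB BF) followed by the ASCII text "<data/>".
        byte[] bytes = { 0xEF, 0xBB, 0xBF, 0x3C, 0x64, 0x61, 0x74, 0x61, 0x2F, 0x3E };

        string viaGetString = DecodeWithGetString(bytes);
        string viaReader = DecodeWithReader(bytes);

        Console.WriteLine((int)viaGetString[0]); // 65279 (U+FEFF)
        Console.WriteLine(viaReader);            // <data/>
    }
}
```

So the bytes keep the mark that editors look for, while the string read back through the StreamReader starts directly at the first character of the XML.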

0
