I've come across what I believe is a problem with the BinaryReader.ReadChars() method. When I wrap a BinaryReader around a raw NetworkStream, I sometimes get stream corruption where the stream being read goes out of sync. The stream in question carries messages in a binary serialization protocol.
I have narrowed it down to the following:
- It only happens when reading a Unicode string (encoded with Encoding.BigEndianUnicode).
- It only happens when the string in question is split across two TCP packets (confirmed with Wireshark).
I think the following is happening (in the context of the code below). The root cause appears to be that the .NET code computes the number of bytes to read as charsRemaining * bytesPerChar on every loop iteration. Because the decoder may be holding on to a buffered byte (the first half of a character that was split across packets), this calculation can be off by one, causing an extra byte to be consumed from the input stream.
Here is the .NET Framework code:
```csharp
while (charsRemaining > 0) {
    // We really want to know what the minimum number of bytes per char
    // is for our encoding. Otherwise for UnicodeEncoding we'd have to
    // do ~1+log(n) reads to read n characters.
    numBytes = charsRemaining;
    if (m_2BytesPerChar)
        numBytes <<= 1;

    numBytes = m_stream.Read(m_charBytes, 0, numBytes);
    if (numBytes == 0) {
        return (count - charsRemaining);
    }

    charsRead = m_decoder.GetChars(m_charBytes, 0, numBytes, buffer, index);

    charsRemaining -= charsRead;
    index += charsRead;
}
```
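The decoder behavior that (I believe) trips this loop up can be seen with Decoder directly. This is a sketch of the underlying mechanism, not the framework code itself: when a 2-bytes-per-char decoder is fed a buffer that ends mid-character, it silently buffers the partial byte and reports one fewer char, while the loop's charsRemaining * 2 arithmetic does not account for that buffered byte.

```csharp
using System;
using System.Text;

// Sketch: BigEndianUnicode "AB" is 4 bytes (00 41 00 42). Feeding the
// decoder only the first 3 bytes simulates a TCP packet boundary that
// falls in the middle of a character.
Decoder decoder = Encoding.BigEndianUnicode.GetDecoder();
byte[] data = Encoding.BigEndianUnicode.GetBytes("AB");
char[] chars = new char[2];

int first = decoder.GetChars(data, 0, 3, chars, 0);
// first == 1: only 'A' is produced; the stray 0x00 is buffered inside
// the decoder. The ReadChars loop, however, still computes
// charsRemaining * 2 == 2 bytes left to read, even though only 1 byte
// of this string actually remains -- the second byte it reads belongs
// to the next message on the stream.
int second = decoder.GetChars(data, 3, 1, chars, first);
Console.WriteLine($"{first} {second} {new string(chars, 0, first + second)}");
```

Because the extra byte ends up held inside the decoder rather than returned, every subsequent read from the stream is shifted by one byte, which matches the out-of-sync corruption I'm seeing.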
I'm not quite sure whether this is a bug or just a misuse of the API. To work around the problem, I simply calculate the number of bytes needed, read them myself, and then run the resulting byte[] through the appropriate Encoding.GetString(). However, this will not work for UTF-8.
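A minimal sketch of that workaround, assuming a fixed-width encoding (2 bytes per char for BigEndianUnicode) and that the char count is already known from the protocol; the helper name is mine:

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper: read exactly charCount chars' worth of bytes,
// then decode the byte[] in one go. Decoding a complete buffer never
// touches bytes belonging to the next message on the stream.
static string ReadStringWorkaround(BinaryReader reader, int charCount)
{
    int byteCount = charCount * 2;      // fixed-width (2 bytes/char) assumption
    byte[] bytes = new byte[byteCount];
    int offset = 0;
    // Read() may return fewer bytes than requested (e.g. at a TCP
    // packet boundary), so loop until the exact payload is consumed.
    while (offset < byteCount)
    {
        int read = reader.Read(bytes, offset, byteCount - offset);
        if (read == 0)
            throw new EndOfStreamException();
        offset += read;
    }
    return Encoding.BigEndianUnicode.GetString(bytes);
}

// Usage: a MemoryStream stands in for the NetworkStream here.
byte[] payload = Encoding.BigEndianUnicode.GetBytes("AB");
var reader = new BinaryReader(new MemoryStream(payload));
string result = ReadStringWorkaround(reader, 2); // "AB"
Console.WriteLine(result);
```

This sidesteps the issue because the exact byte count is computed once up front instead of being recomputed from charsRemaining on every iteration; the limitation mentioned above remains, since for UTF-8 the byte count cannot be derived from the char count.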
I'd be interested to hear people's thoughts on this, and whether I'm doing something wrong or not. And perhaps this will save the next person a few hours/days of tedious debugging.
EDIT: Submitted to Microsoft Connect. Connect tracking item
Mike q