How do I write using a single-byte character encoding?

I have a web service that returns a configuration file to a low-level hardware device. The manufacturer of this device tells me that it only supports single-byte characters in this configuration file.

On this wiki page, I found that the following should be single-byte character sets:

  • ISO 8859
  • ISO/IEC 646 (I could not find this one here)
  • various Microsoft / IBM code pages

But when I call Encoding.GetMaxByteCount(1) on these character sets, it always returns 2.

I also tried various other encodings (e.g. IBM437), but GetMaxByteCount(1) returns 2 for those as well.
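
The observation above can be reproduced with a short program. This is only a minimal sketch: it uses ASCII and ISO-8859-1 as stand-ins for the encodings mentioned, the variable names are illustrative, and it also prints GetByteCount for an actual one-character string for comparison.

```csharp
using System;
using System.Text;

// Top-level statements (C# 9 or later).
Encoding ascii = Encoding.ASCII;
Encoding latin1 = Encoding.GetEncoding("iso-8859-1");

foreach (Encoding enc in new[] { ascii, latin1 })
{
    // Worst-case estimate for one char versus the real size of a one-char string.
    int worstCase = enc.GetMaxByteCount(1);
    int actual = enc.GetByteCount("A");

    Console.WriteLine(
        enc.WebName + ": IsSingleByte=" + enc.IsSingleByte +
        ", GetMaxByteCount(1)=" + worstCase +
        ", GetByteCount(\"A\")=" + actual);
}
```

GetMaxByteCount is a worst-case estimate, while GetByteCount reports the size of the given input, so the two values are not expected to match.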

The Encoding.IsSingleByte property seems unreliable, according to this:

You should be careful in what your application does with the value for IsSingleByte. An assumption of how an encoding will proceed may still be wrong. For example, Windows-1252 has a value of true for Encoding.IsSingleByte, but Encoding.GetMaxByteCount(1) returns 2. This is because the method considers potential leftover surrogates from a previous decoder operation.

Also, the Encoding.GetMaxByteCount method has some of the same issues, according to this:

Note that GetMaxByteCount considers potential leftover surrogates from a previous decoder operation. Because of the decoder, passing a value of 1 to the method retrieves 2 for a single-byte encoding, such as ASCII. Your application should use the IsSingleByte property if this information is necessary.

Because of this, I am no longer sure what to use.

Further reading.

+6
2 answers

Basically, GetMaxByteCount considers an edge case that you will probably never hit in regular code, specifically what it says about the decoder and surrogates. The point is that some code points are encoded as surrogate pairs, and in unlucky cases a pair can straddle two calls to GetBytes() / GetChars() (on the encoder / decoder). As a result, an implementation could theoretically have one byte / character still buffered and waiting to be processed, so GetMaxByteCount has to account for it.
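
The buffering described here can be made visible with a stateful Encoder. The sketch below is an illustration rather than part of the original answer: it uses UTF-8 (not a single-byte encoding) because the effect is easy to see there, and it splits one surrogate pair (U+1F600, chosen arbitrarily) across two GetBytes calls.

```csharp
using System;
using System.Text;

// U+1F600 is stored in a .NET string as a surrogate pair: two UTF-16 chars.
char[] highSurrogate = { '\uD83D' };
char[] lowSurrogate = { '\uDE00' };

Encoder encoder = Encoding.UTF8.GetEncoder();
byte[] buffer = new byte[8];

// First call: only the high surrogate is available and flush is false,
// so the encoder keeps it as internal state instead of emitting bytes.
int firstCount = encoder.GetBytes(highSurrogate, 0, 1, buffer, 0, false);

// Second call: a single char goes in, but the bytes for the whole pair come out.
int secondCount = encoder.GetBytes(lowSurrogate, 0, 1, buffer, 0, true);

Console.WriteLine("first call: " + firstCount + " bytes, second call: " + secondCount + " bytes");
```

The second call receives one char but produces output that also covers the char left over from the first call, which is the kind of leftover state a worst-case estimate has to allow for.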

However! All this only matters if you are using the encoder / decoder directly. If you use the operations on Encoding itself, for example Encoding.GetBytes, then all of this is abstracted away from you and you will never need to know about it. In that case, just use IsSingleByte and you will be fine.
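
As a rough sketch of that approach for the configuration-file case: the example below assumes ISO-8859-1 is acceptable to the device (the question does not say which character set it expects) and uses made-up configuration text. It requests exception fallbacks so that characters the encoding cannot represent are reported instead of being silently replaced.

```csharp
using System;
using System.Text;

// Assumption: the device accepts ISO-8859-1. Exception fallbacks make
// unrepresentable characters fail loudly instead of turning into '?'.
Encoding latin1 = Encoding.GetEncoding(
    "iso-8859-1",
    EncoderFallback.ExceptionFallback,
    DecoderFallback.ExceptionFallback);

// Hypothetical configuration text; \u00E9 is a precomposed 'e' with acute accent.
string config = "name=caf\u00E9\nport=8080\n";
byte[] bytes = latin1.GetBytes(config);

// Every char of this text maps to exactly one byte.
Console.WriteLine(bytes.Length == config.Length);

// The euro sign (U+20AC) does not exist in ISO-8859-1, so encoding it throws.
bool rejected = false;
try
{
    latin1.GetBytes("price=10\u20AC");
}
catch (EncoderFallbackException)
{
    rejected = true;
}
Console.WriteLine(rejected);
```

The resulting byte array is what would be written to the response; whether to reject or replace unsupported characters is a design choice that depends on what the device tolerates.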

+6

Maybe you could use the example from the MSDN page for the Encoding.Convert method.

Encoding.Convert should give you the text as ASCII-encoded bytes, which I would expect to be one byte per character.
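
A minimal sketch along those lines, converting UTF-16 bytes to ASCII with Encoding.Convert. The sample text is made up for illustration; with the default ASCII fallback, a character that ASCII cannot represent is replaced rather than dropped.

```csharp
using System;
using System.Text;

// Sample text with one non-ASCII character (U+03A0, Greek capital Pi).
string text = "Pi is written as \u03A0";

// Encode as UTF-16 first, then convert those bytes to ASCII.
byte[] unicodeBytes = Encoding.Unicode.GetBytes(text);
byte[] asciiBytes = Encoding.Convert(Encoding.Unicode, Encoding.ASCII, unicodeBytes);

// One byte per char here; the Pi is replaced by the default fallback character.
Console.WriteLine(asciiBytes.Length == text.Length);
Console.WriteLine(Encoding.ASCII.GetString(asciiBytes));
```

Note that the replacement is lossy, so this only suits the device if every character in the configuration file is already plain ASCII.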

0

Source: https://habr.com/ru/post/925905/

