Converting a UTF-8 byte[] to a string

I have a very large UTF-8 byte[] (effectively unbounded in size). I want to trim it to its first 1024 bytes and then convert those bytes to a string.

Encoding.UTF8.GetString(byte[], int, int) does this for me: it takes only the first 1024 bytes and returns them converted to a string.

However, if the cut falls in the middle of a multi-byte character, for example a 2-byte character whose first byte is inside the 1024-byte range and whose second byte is outside it, the converted string shows ? in place of that character.

Is there any way to keep this ? out of the converted string?

1 answer

This is what the Decoder class is for. It lets you convert byte data to char data while keeping enough state to handle partial code points correctly:

 Encoding.UTF8.GetDecoder().GetChars(buffer, 0, 1024, charBuffer, 0) 

Of course, when a code point is split at the cut-off, the Decoder is left holding the partial character in its internal state instead of writing anything for it to the output. That leftover state does not matter in your case (and hopefully not in most other use cases either :)).
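To make the one-liner above concrete, here is a minimal sketch of a helper built around it. The class name Utf8Truncation and the method name DecodePrefix are made up for illustration. The sketch assumes the five-argument GetChars overload, which does not flush the decoder, so an incomplete trailing sequence stays in the decoder's state rather than being turned into a replacement character.

```csharp
using System;
using System.Text;

public static class Utf8Truncation
{
    // Hypothetical helper: decodes at most maxBytes bytes of UTF-8 and
    // silently drops an incomplete multi-byte sequence at the cut-off.
    public static string DecodePrefix(byte[] buffer, int maxBytes)
    {
        int count = Math.Min(maxBytes, buffer.Length);

        // A fresh decoder per call, so no state leaks between calls.
        Decoder decoder = Encoding.UTF8.GetDecoder();

        // GetMaxCharCount gives a safe upper bound for the output buffer.
        char[] chars = new char[Encoding.UTF8.GetMaxCharCount(count)];

        // This overload does not flush: trailing bytes of a split character
        // are kept inside the decoder and are not written to chars.
        int written = decoder.GetChars(buffer, 0, count, chars, 0);

        return new string(chars, 0, written);
    }
}
```

For example, "aé" encodes to the three bytes 61 C3 A9. Cutting after two bytes, Encoding.UTF8.GetString(bytes, 0, 2) returns "a" followed by U+FFFD (the replacement character, often displayed as ?), while Utf8Truncation.DecodePrefix(bytes, 2) returns just "a". In the question's scenario the call would be Utf8Truncation.DecodePrefix(buffer, 1024).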
