Is there any way to specify character encoding in java.lang.StringBuilder?

Question

Is there any way to specify a character encoding in java.lang.StringBuilder?

Or am I stuck with:

    String s = new String(new byte[0], Charset.forName("ISO-8859-1"));
    // or ISO_8859_1, or LATIN-1 or ... still no constants for those
    for (String string : strings) { // those are ISO-8859-1 encoded
        s += string; // hopefully this preserves the encoding (?)
    }

+7

java character-encoding

Mr_and_Mrs_D Jul 28 '13 at 11:29


2 answers

How about using a buffered converter, an OutputStreamWriter over a ByteArrayOutputStream:

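    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.OutputStreamWriter;

    public class Main {
        public static void main(String[] args) throws IOException {
            ByteArrayOutputStream baos = new ByteArrayOutputStream(1 << 10);
            // UnsupportedEncodingException is an IOException, so it simply propagates
            // instead of being swallowed and leaving osw null.
            OutputStreamWriter osw = new OutputStreamWriter(baos, "UTF-8");
            osw.write("!");
            osw.flush(); // push the buffered characters into baos as UTF-8 bytes
            System.out.println("Hello: " + baos.toString("UTF-8"));
        }
    }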

+2

gavenkoa Sep 25 '13 at 15:46


Jon Skeet · Accepted Answer · 2013-07-28T11:30:45+0000

Strings in Java are always UTF-16. They are just sequences of char values, which are UTF-16 code units. When you specify an encoding in the String(byte[], String) constructor, it only says how to decode the bytes into text. After that, the encoding is discarded.
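To illustrate the point about decoding, here is a minimal sketch. The class name DecodeDemo and the helper decodeLatin1 are made up for this example; StandardCharsets is part of the JDK from Java 7 onwards. It decodes the same character from two different byte encodings and shows that the resulting strings are indistinguishable.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class DecodeDemo {
    // Hypothetical helper: decodes the given bytes as ISO-8859-1.
    static String decodeLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] latin1 = { (byte) 0xE9 };            // 'é' in ISO-8859-1
        byte[] utf8 = { (byte) 0xC3, (byte) 0xA9 }; // 'é' in UTF-8

        String a = decodeLatin1(latin1);
        String b = new String(utf8, StandardCharsets.UTF_8);

        // Both strings hold the same single UTF-16 code unit, U+00E9.
        System.out.println(a.equals(b));       // true
        System.out.println((int) a.charAt(0)); // 233

        // An encoding only comes back into play when converting to bytes again.
        System.out.println(Arrays.toString(a.getBytes(StandardCharsets.UTF_8))); // [-61, -87]
    }
}
```

Nothing in a or b records which charset the bytes were decoded with, so a StringBuilder that appends them has no encoding to preserve either.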

If you need to keep the encoding, you will have to create your own class that holds a Charset and a String together. I can't say I have ever wanted to do this. Are you really sure that's what you need?
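If you do decide you need it, such a class could look roughly like the sketch below. EncodedText and its methods are hypothetical names chosen for this example, not an existing JDK type.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical holder: pairs decoded text with the charset its bytes came in.
public final class EncodedText {
    private final String text;
    private final Charset charset;

    public EncodedText(byte[] bytes, Charset charset) {
        this.text = new String(bytes, charset); // decode once, up front
        this.charset = charset;                 // remember the source encoding
    }

    public String text() { return text; }

    public Charset charset() { return charset; }

    // Re-encodes the text with the remembered charset.
    public byte[] toBytes() { return text.getBytes(charset); }

    public static void main(String[] args) {
        EncodedText t = new EncodedText(new byte[] { (byte) 0xE9 }, StandardCharsets.ISO_8859_1);
        System.out.println(t.charset() + " " + t.toBytes().length); // ISO-8859-1 1
    }
}
```

The String inside is still plain UTF-16. The Charset field is only bookkeeping that tells toBytes() how to encode on the way out.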

(So your "stuck with" code would not work anyway, and it would also be inefficient.)
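A possible alternative to the loop in the question, as a sketch only: append with a plain StringBuilder and pick the encoding once, at the point where bytes are actually needed. JoinDemo, join and joinAsLatin1 are illustrative names, and the sample list stands in for the question's strings variable.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class JoinDemo {
    // Concatenates the parts with a single StringBuilder instead of repeated +=.
    static String join(List<String> parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }

    // The encoding is chosen only here, when text is turned into bytes.
    static byte[] joinAsLatin1(List<String> parts) {
        return join(parts).getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        List<String> strings = Arrays.asList("caf", "\u00E9"); // sample input
        byte[] out = joinAsLatin1(strings);
        System.out.println(out.length); // 4
    }
}
```

Repeated += on a String copies the accumulated text on every iteration, which is the inefficiency mentioned above. StringBuilder appends into one growing buffer.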
