Convert char[] to byte[]

I would like to convert an array of characters to an array of bytes in Java. What are the methods for conversion?

+67
java arrays type-conversion
Apr 01 2018-11-12T00:
6 answers
    char[] ch = ...; // the characters to convert
    new String(ch).getBytes();

or

 new String(ch).getBytes("UTF-8"); 

to get a non-default encoding.

Update: Since Java 7: new String(ch).getBytes(StandardCharsets.UTF_8);
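As a minimal sketch of the String-based route, the snippet below names the charset explicitly so the result does not depend on the platform default. The class name `StringRoute` and the method `toUtf8` are made up for illustration and are not part of the answer.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class StringRoute {
    // Hypothetical helper: encodes via an intermediate String.
    // Note that the String copy stays in memory until garbage collected.
    static byte[] toUtf8(char[] chars) {
        return new String(chars).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 'a', U+00DF (sharp s) and U+20AC (euro sign), written as escapes
        // so the source file encoding does not matter.
        char[] ch = {'a', '\u00DF', '\u20AC'};
        System.out.println(Arrays.toString(toUtf8(ch)));
        // prints [97, -61, -97, -30, -126, -84]
    }
}
```

Three chars become six bytes here, which is why the result array cannot simply be assumed to have the same length as the input.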

+67
Apr 01 '11 at 12:10

Convert without creating a String object:

    import java.nio.ByteBuffer;
    import java.nio.CharBuffer;
    import java.nio.charset.Charset;
    import java.util.Arrays;

    byte[] toBytes(char[] chars) {
        CharBuffer charBuffer = CharBuffer.wrap(chars);
        ByteBuffer byteBuffer = Charset.forName("UTF-8").encode(charBuffer);
        // array() may be longer than the encoded data, so copy only position..limit
        byte[] bytes = Arrays.copyOfRange(byteBuffer.array(),
                byteBuffer.position(), byteBuffer.limit());
        Arrays.fill(byteBuffer.array(), (byte) 0); // clear sensitive data
        return bytes;
    }

Using:

    char[] chars = {'0', '1', '2', '3', '4', '5', '6', '7', '8', '9'};
    byte[] bytes = toBytes(chars);

    /* do something with chars/bytes */

    Arrays.fill(chars, '\u0000'); // clear sensitive data
    Arrays.fill(bytes, (byte) 0); // clear sensitive data

The solution is based on Swing's recommendation to store passwords in char[]. (See Why is char[] preferable to String for passwords?)

Remember not to write sensitive data to the logs, and make sure that the JVM does not keep any references to it.




The code above is correct but not efficient. If you need security rather than performance, you can use it. If security is not a goal either, just use String.getBytes. The inefficiency becomes apparent if you look into the encode implementation in the JDK; on top of that, buffers have to be created and arrays copied. Another option is to inline the encoding logic (example for UTF-8):

    val xs: Array[Char] = "A ß € 嗨 𝄞 🙂".toArray
    val len = xs.length
    val ys: Array[Byte] = new Array(3 * len) // worst case
    var i = 0; var j = 0 // i for chars; j for bytes
    while (i < len) { // fill ys with bytes
      val c = xs(i)
      if (c < 0x80) {
        ys(j) = c.toByte
        i = i + 1
        j = j + 1
      } else if (c < 0x800) {
        ys(j) = (0xc0 | (c >> 6)).toByte
        ys(j + 1) = (0x80 | (c & 0x3f)).toByte
        i = i + 1
        j = j + 2
      } else if (Character.isHighSurrogate(c)) {
        if (len - i < 2) throw new Exception("overflow")
        val d = xs(i + 1)
        val uc: Int =
          if (Character.isLowSurrogate(d)) {
            Character.toCodePoint(c, d)
          } else {
            throw new Exception("malformed")
          }
        ys(j) = (0xf0 | ((uc >> 18))).toByte
        ys(j + 1) = (0x80 | ((uc >> 12) & 0x3f)).toByte
        ys(j + 2) = (0x80 | ((uc >> 6) & 0x3f)).toByte
        ys(j + 3) = (0x80 | (uc & 0x3f)).toByte
        i = i + 2 // 2 chars
        j = j + 4
      } else if (Character.isLowSurrogate(c)) {
        throw new Exception("malformed")
      } else {
        ys(j) = (0xe0 | (c >> 12)).toByte
        ys(j + 1) = (0x80 | ((c >> 6) & 0x3f)).toByte
        ys(j + 2) = (0x80 | (c & 0x3f)).toByte
        i = i + 1
        j = j + 3
      }
    }
    // check
    println(new String(ys, 0, j, "UTF-8"))

Sorry for using Scala. If you have trouble converting this code to Java, I can rewrite it. As for performance, always measure on real data (for example, with JMH). This code looks very similar to what you can see in the JDK [2] and Protobuf [3].
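For readers who prefer Java, the Scala loop above could be ported roughly as follows. This is a sketch, not the answer author's own code: the class name `ManualUtf8` and the method `encode` are invented here, the two surrogate error cases are merged into one IllegalArgumentException, and the final copy plus clearing of the scratch buffer is an addition in the spirit of the password use case.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ManualUtf8 {
    // Encodes UTF-16 chars as UTF-8 by hand, following the Scala loop above.
    static byte[] encode(char[] xs) {
        byte[] ys = new byte[3 * xs.length]; // worst case: 3 bytes per char
        int i = 0; // index into xs (chars)
        int j = 0; // index into ys (bytes)
        while (i < xs.length) {
            char c = xs[i];
            if (c < 0x80) { // 1 byte (ASCII)
                ys[j++] = (byte) c;
                i++;
            } else if (c < 0x800) { // 2 bytes
                ys[j++] = (byte) (0xc0 | (c >> 6));
                ys[j++] = (byte) (0x80 | (c & 0x3f));
                i++;
            } else if (Character.isHighSurrogate(c)) { // 4 bytes, consumes 2 chars
                if (xs.length - i < 2 || !Character.isLowSurrogate(xs[i + 1])) {
                    throw new IllegalArgumentException("malformed surrogate pair at index " + i);
                }
                int uc = Character.toCodePoint(c, xs[i + 1]);
                ys[j++] = (byte) (0xf0 | (uc >> 18));
                ys[j++] = (byte) (0x80 | ((uc >> 12) & 0x3f));
                ys[j++] = (byte) (0x80 | ((uc >> 6) & 0x3f));
                ys[j++] = (byte) (0x80 | (uc & 0x3f));
                i += 2;
            } else if (Character.isLowSurrogate(c)) {
                throw new IllegalArgumentException("unpaired low surrogate at index " + i);
            } else { // 3 bytes (rest of the Basic Multilingual Plane)
                ys[j++] = (byte) (0xe0 | (c >> 12));
                ys[j++] = (byte) (0x80 | ((c >> 6) & 0x3f));
                ys[j++] = (byte) (0x80 | (c & 0x3f));
                i++;
            }
        }
        byte[] result = Arrays.copyOf(ys, j); // trim to the bytes actually written
        Arrays.fill(ys, (byte) 0);            // the scratch buffer may hold sensitive data
        return result;
    }

    public static void main(String[] args) {
        // Same sample text as the Scala version, written with escapes.
        char[] xs = "A \u00DF \u20AC \u55E8 \uD834\uDD1E \uD83D\uDE42".toCharArray();
        byte[] ys = encode(xs);
        // Cross-check against the JDK encoder.
        System.out.println(Arrays.equals(ys, new String(xs).getBytes(StandardCharsets.UTF_8)));
        // prints true
    }
}
```

Whether this is actually faster than the JDK encoder would still need to be measured, as the answer says.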

+139
Mar 12 2018-12-12T00:

Edit: Andrey's answer has been updated, so this no longer applies.

Andrey's answer (the highest-voted at the time of writing) is slightly incorrect. I would have added this as a comment, but I don't have enough reputation.

In Andrey's answer:

    char[] chars = {'c', 'h', 'a', 'r', 's'};
    byte[] bytes = Charset.forName("UTF-8").encode(CharBuffer.wrap(chars)).array();

Calling array() may not return the desired value, because the backing array can be larger than the encoded data. For example:

    char[] c = "aaaaaaaaaa".toCharArray();
    System.out.println(Arrays.toString(Charset.forName("UTF-8").encode(CharBuffer.wrap(c)).array()));

Output:

 [97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 0] 

As you can see, a zero byte has been appended. To avoid this, use the following:

    char[] c = "aaaaaaaaaa".toCharArray();
    ByteBuffer bb = Charset.forName("UTF-8").encode(CharBuffer.wrap(c));
    byte[] b = new byte[bb.remaining()];
    bb.get(b);
    System.out.println(Arrays.toString(b));

Output:

 [97, 97, 97, 97, 97, 97, 97, 97, 97, 97] 

Since the answer also mentioned passwords, it may be worth blanking out the array that backs the ByteBuffer (accessed via array()):

    ByteBuffer bb = Charset.forName("UTF-8").encode(CharBuffer.wrap(c));
    byte[] b = new byte[bb.remaining()];
    bb.get(b);
    blankOutByteArray(bb.array());
    System.out.println(Arrays.toString(b));
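
The snippet above calls blankOutByteArray without defining it. A minimal sketch of what such a helper might look like is shown below, assuming it only needs to overwrite the array with zeros. The class name `BlankingExample` and the `toBytes` wrapper are hypothetical, and the `hasArray()` guard is an extra precaution rather than something the answer requires.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BlankingExample {
    // Hypothetical helper: overwrites every element with zero.
    static void blankOutByteArray(byte[] array) {
        Arrays.fill(array, (byte) 0);
    }

    static byte[] toBytes(char[] c) {
        ByteBuffer bb = StandardCharsets.UTF_8.encode(CharBuffer.wrap(c));
        byte[] b = new byte[bb.remaining()]; // exactly the encoded length
        bb.get(b);
        if (bb.hasArray()) {                 // only heap buffers expose a backing array
            blankOutByteArray(bb.array());
        }
        return b;
    }

    public static void main(String[] args) {
        char[] c = "aaaaaaaaaa".toCharArray();
        byte[] b = toBytes(c);
        System.out.println(Arrays.toString(b));
        // prints [97, 97, 97, 97, 97, 97, 97, 97, 97, 97]
        blankOutByteArray(b);                // the caller clears its own copy when done
        Arrays.fill(c, '\u0000');
    }
}
```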
+17
Dec 16 '13 at 6:42
    private static byte[] charArrayToByteArray(char[] c_array) {
        byte[] b_array = new byte[c_array.length];
        for (int i = 0; i < c_array.length; i++) {
            // keeps only the low 8 bits of each char
            b_array[i] = (byte) (0xFF & (int) c_array[i]);
        }
        return b_array;
    }
0
Apr 16 '18 at 4:45

In fact, char and byte have different sizes in Java: a char is a 16-bit UTF-16 code unit, whereas a byte is only 8 bits, so a single char does not necessarily fit into a single byte.
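
The size difference can be checked directly. The sketch below assumes Java 8 or later (for `Character.BYTES` and `Byte.BYTES`); the class name `CharVsByte` and the helper `utf8Length` are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

public class CharVsByte {
    // Hypothetical helper: number of UTF-8 bytes needed for one (non-surrogate) char.
    static int utf8Length(char c) {
        return String.valueOf(c).getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(Character.BYTES);      // prints 2
        System.out.println(Byte.BYTES);           // prints 1
        System.out.println(utf8Length('A'));      // prints 1
        System.out.println(utf8Length('\u00E9')); // prints 2 (e with acute accent)
        System.out.println(utf8Length('\u20AC')); // prints 3 (euro sign)
    }
}
```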

-2
Apr 01 2018-11-11T00:

You could write a method:

    public byte[] toBytes(char[] data) {
        byte[] toRet = new byte[data.length];
        for (int i = 0; i < toRet.length; i++) {
            toRet[i] = (byte) data[i]; // narrowing cast: drops the high 8 bits
        }
        return toRet;
    }

Hope this helps
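
One caveat with a plain cast is that it silently discards the upper 8 bits of each char, so it is only lossless for characters up to U+00FF (the ISO-8859-1 range). The sketch below illustrates this; the class name `NarrowingCast` is invented, and the comparison against `StandardCharsets.ISO_8859_1` is added here purely for contrast.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class NarrowingCast {
    // Same cast-based loop as in the answer above, made static for the demo.
    static byte[] toBytes(char[] data) {
        byte[] toRet = new byte[data.length];
        for (int i = 0; i < toRet.length; i++) {
            toRet[i] = (byte) data[i];
        }
        return toRet;
    }

    public static void main(String[] args) {
        // 'A' (U+0041), e with acute accent (U+00E9), euro sign (U+20AC)
        char[] data = {'A', '\u00E9', '\u20AC'};

        System.out.println(Arrays.toString(toBytes(data)));
        // prints [65, -23, -84]  (the euro sign 0x20AC is truncated to 0xAC)

        byte[] latin1 = new String(data).getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(Arrays.toString(latin1));
        // prints [65, -23, 63]   (the euro sign is replaced by '?')
    }
}
```

Both results lose the euro sign, but the cast does so without any visible marker, which can be hard to spot later.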

-4
Sep 25 '14 at 0:46