Get unicode character value

Is there any way in Java to get the Unicode equivalent of any character? For example:

Suppose there is a method getUnicode(char c). A call to getUnicode('÷') should return \u00f7.

+58
java unicode
Feb 08 2018-10-02T00
7 answers

You can do this for any Java char with the following one-liner:

 System.out.println( "\\u" + Integer.toHexString('÷' | 0x10000).substring(1) );

But it only works for the Unicode characters up to Unicode 3.0, which is why I specified that you can do it for any Java char.

Java was designed well before Unicode 3.1 arrived, so the Java char primitive is inadequate to represent Unicode 3.1 and higher: there is no longer a one-to-one mapping between Unicode characters and Java chars (a monstrous hack, surrogate pairs, is used instead).

So, you really need to check your requirements here: do you need to support Java char or any possible Unicode character?
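As a rough illustration of the mismatch described above, the sketch below builds one string from a code point inside the 16-bit range and one from a code point above U+FFFF, then compares the number of chars with the number of code points. The class name SurrogateDemo and the helper fromCodePoint are made up for this example, and U+1F600 is just an arbitrary code point above U+FFFF.

```java
public class SurrogateDemo {
    // Hypothetical helper: builds a String holding exactly one code point.
    // Code points above 0xFFFF are stored as two chars (a surrogate pair).
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String bmp = fromCodePoint(0x00F7);    // DIVISION SIGN, fits in one char
        String supp = fromCodePoint(0x1F600);  // arbitrary code point above 0xFFFF

        System.out.println(bmp.length());                               // 1
        System.out.println(supp.length());                              // 2
        System.out.println(supp.codePointCount(0, supp.length()));      // 1
        System.out.println(Character.isHighSurrogate(supp.charAt(0)));  // true
        System.out.println(Character.isLowSurrogate(supp.charAt(1)));   // true
    }
}
```

So a method that takes a single char can only ever see one half of such a character.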

+47
Feb 08 '10 at 9:07

If you have Java 5 or later, use:

 char c = ...;
 String s = String.format("\\u%04x", (int) c);

If your source is not a Unicode character ( char ), but a string, you must use charAt(index) to get the Unicode character at the index position.

Do not use codePointAt(index) for this, because it returns full Unicode code points (up to 0x10FFFF, i.e. 21 bits), which can need up to six hexadecimal digits and so cannot always be represented with four. See the docs for an explanation.

[EDIT] To make this clear: this answer does not work with Unicode code points but with the way Java represents Unicode characters (i.e. UTF-16 code units and surrogate pairs), since a char is 16 bits while Unicode code points need up to 21 bits. The question should really be "How do I convert a char to a 4-digit hexadecimal number?", since it is not (really) about Unicode.
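To make the difference between charAt and codePointAt concrete, here is a minimal sketch that escapes a whole string one char at a time and, for comparison, lists its code points. The class UnicodeEscaper and both method names are assumptions made for this example, not part of any library.

```java
public class UnicodeEscaper {
    // Escapes every UTF-16 code unit (Java char) as a backslash-u escape
    // with exactly four hex digits. A character outside the 16-bit range
    // therefore comes out as two escapes (its surrogate pair).
    static String escapeChars(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            sb.append(String.format("\\u%04x", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    // Lists the code points of the string in U+XXXX notation.
    // The result uses four to six hex digits per code point.
    static String describeCodePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            i += Character.charCount(cp); // skip both chars of a surrogate pair
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "a" + new String(Character.toChars(0x1F600));
        System.out.println(escapeChars(text));
        System.out.println(describeCodePoints(text));
    }
}
```

The first method always yields escapes that Java source code accepts; the second is only meant for display, since Java has no six-digit escape.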

+29
Feb 08 '10 at 9:13
 private static String toUnicode(char ch) {
     return String.format("\\u%04x", (int) ch);
 }
+8
Aug 07 '13 at 8:20
 char c = 'a';
 String a = Integer.toHexString(c); // gives you a = "61"
+3
Jun 11 '14 at 14:29

I found this nice code on the web.

 import java.io.BufferedReader;
 import java.io.IOException;
 import java.io.InputStreamReader;

 public class Unicode {
     public static void main(String[] args) {
         System.out.println("Use CTRL+C to quit the program.");
         // Create the reader for reading in the text typed in the console.
         BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(System.in));
         try {
             String line;
             // Stop at end of input (readLine() returns null) or on an empty line.
             while ((line = bufferedReader.readLine()) != null && line.length() > 0) {
                 for (int index = 0; index < line.length(); index++) {
                     // Convert the char (UTF-16 code unit) to an uppercase hexadecimal code.
                     // charAt is used rather than codePointAt so the value always fits in four digits.
                     String hexCode = Integer.toHexString(line.charAt(index)).toUpperCase();
                     // Pad with leading zeros so that it is always a four-digit value.
                     String hexCodeWithAllLeadingZeros = "0000" + hexCode;
                     String hexCodeWithLeadingZeros =
                             hexCodeWithAllLeadingZeros.substring(hexCodeWithAllLeadingZeros.length() - 4);
                     System.out.println("\\u" + hexCodeWithLeadingZeros);
                 }
             }
         } catch (IOException ioException) {
             ioException.printStackTrace();
         }
     }
 }

Original article

0
Feb 08 '10 at 8:45

Do you specifically need the Unicode escape? In Java it is simpler to write your program around the decimal value (the HTML code), because then you can simply cast between char and int:

 char a = 98;
 char b = 'b';
 char c = (char) (b + 2);
 System.out.println(a);
 System.out.println((int) b);
 System.out.println((int) c);
 System.out.println(c);

This gives the following output:

 b
 98
 100
 d
0
Feb 26 '15 at 3:33

First, I get the high side (upper byte) of the char, then the low side (lower byte). Both are converted to hex strings and the prefix is added.

 char c = 'a';               // example input
 int hs = (c >> 8) & 0xFF;   // high byte of the char
 int ls = c & 0xFF;          // low byte of the char
 // Pad each byte to two hex digits so the result always has four.
 String highSide = String.format("%02x", hs);
 String lowSide = String.format("%02x", ls);
 System.out.println(c + " = " + "\\u" + highSide + lowSide);
0
Apr 12 '15 at 21:47