How to convert UTF-8 character to ISO Latin 1?

I need to convert the trademark symbol (™) from UTF-8 to ISO Latin 1 and store it in a database that is also encoded in ISO Latin 1.

How can I do this in Java?

I tried something like

String s2 = new String(s1.getBytes("ISO-8859-1"), "utf-8");

but it does not seem to work as I expected.

+5
4 answers

A string in Java is always Unicode (effectively UTF-16). Conversions are only necessary when you go from text to a binary encoding or vice versa.
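To make the text/binary boundary concrete, here is a minimal sketch. The class and method names are made up for illustration. The same String is encoded twice, and only the resulting bytes differ.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class: the String itself has no "UTF-8" or "Latin-1" flavour;
// an encoding only comes into play when the text is turned into bytes.
final class EncodingBoundaryDemo {
    static byte[] utf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    static byte[] latin1(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "café"
        System.out.println(Arrays.toString(utf8(text)));   // [99, 97, 102, -61, -87]
        System.out.println(Arrays.toString(latin1(text))); // [99, 97, 102, -23]
    }
}
```

Using the `StandardCharsets` constants instead of charset names also avoids the checked `UnsupportedEncodingException`.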

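What exactly is in ISO Latin 1? Text inside a Java String is never "in UTF-8". "UTF-8" only describes how characters are written out as bytes, so there is nothing to convert until the string leaves your program.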

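EDIT: The trademark symbol is Unicode U+2122, which is not in ISO-Latin-1. The registered trademark symbol is U+00AE, which is in ISO-Latin-1 (IIRC), so if you are happy to use that instead, you can replace it like this:

String replaced = original.replace('\u2122', '\u00ae');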
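Before deciding whether a replacement is needed at all, it can help to ask the charset whether it can represent the text. The sketch below uses the standard `CharsetEncoder.canEncode` method. The class and method names are hypothetical.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: checks whether every character of a string exists in ISO-8859-1.
final class Latin1Check {
    static boolean fitsLatin1(String text) {
        // Encoders are stateful, so a fresh one is created per call.
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        return encoder.canEncode(text);
    }

    // Swaps the trademark sign (U+2122) for the registered sign (U+00AE).
    static String toLatin1Friendly(String text) {
        return text.replace('\u2122', '\u00ae');
    }

    public static void main(String[] args) {
        String original = "Acme\u2122";
        System.out.println(fitsLatin1(original));                   // false
        System.out.println(fitsLatin1(toLatin1Friendly(original))); // true
    }
}
```

Without such a check, `getBytes` silently writes a `?` byte for every character the target charset cannot represent.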
+5

If you have a string in Java (say s1) that contains characters outside Latin-1, you cannot store it directly as ISO-8859-1.

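  • First, you can pick an encoding that does contain the character.
    For example, CP1252 instead of ISO-8859-1 (still 1 byte per character).

  • Second, you can encode the text so that every resulting character fits.
    For example, treat the UTF-8 bytes as ISO-8859-1: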

    String s2 = new String(s1.getBytes("UTF-8"), "ISO-8859-1");
    

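    Now s2 has a different number of characters. Each of them fits in ISO-8859-1, but the content is really UTF-8.

    To get the original string back,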

    String s1 = new String(s2.getBytes("ISO-8859-1"),"UTF-8");
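Both workarounds can be tried out in a small sketch. The class and method names are made up, and it assumes the `windows-1252` charset is available in the running JDK, which is common but not guaranteed by the Java specification.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helpers for the two workarounds discussed above.
final class Latin1Workarounds {
    // Reinterpret the UTF-8 bytes as Latin-1 characters (one char per byte).
    static String pack(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Reverse of pack: turn the Latin-1 chars back into bytes and decode them as UTF-8.
    static String unpack(String s) {
        return new String(s.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    // Assumption: the JDK ships the windows-1252 (CP1252) charset.
    static byte[] cp1252(String s) {
        return s.getBytes(Charset.forName("windows-1252"));
    }

    public static void main(String[] args) {
        String tm = "\u2122";
        String packed = pack(tm);
        System.out.println(packed.length());          // 3, one char per UTF-8 byte
        System.out.println(unpack(packed).equals(tm)); // true
        System.out.println(cp1252(tm)[0] & 0xFF);      // 153 (0x99)
    }
}
```

The packed form only makes sense to readers that know to unpack it again. Anything else reading the database will see three unrelated Latin-1 characters instead of one trademark sign.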
    

Careful, though: not every byte value maps to a printable character in ISO-8859-1. One example is the range 80 to 9F.

byte[] b = { -97, -100, -128 };
System.out.println( new String(b,"ISO-8859-1") );

This prints something like ???

Even so, in Java s.getBytes("ISO-8859-1") gives you the original bytes back.
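A minimal sketch of what happens to bytes in the 80 to 9F range, using a hypothetical class name. Java decodes each of them to the C1 control character with the same number, which is why a console shows no proper glyph, and encoding the string again returns the same bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo: bytes 0x80..0x9F decoded as ISO-8859-1 become C1 control characters.
final class C1Demo {
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] b = { -97, -100, -128 }; // 0x9F, 0x9C, 0x80
        String s = decode(b);
        for (char c : s.toCharArray()) {
            // Prints U+009F, U+009C and U+0080, each flagged as a control character.
            System.out.printf("U+%04X control=%b%n", (int) c, Character.isISOControl(c));
        }
        // The round trip is lossless: the same three bytes come back.
        System.out.println(Arrays.equals(s.getBytes(StandardCharsets.ISO_8859_1), b)); // true
    }
}
```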

+4
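  • Strings have no encoding of their own. When you read or write them, state the encoding explicitly (UTF-8 when reading, ISO-8859-1 when writing, for example).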
  • ISO-8859-1 (a.k.a. Latin-1) does not contain the "™" character.
+2

I had a similar problem and solved it by converting the characters that cannot be encoded into entities. If you later display the information as HTML, you are fine anyway.

If not, you can convert them back to Unicode.

Python trademark example:

s = u'yellow bananas\u2122'.encode('latin1', 'xmlcharrefreplace')
# s is 'yellow bananas&#8482;' (a bytes object, b'...', on Python 3)
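Since the question asks about Java, here is a rough Java counterpart of the same idea. It is only a sketch with made-up names, not a standard library feature: every code point above U+00FF, which is the upper limit of ISO-8859-1, is written as a numeric character reference.

```java
// Hypothetical helper mimicking Python's 'xmlcharrefreplace' for ISO-8859-1.
final class CharRefReplace {
    static String latin1WithCharRefs(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (cp <= 0xFF) {
                // ISO-8859-1 covers exactly U+0000..U+00FF, so this character is safe.
                out.append((char) cp);
            } else {
                // Everything else becomes a decimal numeric character reference.
                out.append("&#").append(cp).append(';');
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(latin1WithCharRefs("yellow bananas\u2122")); // yellow bananas&#8482;
    }
}
```

The result contains only Latin-1 characters, so it can be stored with `getBytes(StandardCharsets.ISO_8859_1)` without losing anything.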
0
