Java string concatenation appears in the wrong order when mixing languages

Screen capture of strange Java string behavior

As you can see from the image, I concatenated a, c and b and got the expected result. But in the second println, when I concatenated a, e and b, the pieces appeared in a different position from the one I expected. I would like to know the reason for this behavior and how to fix it. Thank you in advance.

    import java.util.*;

    public class prob {
        public static void main(String... args) {
            String a = "الف", b = "1/2", c = "ب", e = "B";
            System.out.println(a + " : " + c + " : " + b);
            System.out.println(a + " : " + e + " : " + b);
        }
    }

EDIT (to explain why my question is not a duplicate): My question is about converting L2R languages to R2L.

java string unicode concatenation
2 answers

This is because the first character is R2L (right-to-left orientation, as in Arabic or Hebrew), so the next string is placed to its left, which is the correct orientation for R2L text:

First char:

 الف // actual orientation ← 

The second string is added to the left:

    // add ←
    B : الف
    // actual orientation →

B, however, is L2R, as is usual in European languages, so the next string (1/2) is added to the right, AFTER B:

    // → add in this direction
    B : 1/2 : الف
    // actual orientation → (still)

You can easily test this by copying and pasting the characters, or by typing the string manually; you will see how the orientation changes depending on the character entered.
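The run-by-run behavior described above can also be inspected programmatically. The following is a minimal sketch using the JDK's `java.text.Bidi` class; the class name `BidiRuns` and the method `describe` are made up for illustration, and the Arabic string is written with `\u` escapes so the source file's encoding does not matter.

```java
import java.text.Bidi;

/** Lists the directional runs that java.text.Bidi assigns to a string. */
final class BidiRuns {

    /** Returns the base direction followed by one line per directional run. */
    static String describe(String text) {
        // DIRECTION_DEFAULT_LEFT_TO_RIGHT: take the base direction from the first
        // strongly directional character, falling back to LTR if there is none.
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        StringBuilder out = new StringBuilder();
        out.append("base ").append(bidi.baseIsLeftToRight() ? "LTR" : "RTL").append('\n');
        for (int i = 0; i < bidi.getRunCount(); i++) {
            int level = bidi.getRunLevel(i);
            // Even levels are laid out left to right, odd levels right to left.
            String dir = (level % 2 == 0) ? "LTR" : "RTL";
            out.append('[').append(bidi.getRunStart(i)).append(',')
               .append(bidi.getRunLimit(i)).append(") level ")
               .append(level).append(' ').append(dir).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String a = "\u0627\u0644\u0641"; // the Arabic string "a" from the question
        System.out.print(describe(a + " : " + "B" + " : " + "1/2"));
    }
}
```

Each printed range is a half-open index interval into the string in its logical (typed) order, so the output shows which parts are laid out in which direction.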


UPDATE:

What is the solution to this problem? I made this example only to show the problem I ran into when creating large reports, where the data is sometimes an L2R string and sometimes an R2L one, and I want to build a string in exactly this format.

From this answer:

  • Left-to-Right Embedding (U+202A)
  • Right-to-Left Embedding (U+202B)
  • Pop Directional Formatting (U+202C)

So, in Java, to embed an RTL language such as Arabic inside LTR text such as English, you would do

 myEnglishString + "\u202B" + myArabicString + "\u202C" + moreEnglish 

and to do the reverse

 myArabicString + "\u202A" + myEnglishString + "\u202C" + moreArabic 

See (for source material)
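For reports that mix both kinds of data, the wrapping can be put into small helpers. This is only a sketch under assumptions: the class `BidiEmbed` and the methods `ltr`, `rtl` and `ltrLine` are hypothetical names, not part of any library, and the " : " separator simply mirrors the example in the question.

```java
/** Hypothetical helpers that wrap text in Unicode embedding controls. */
final class BidiEmbed {
    static final char LRE = '\u202A'; // LEFT-TO-RIGHT EMBEDDING
    static final char RLE = '\u202B'; // RIGHT-TO-LEFT EMBEDDING
    static final char PDF = '\u202C'; // POP DIRECTIONAL FORMATTING

    /** Marks s as a left-to-right embedding and closes it again. */
    static String ltr(String s) {
        return LRE + s + PDF;
    }

    /** Marks s as a right-to-left embedding and closes it again. */
    static String rtl(String s) {
        return RLE + s + PDF;
    }

    /** Joins the fields with " : " inside a single left-to-right embedding. */
    static String ltrLine(String... fields) {
        return ltr(String.join(" : ", fields));
    }

    public static void main(String[] args) {
        String a = "\u0627\u0644\u0641"; // Arabic sample from the question
        System.out.println(ltrLine(a, "B", "1/2"));
    }
}
```

Closing every embedding with U+202C keeps the override from leaking into whatever text is appended after the line. How the result looks still depends on whether the console or viewer that displays it supports bidirectional text.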


ADD ON 2:

    char l2R = '\u202A';
    System.out.println(l2R + a + " : " + e + " : " + b);

OUTPUT:

    ‪الف : B : 1/2

The reason, as already mentioned in the other answer, is that some of the strings have a right-to-left orientation.

You can manually force a left-to-right orientation for a string that is right-to-left by prefixing it with the \u200e control character (LEFT-TO-RIGHT MARK), for example:

    String a = "\u200eالف", b = "1/2", c = "\u200eب", e = "B";
    System.out.println(a + " : " + c + " : " + b);
    System.out.println(a + " : " + e + " : " + b);
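One way to check what the mark changes is to ask `java.text.Bidi` for the base direction of the line with and without the prefix. This is a small sketch; `LrmCheck` and `baseIsLtr` are illustrative names, and the expectation is that only the prefixed line is reported as left-to-right.

```java
import java.text.Bidi;

/** Compares the base direction of a line with and without a leading U+200E. */
final class LrmCheck {
    static final String LRM = "\u200E"; // LEFT-TO-RIGHT MARK

    /** True if the first strong character makes the line left-to-right. */
    static boolean baseIsLtr(String text) {
        return new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT).baseIsLeftToRight();
    }

    public static void main(String[] args) {
        String line = "\u0627\u0644\u0641 : B : 1/2"; // Arabic sample, then "B", then "1/2"
        System.out.println("without LRM: " + baseIsLtr(line));
        System.out.println("with LRM:    " + baseIsLtr(LRM + line));
    }
}
```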