How can I determine the language from a character?

Given a Unicode character, I want to know which languages include that character and, more importantly, whether each of those languages is left-to-right. For example, the character A can be English or Spanish, both of which are LTR languages.

I want this for my own text editor. Can someone help me find an API function or something that solves my problem?

Thanks in advance

+7
2 answers

In Unicode, LTR/RTL is a property of characters, not of the languages that use those characters. This matters because English embedded in Arabic text should still be displayed left to right, even if, for simplicity, the document as a whole is marked as Arabic. If you use the JCL, these properties can be obtained with UnicodeIsLeftToRight and UnicodeIsRightToLeft. Note that a character can be neither left-to-right nor right-to-left (spaces and most punctuation, for example), and also note that the JCL uses its own copy of the Unicode character data, which may differ slightly from the data in any specific version of Windows.
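To illustrate the idea that direction belongs to the character, here is a minimal sketch in Python rather than Delphi. It does not use the JCL; it reads the bidirectional class from Python's standard unicodedata module, and the direction helper and its three return values are names made up for this example. It treats only the strong classes (L, R, AL) as directional, so digits, spaces, and punctuation come back as "neither".

```python
import unicodedata

# Strong bidirectional classes defined by the Unicode Bidirectional Algorithm.
STRONG_LTR = {"L"}
STRONG_RTL = {"R", "AL"}  # right-to-left letters, including Arabic letters


def direction(ch: str) -> str:
    """Return 'LTR', 'RTL', or 'neither' for a single character (hypothetical helper)."""
    bidi_class = unicodedata.bidirectional(ch)
    if bidi_class in STRONG_LTR:
        return "LTR"
    if bidi_class in STRONG_RTL:
        return "RTL"
    # Digits, spaces, punctuation, and unassigned code points end up here.
    return "neither"


# Latin A, Hebrew alef, Arabic alef, a digit, and a space.
for ch in ("A", "\u05D0", "\u0627", "1", " "):
    print(f"U+{ord(ch):04X}", direction(ch))
```

As with the JCL, the result depends on the Unicode data bundled with the runtime (here, the Python version), so rarely used or recently added characters may be classified differently across environments.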

+7

As for the question in the title, answering it would require an extensive study of how characters are used in the languages of the world. There are several thousand languages, although many of them have no regular writing system; on the other hand, some languages have several writing systems. Different variants of a language may also have different character repertoires.

So this would be a serious effort. Some data has been compiled, e.g. in the CLDR, but the concept of "characters used in a language" is far from clear. (Are the characters æ, è, and ö used in English? They certainly appear in some forms of written English.)

Thus, it would be unrealistic to expect that a library routine would be found for such purposes.

Apparently, your real need was to determine whether a character is left-to-right or right-to-left. For completeness, though, I have answered what you actually asked, since it may be relevant in other contexts.

+1
