Ligatures are Unicode characters that are represented by more than one code point. For example, the Devanagari ligature त्र consists of the code points त + ् + र.
When viewed in a simple text editor such as Notepad, त्र is displayed as त् + र and is stored as three Unicode characters. However, when the same file is opened in Firefox, it is displayed as the proper ligature.
So my question is: how can I detect such ligatures programmatically while reading a file from my code? Since Firefox does it, there must be a way to do it programmatically. Is there a Unicode property that carries this information, or do I need to keep a map of all such ligatures?
The SVG/CSS property text-rendering, when set to optimizeLegibility, does the same thing (it combines the code points into the proper ligature).
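To make the "three Unicode characters" concrete, here is a minimal Java sketch that lists the code points stored for त्र. The class and method names (CodePointDump, describe) are made up for illustration; only the standard Character and String APIs are used.

```java
import java.util.ArrayList;
import java.util.List;

class CodePointDump {
    // Returns one entry per code point, e.g. "U+0924 DEVANAGARI LETTER TA".
    // (Hypothetical helper, named here only for illustration.)
    static List<String> describe(String text) {
        List<String> out = new ArrayList<>();
        text.codePoints().forEach(cp ->
            out.add(String.format("U+%04X %s", cp, Character.getName(cp))));
        return out;
    }

    public static void main(String[] args) {
        // त्र written with escapes: TA + VIRAMA + RA
        for (String line : describe("\u0924\u094D\u0930")) {
            System.out.println(line);
        }
    }
}
```

The middle code point, U+094D (the virama), has the general category "nonspacing mark", which is the only hint in the character data itself that the two consonants around it may be joined when rendered.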
PS: I use Java.
EDIT
The purpose of my code is to count the characters in Unicode text, treating each ligature as a single character. So I need a way to collapse multiple code points into one ligature.
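The kind of counting I have in mind could be sketched roughly as follows. This is only an approximation under stated assumptions, not how Firefox or any font engine does it: a code point is merged into the previous cluster if it is a combining mark, or if it is a letter that directly follows the Devanagari virama (U+094D). The names ConjunctCounter and countClusters are hypothetical, and whether a font really draws a given sequence as one ligature is not something this rule can know.

```java
class ConjunctCounter {
    // DEVANAGARI SIGN VIRAMA; other Indic scripts have their own viramas,
    // which this simplified sketch does not handle.
    private static final int VIRAMA = 0x094D;

    // True for general categories Mn, Mc and Me.
    static boolean isCombiningMark(int cp) {
        int type = Character.getType(cp);
        return type == Character.NON_SPACING_MARK
            || type == Character.COMBINING_SPACING_MARK
            || type == Character.ENCLOSING_MARK;
    }

    // Counts clusters, treating consonant + virama + consonant as one unit.
    static int countClusters(String text) {
        int count = 0;
        int prev = -1; // previous code point, -1 at the start of the text
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            boolean joinsPrevious = prev != -1
                && (isCombiningMark(cp)
                    || (prev == VIRAMA && Character.isLetter(cp)));
            if (!joinsPrevious) {
                count++;
            }
            prev = cp;
            i += Character.charCount(cp);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countClusters("\u0924\u094D\u0930")); // prints 1
    }
}
```

With this rule त्र counts as 1 and कत्र counts as 2, while plain ASCII text is counted per character. It ignores zero-width joiners and non-joiners, and it says nothing about ligatures that a font forms without a virama.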
java text unicode clojure
Abhinav sarkar