In a recent web application that I built, I was pleasantly surprised when one of our users decided to use it to create something entirely in Japanese. However, the text was wrapped strangely and awkwardly. Apparently browsers do not cope well with wrapping Japanese text, probably because it contains few spaces, as each character can form a whole word. However, that is not a completely safe assumption to make, since some words are made up of several characters, and it is not safe to break some groups of characters across different lines.
Googling around has not helped me understand the problem any better. It seems to me that you would need a dictionary of unbreakable patterns, and then assume that everywhere else is safe to break. But I'm afraid that I don't know enough Japanese to really know all the words, which, from what I understand from some of my searching, are quite complicated.
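A rough sketch of that idea in Python, for illustration only. The function names `can_break` and `wrap`, the two character sets, and the `unbreakable` word list are all assumptions made up for this example; the sets are deliberately incomplete, and the width is counted in characters as if every character were equally wide.

```python
# Characters that should not begin a line (closing punctuation, small kana,
# long-vowel mark). Illustrative and NOT exhaustive.
NO_LINE_START = set("、。，．）」』】ゃゅょっャュョッー")
# Characters that should not end a line (opening brackets). Also illustrative.
NO_LINE_END = set("（「『【")


def can_break(text, i, unbreakable=()):
    """Return True if a line break is allowed between text[i-1] and text[i]."""
    if i <= 0 or i >= len(text):
        return False
    if text[i] in NO_LINE_START or text[i - 1] in NO_LINE_END:
        return False
    # Refuse any break that falls strictly inside an occurrence of a listed word.
    for word in unbreakable:
        start = text.find(word)
        while start != -1:
            if start < i < start + len(word):
                return False
            start = text.find(word, start + 1)
    return True


def wrap(text, width, unbreakable=()):
    """Split text into lines of at most `width` characters where possible."""
    if width < 1:
        raise ValueError("width must be at least 1")
    lines = []
    start = 0
    while len(text) - start > width:
        # Walk backwards from the widest candidate to a permitted break.
        cut = start + width
        while cut > start and not can_break(text, cut, unbreakable):
            cut -= 1
        if cut == start:
            # No permitted break in range: force one at the full width.
            cut = start + width
        lines.append(text[start:cut])
        start = cut
    lines.append(text[start:])
    return lines


if __name__ == "__main__":
    for line in wrap("今日は学校に行きました。", 4, unbreakable=["学校"]):
        print(line)
```

With the hypothetical word list above, the break that would split 学校 is moved one character earlier, and the final 。 is kept off the start of a line. The obvious weakness is the one already mentioned: the result is only as good as the dictionary of unbreakable words.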
How do you approach this problem? Do you know of any existing libraries or algorithms that handle this in a satisfactory way?
algorithm unicode internationalization word-wrap cjk
Breton