In Ruby, JavaScript, and Java (and other languages I haven't tried), certain Cyrillic characters with diacritics, such as й and Ї, have length 2. When I check the length of a string containing these characters, I get a bad output value.
"ё".mb_chars.length  # => 2
"й".length           # => 2
"Ӭ".length           # => 2
Note that the strings are UTF-8 encoded, and each of them renders as a single character.
My question is: why does this happen, and how can I get the correct string length when these characters are present?
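For what it's worth, here is a minimal Ruby sketch of what I observe: a character like "й" can be stored either as one precomposed code point (U+0439) or as "и" (U+0438) followed by the combining breve (U+0306), and `String#length` counts code points. The workarounds shown assume Ruby 2.2+ for `unicode_normalize` and Ruby 2.5+ for `grapheme_clusters`:

```ruby
# "й" stored two different ways:
decomposed  = "\u0438\u0306"  # и + combining breve, renders as "й"
precomposed = "\u0439"        # й as a single code point

# String#length counts code points, so the forms disagree:
puts decomposed.length   # => 2
puts precomposed.length  # => 1

# NFC normalization (Ruby 2.2+) folds the pair into one code point:
puts decomposed.unicode_normalize(:nfc).length  # => 1

# Counting grapheme clusters (Ruby 2.5+) gives 1 for both forms:
puts decomposed.grapheme_clusters.length   # => 1
puts precomposed.grapheme_clusters.length  # => 1
```

Both strings print identically, which is why the length difference looks like a bug at first glance.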