Is there a Unicode string that grows longer when converted to lowercase?

Question


The string 'ß' grows longer (as measured in Unicode code points) when converted to uppercase (it becomes 'SS').

Is there a similar string that gets longer when converted to lowercase?

+4

string unicode

Hammerite Feb 23 '15 at 21:29


2 answers

http://www.unicode.org/Public/UNIDATA/SpecialCasing.txt


+2

Necreaux Feb 23 '15 at 21:39

Adam · Accepted Answer · 2015-02-23T21:49:45+0000

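If I understood correctly, this Java code finds the characters whose uppercase version is longer than the original:

import java.util.Arrays;

// Scan the BMP chars (the loop stops before U+FFFF, a noncharacter) and
// print each one whose uppercase form takes more than one UTF-16 code unit.
for (char chr = 0; chr < Character.MAX_VALUE; chr++) {
    String str = String.valueOf(chr);
    String upper = str.toUpperCase();
    if (upper.length() > 1) {
        System.out.println(String.format("%s => %s (%d)", str,
                Arrays.toString(upper.toCharArray()), upper.length()));
    }
}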

This prints results like your original example:

ß => [S, S] (2)
ŉ => [ʼ, N] (2)
ǰ => [J, ̌] (2)
ΐ => [Ι, ̈, ́] (3)

If I change this to toLowerCase(), there is only one result:

İ => [i, ̇] (2)
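Since the question measures length in code points rather than Java chars, here is a rough sketch of the same search over every Unicode code point, counting code points in the result. The class and method names (LowercaseGrowth, growsOnLowercase, findGrowingCodePoints) are made up for illustration, and passing Locale.ROOT is my own assumption, added so the result does not depend on the machine's default locale (a Turkish locale lowercases İ differently).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class LowercaseGrowth {

    // True if lowercasing this single code point yields more than one code point.
    // Locale.ROOT is an assumption: it keeps the mapping locale-independent.
    static boolean growsOnLowercase(int codePoint) {
        String original = new String(Character.toChars(codePoint));
        String lower = original.toLowerCase(Locale.ROOT);
        return lower.codePointCount(0, lower.length()) > 1;
    }

    // Collects every code point (surrogates skipped) that grows when lowercased.
    static List<Integer> findGrowingCodePoints() {
        List<Integer> result = new ArrayList<>();
        for (int cp = Character.MIN_CODE_POINT; cp <= Character.MAX_CODE_POINT; cp++) {
            if (cp >= Character.MIN_SURROGATE && cp <= Character.MAX_SURROGATE) {
                continue; // lone surrogates are not characters on their own
            }
            if (growsOnLowercase(cp)) {
                result.add(cp);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (int cp : findGrowingCodePoints()) {
            String lower = new String(Character.toChars(cp)).toLowerCase(Locale.ROOT);
            System.out.printf("U+%04X => %d code points%n",
                    cp, lower.codePointCount(0, lower.length()));
        }
    }
}
```

The exact list printed depends on the Unicode version bundled with your JDK, so treat the output as something to check against SpecialCasing.txt rather than as a definitive answer.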
