Why does .ords disagree with .chars?

Question

My understanding of .chars is that it returns the number of characters in a string, counted in graphemes. My understanding of .ords is that it returns a "list of code numbers, one for the base character of each grapheme in a string." That is, .chars returns the number of graphemes, and .ords returns one (base) code point per grapheme. However, the behavior I observe in Rakudo 2016.07.1 on MoarVM 2016.07 does not seem to match this:

    > "\x[2764]\x[fe0e]".chars
    1
    > "\x[2764]\x[fe0e]".ords.fmt("U+%04x")
    U+2764 U+fe0e
    > "e\x[301]".ords.fmt("U+%04x")
    U+00e9
    > "0\x[301]".ords.fmt("U+%04x")
    U+0030

The .chars method returns 1 for HEAVY BLACK HEART followed by VARIATION SELECTOR-15 (the text representation, not the emoji ❤️, which is U+2764 U+fe0f), but .ords then returns both code points rather than just the base (I expected only U+2764). Even more confusing, calling .ords on LATIN SMALL LETTER E followed by COMBINING ACUTE ACCENT returns U+00e9 (LATIN SMALL LETTER E WITH ACUTE). I was expecting U+0065, since LATIN SMALL LETTER E is the base code point. I do get the expected result when the string has no composed (NFC) form (for example, U+0030 for 0).

Is my understanding of .chars and .ords just wrong, or is this a bug?

unicode perl6

Chas. Owens 20 sept '16 at 20:47

1 answer

Coke · Accepted Answer · 2016-09-21T13:49:43+0000

This is a documentation error regarding the .ords method. One of the core developers has just updated the docs with this commit:

https://github.com/perl6/doc/commit/12ec5fc35e

The change should appear on the site in the near future.
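To illustrate the behavior the question ran into, here is a minimal sketch reusing the question's own strings. It assumes, consistent with the output reported there, that .chars counts graphemes while .ords returns the code points of the composed (NFC) form of the string, not one base code point per grapheme. The variable names and the final .NFD line are additions for illustration and are not part of the original answer.

```perl6
# Strings taken from the question.
my $heart = "\x[2764]\x[fe0e]";   # HEAVY BLACK HEART + VARIATION SELECTOR-15
my $e     = "e\x[301]";           # LATIN SMALL LETTER E + COMBINING ACUTE ACCENT

# .chars counts graphemes, so each string is one character.
say $heart.chars;                 # 1
say $e.chars;                     # 1

# .ords lists code points of the composed form: the heart sequence has no
# precomposed equivalent, while e + acute composes into a single code point.
say $heart.ords.fmt("U+%04x");    # U+2764 U+fe0e
say $e.ords.fmt("U+%04x");        # U+00e9

# Assumption: .NFD exposes the canonically decomposed code points, which
# is where the base letter U+0065 would show up separately.
say $e.NFD.list.fmt("U+%04x");
```

If only the base code point of each grapheme is wanted, taking the first element of each grapheme's decomposed form is one possible approach, though that is a design choice outside what the answer states.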
