A naive Perl 6 program does not seem to round-trip text safely with respect to Unicode. It looks like it internally uses Normalization Form C (NFC) for the Str type:
$ perl -CO -E 'say "e\x{301}"' | perl6 -ne '.say' | perl -CI -ne 'printf "U+%04x\n", ord for split //'
U+00e9
U+000a
Skimming the documentation, I see nothing about this behavior, and I find it very shocking. I can't believe that you need to drop down to the byte level to round-trip text:
$ perl -CO -E 'say "e\x{301}"' | perl6 -e 'while (my $byte = $*IN.read(1)) { $*OUT.write($byte) }' | perl -CI -ne 'printf "U+%04x\n", ord for split //'
U+0065
U+0301
U+000a
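To make explicit what I think is happening, here is a minimal sketch in Python (not Perl 6, purely for illustration). It assumes that the Str type applies the equivalent of NFC when decoding input, which is my guess from the output and not something I found documented. The helper name `codepoints` is made up for this example.

```python
import unicodedata


def codepoints(text):
    """Format each code point the same way as the printf in the pipelines."""
    return [f"U+{ord(ch):04x}" for ch in text]


# The same input the Perl 5 one-liner emits: "e", COMBINING ACUTE ACCENT, newline.
decomposed = "e\u0301\n"

# Text-level round trip with NFC applied (assumed to mirror what Str does).
composed = unicodedata.normalize("NFC", decomposed)

# Byte-level round trip: encode and decode with no normalization step.
byte_round_trip = decomposed.encode("utf-8").decode("utf-8")

print(codepoints(composed))         # ['U+00e9', 'U+000a']
print(codepoints(byte_round_trip))  # ['U+0065', 'U+0301', 'U+000a']
```

The two printed lists match the two pipeline outputs, which is why I suspect normalization on input and not anything happening on output.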
Do all text files need to be in NFC to round-trip safely through Perl 6? What if the document is supposed to be in NFD? I must be missing something. I can't believe this is deliberate behavior.
unicode perl6
Chas. Owens