In what encoding does readpipe return the output of an executed command?

Here is a simple Perl script that should write a UTF-8 encoded file:

    use warnings;
    use strict;

    open (my $out, '>:encoding(utf-8)', 'tree.out') or die;
    print $out readpipe ('tree ~');
    close $out;

I expected readpipe to return a UTF-8 encoded string, since LANG is set to en_US.UTF-8. However, when I look at tree.out, the text is garbled, even though the editor recognizes the file as UTF-8.

If I change >:encoding(utf-8) in the open statement to >:encoding(latin-1), the script creates a UTF-8 file with the expected text.

This seems a little strange to me. What is the explanation for this behavior?

1 answer

readpipe returns a string of undecoded bytes to Perl. We know that these bytes are UTF-8 encoded, but you have not told Perl that.

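The I/O layer on your output filehandle takes this string, assumes that each byte is a Unicode code point, and encodes those code points as UTF-8. Every byte above 0x7F is therefore encoded a second time, which is why the file contains garbled text.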
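The double encoding can be reproduced without running an external command. The following is a minimal sketch, assuming the command emitted the single character "é" (U+00E9) as UTF-8; the variable names are illustrative only, and Encode's encode function stands in for what the :encoding(utf-8) output layer does to the string.

```perl
use strict;
use warnings;
use Encode qw(encode);

# The two bytes UTF-8 uses for "é" (U+00E9), as readpipe would return them.
my $bytes = "\xC3\xA9";

# Perl sees two characters, U+00C3 and U+00A9, and encodes each one
# as UTF-8 again, just as the output layer would.
my $written = encode('UTF-8', $bytes);

printf "%vX\n", $bytes;     # prints C3.A9
printf "%vX\n", $written;   # prints C3.83.C2.A9
```

The four bytes written to the file are what an editor shows as "Ã©" instead of "é".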

The Latin-1 I/O layer produces the correct result because it writes each byte out unchanged: the first 256 Unicode code points map one-to-one onto the Latin-1 byte values.
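The same kind of check shows why Latin-1 appears to work. This is a sketch under the same assumption (the command emitted "é" as UTF-8), with Encode's encode function standing in for the :encoding(latin-1) layer.

```perl
use strict;
use warnings;
use Encode qw(encode);

# UTF-8 bytes for "é", still undecoded.
my $bytes = "\xC3\xA9";

# Code points U+00C3 and U+00A9 map to the Latin-1 bytes 0xC3 and 0xA9,
# so the byte string passes through untouched.
my $written = encode('ISO-8859-1', $bytes);

print $written eq $bytes ? "unchanged\n" : "changed\n";   # prints unchanged
```

The file ends up as valid UTF-8 only because the original bytes were already UTF-8 and were passed through untouched.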

The right thing to do is to decode the byte string returned by readpipe into a character string before feeding it to the I/O layer. The use open ':utf8' pragma, as Borodin mentioned, should be a viable solution, since readpipe is specifically mentioned in the documentation for the open pragma.
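A minimal sketch of the explicit decoding step, using Encode's decode function. To keep the example reproducible, a shell printf command that emits "café" as UTF-8 stands in for tree ~; that stand-in, and the assumption that the command's output really is UTF-8, are mine rather than part of the question.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Stand-in for 'tree ~' (assumption): emits "café\n" as UTF-8 bytes.
my $bytes = readpipe(q{printf 'caf\303\251\n'});

# Turn the undecoded bytes into a string of characters.
my $text = decode('UTF-8', $bytes);

# The output layer now receives real characters and encodes them once.
open(my $out, '>:encoding(UTF-8)', 'tree.out')
    or die "Cannot open tree.out: $!";
print $out $text;
close $out or die "Cannot close tree.out: $!";
```

With the pragma approach mentioned above, the explicit decode call would not be needed, because the input layer would be applied to readpipe within the pragma's lexical scope.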
