Perl: Decoding Mangled Unicode Strings

I am working on a CGI script that is called from a piece of software (which I cannot change). The variables provided by the software cause problems: if they contain non-ASCII characters, they look like this:

ÿFFFFDEetta er texti meÿFFFFF0 ÿFFFFEDslenskum stÿFFFFF6fum

instead of

Þetta er texti með íslenskum stöfum.

I tried messing around with Encode::decode, but nothing came of it - all I managed to do was change the way the text is presented.

So yes, I'm a bit stumped. What should I do to change every ÿFFFFDE to Þ, etc., without resorting to replacing each non-ASCII character separately (which is not a solution, because it has to work for languages that I don't even speak)?

1 answer
    use Encode qw(decode);
    use Encode::Escape qw();

    $_ = 'ÿFFFFDEetta er texti meÿFFFFF0 ÿFFFFEDslenskum stÿFFFFF6fum';
    s/ÿFFFF/\\x/g;   # rewrite each ÿFFFF prefix as a literal \x escape
    decode('iso-8859-1', decode('unicode-escape', $_));
    # returns 'Þetta er texti með íslenskum stöfum'
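If installing Encode::Escape is not an option, the same idea can probably be expressed with a single substitution and only core modules. The following is a minimal sketch, not part of the original answer. It assumes the input is already a Perl character string in which ÿ is U+00FF, and that the two hex digits after ÿFFFF are a Latin-1 (ISO-8859-1) code point, as the answer's decode('iso-8859-1', ...) step implies. The helper name fix_mangled is hypothetical.

```perl
use strict;
use warnings;
use utf8;                 # this source file contains literal UTF-8 text
use Encode qw(encode);

# Hypothetical helper: replace each "ÿFFFFxx" marker with the character
# whose Latin-1 code point is the hex byte xx. Assumes $text is a
# character string (ÿ is U+00FF), not raw undecoded bytes.
sub fix_mangled {
    my ($text) = @_;
    $text =~ s/ÿFFFF([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

my $fixed = fix_mangled('ÿFFFFDEetta er texti meÿFFFFF0 ÿFFFFEDslenskum stÿFFFFF6fum');

# Encode explicitly on output so the terminal or browser receives UTF-8.
print encode('UTF-8', $fixed), "\n";
```

Because the substitution works on the hex value rather than on a list of known letters, it does not depend on the language of the text. If the CGI parameters arrive as raw bytes, they would need to be decoded first (or the pattern adjusted to match the byte form of ÿ), which depends on what the calling software actually sends.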