Why is the UTF-8 data from my mod_perl application still garbled in the web browser?

Before starting, I would like to outline the structure of what I work with.

  • There is a text file from which specific text is taken. The file is encoded in UTF-8.
  • Perl reads the file and prints it on the page. Everything is displayed as it should be. Perl is configured to use UTF-8.
  • The page generated by Perl contains the following meta tag: <meta content="text/html;charset=utf-8" http-equiv="content-type"/> . Hence UTF-8.
  • After the initial page load, everything is loaded dynamically through jQuery / AJAX. When paging through, the same text can be loaded again, only this time JavaScript loads it. The request has the following header: Content-Type: application/x-www-form-urlencoded; charset=UTF-8
  • The Perl handler that processes the AJAX request on the backend delivers its content as UTF-8.
  • The AJAX handler calls a function in our custom framework. Before the framework prints the text, it is displayed correctly as "öäü". After it is passed to the AJAX handler, it reads "\x{c3}\x{b6}\x{c3}\x{a4}\x{c3}\x{bc}", which is the UTF-8 byte representation of "öäü".
  • After the AJAX handler delivers its package to the client as JSON, the web page prints the following: "Ã¶Ã¤Ã¼".
  • The JS and Perl files are saved as UTF-8 (the default setting in Eclipse).

These are the symptoms. I tried everything Google told me and I still have a problem. Does anyone know what this could be? If you need any specific piece of code, tell me about it and I will try to insert it.

Edit 1

AJAX handler response header

 Date: Mon, 09 Nov 2009 11:40:27 GMT
 Server: Apache/2.2.10 (Linux/SUSE)
 Keep-Alive: timeout=15, max=100
 Connection: Keep-Alive
 Transfer-Encoding: chunked
 Content-Type: text/html; charset="utf-8"
 200 OK

Answer

With your help and this page, I was able to track down the problem. It seems that the problem was not the encoding itself, but Perl encoding my $text variable twice as UTF-8 (according to that page). The solution was as simple as adding a call to Encode::decode_utf8().
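The double encoding described above can be reproduced in a few lines. This is only a minimal sketch, not the asker's actual code: the variable names are made up for illustration, and it assumes the string reaches the handler as raw UTF-8 bytes, as in the "\x{c3}\x{b6}..." dump from the question.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8 encode_utf8);

# Assumed input: the six UTF-8 bytes the AJAX handler saw for "öäü".
my $bytes = "\xc3\xb6\xc3\xa4\xc3\xbc";

# Encoding the undecoded bytes again treats every byte as a Latin-1
# character, so the result is double-encoded and renders as "Ã¶Ã¤Ã¼".
my $double = encode_utf8($bytes);

# Decoding first yields three characters; encoding once on output
# then gives back the original six bytes.
my $text  = decode_utf8($bytes);
my $fixed = encode_utf8($text);

printf "double: %d bytes, fixed: %d bytes, chars: %d\n",
    length($double), length($fixed), length($text);
```

Comparing byte lengths like this is a quick way to check whether a string has been encoded once or twice before it leaves the handler.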

At first I searched in a completely wrong place. I thank everyone who helped me find the right place :)

# spreads some upvote love #

+4
2 answers

returns the following: &#38;&#65;&#116;&#105;&#108;&#100;&#101;&#59;&#38;&#112;&#97;&#114;&#97;&#59;...

That is:

 &Atilde;&para;&Atilde;&curren;&Atilde;&frac14; 

This tells us that your AJAX handler runs its output through an HTML entity encoding function that assumes its input is in the ISO-8859-1 character set. You could use an entity encoder that knows about UTF-8 instead, but it would probably be easier to encode only the potentially special characters <>&"' and no others.
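A minimal sketch of that approach follows. The helper name escape_html_minimal and the lookup table are hypothetical, not part of any module, and the sketch assumes the input is already a decoded character string.

```perl
use strict;
use warnings;

# Hypothetical lookup table: only the characters that are special in HTML.
my %ESCAPE = (
    '&' => '&amp;',
    '<' => '&lt;',
    '>' => '&gt;',
    '"' => '&quot;',
    "'" => '&#39;',
);

# Hypothetical helper: replace the five special characters, leave the rest alone.
sub escape_html_minimal {
    my ($text) = @_;
    $text =~ s/([&<>"'])/$ESCAPE{$1}/g;
    return $text;
}

# Non-ASCII characters pass through unchanged (\x{f6} is "ö").
my $escaped = escape_html_minimal("<b>\x{f6}</b> & 'x'");
```

Because nothing outside those five characters is touched, the function makes no assumption about the character set, and the umlauts are left for the normal UTF-8 output encoding to handle.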

The request has the following header: Content-Type: application/x-www-form-urlencoded; charset=UTF-8

There is no charset parameter for the MIME type application/x-www-form-urlencoded , so it will be ignored. Form-encoded strings are inherently byte-based; it is up to the application to decide which character set to treat them as (if any; the application may just want bytes).
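As an illustration of the application making that decision, here is a rough sketch of decoding a single form-encoded value by hand. The helper decode_form_value is hypothetical, and the choice of UTF-8 is an assumption about what the client sends; a real application would normally let its request-parsing module do the percent-decoding.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: turn one form-encoded value into a character string,
# assuming the client sent UTF-8.
sub decode_form_value {
    my ($raw) = @_;
    (my $bytes = $raw) =~ tr/+/ /;                   # '+' means space in form encoding
    $bytes =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # percent-escapes become raw bytes
    return decode('UTF-8', $bytes);                  # the application picks the charset here
}

my $value = decode_form_value('%C3%B6%C3%A4%C3%BC+ok');
printf "%d characters\n", length($value);
```

The percent-decoding step only produces bytes; it is the explicit decode call that turns them into characters.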

+6

This is not so much an answer as a debugging suggestion. The first thing that comes to mind is to try sending HTML entities such as &#1234; instead of UTF-8 encoded characters. To get Perl to send them, you can use a module, or you can just do

  # assumes $text is already declared and holds decoded characters, not UTF-8 bytes
  $text =~ s/(.)/"&#" . ord($1) . ";"/gse; 

What seems to me the most likely cause of this problem is that the JavaScript end is not able to understand the UTF-8 encoded data coming from Perl.

+2
