UTF-8 characters are displayed as ISO-8859-1

I have a problem inserting and reading UTF-8 content from the database. Every check I run indicates that the content in my database should be encoded in UTF-8, but it seems to be encoded in Latin-1 (ISO-8859-1). The data was originally imported by a PHP script run from the CLI.

Configuration:

    Zend Framework Version: 1.10.5
    mysql-server-5.0: 5.0.51a-3ubuntu5.7
    php5-mysql: 5.2.4-2ubuntu5.10
    apache2: 2.2.8-1ubuntu0.16
    libapache2-mod-php5: 5.2.4-2ubuntu5.10

Verifications:

-mysql:

    mysql> SHOW VARIABLES LIKE 'character_set%';
    +--------------------------+----------------------------+
    | Variable_name            | Value                      |
    +--------------------------+----------------------------+
    | character_set_client     | utf8                       |
    | character_set_connection | utf8                       |
    | character_set_database   | utf8                       |
    | character_set_filesystem | binary                     |
    | character_set_results    | utf8                       |
    | character_set_server     | utf8                       |
    | character_set_system     | utf8                       |
    | character_sets_dir       | /usr/share/mysql/charsets/ |
    +--------------------------+----------------------------+
    8 rows in set (0.00 sec)

    mysql> SHOW VARIABLES LIKE 'collation%';
    +----------------------+-----------------+
    | Variable_name        | Value           |
    +----------------------+-----------------+
    | collation_connection | utf8_general_ci |
    | collation_database   | utf8_bin        |
    | collation_server     | utf8_general_ci |
    +----------------------+-----------------+

-database

created with

    CREATE DATABASE mydb CHARACTER SET utf8 COLLATE utf8_bin;
    CREATE SCHEMA `mydb` DEFAULT CHARACTER SET utf8 COLLATE utf8_bin;

    mysql> status;
    --------------
    mysql Ver 14.12 Distrib 5.0.51a, for debian-linux-gnu (i486) using readline 5.2

    Connection id:          7
    Current database:       mydb
    Current user:           root@localhost
    SSL:                    Not in use
    Current pager:          stdout
    Using outfile:          ''
    Using delimiter:        ;
    Server version:         5.0.51a-3ubuntu5.7-log (Ubuntu)
    Protocol version:       10
    Connection:             Localhost via UNIX socket
    Server characterset:    utf8
    Db     characterset:    utf8
    Client characterset:    utf8
    Conn.  characterset:    utf8
    UNIX socket:            /var/run/mysqld/mysqld.sock
    Uptime:                 9 min 45 sec

-sql: before doing my insertions I run

 SET names 'utf8'; 

-php: before doing my insertions, I use utf8_encode() and mb_detect_encoding(), which gives me "UTF-8". After extracting the content from the db and before sending it to the user, mb_detect_encoding() also gives "UTF-8".

Verification Check:

The only way for me to display the content correctly is to set the content type to Latin-1 (if I sniff the traffic, I see the Content-Type header with ISO-8859-1):

 ini_set('default_charset', 'ISO-8859-1'); 

This test shows that the content comes out as Latin-1. I do not understand why. Does anybody know?

Thanks.

2 answers

Well, I've found that SET NAMES actually isn't that great. Take a look at the docs ...

I usually do 4 queries:

    SET CHARACTER SET 'UTF8';
    SET character_set_database = 'UTF8';
    SET character_set_connection = 'UTF8';
    SET character_set_server = 'UTF8';
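From PHP, these statements would be sent once per connection, before any other query. Here is a minimal sketch of that, not a definitive implementation: `applyUtf8Session` and the `$exec` callable are hypothetical names, and `$exec` stands in for whatever your database layer uses to send a single statement (for example a closure around `$pdo->exec($sql)`).

```php
<?php
/**
 * Sends the four charset statements on a connection.
 * $exec is a hypothetical callable that executes one SQL statement,
 * e.g. function ($sql) use ($pdo) { $pdo->exec($sql); }.
 */
function applyUtf8Session(callable $exec): array
{
    $statements = [
        "SET CHARACTER SET 'UTF8'",
        "SET character_set_database = 'UTF8'",
        "SET character_set_connection = 'UTF8'",
        "SET character_set_server = 'UTF8'",
    ];
    foreach ($statements as $sql) {
        $exec($sql);
    }
    return $statements;
}

// Dry run: collect the statements instead of sending them to MySQL.
$sent = [];
applyUtf8Session(function (string $sql) use (&$sent): void {
    $sent[] = $sql;
});
print_r($sent);
```

Passing the executor in as a callable keeps the sketch independent of the driver, so the same function can be tried without a live connection.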

Give that a shot and see if it does it for you ...

Oh, and remember that all UTF-8 characters <= 127 are also valid ISO-8859-1 characters. So if the stream contains only characters <= 127, mb_detect_encoding cannot tell the two apart and will fall back to the charset with the highest precedence (which by default is "UTF-8") ...
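To illustrate why a "UTF-8" result from detection proves little, here is a small sketch using `mb_check_encoding()` and `mb_convert_encoding()`. The last part rests on an assumption that the question does not confirm: that the imported data was already UTF-8 when it was converted from ISO-8859-1 to UTF-8 again (which is the conversion `utf8_encode()` performs).

```php
<?php
// Bytes 0x00-0x7F mean the same thing in ASCII, ISO-8859-1 and UTF-8,
// so a pure-ASCII string is valid in all of them.
$ascii = "hello";
$asciiIsUtf8   = mb_check_encoding($ascii, 'UTF-8');       // true
$asciiIsLatin1 = mb_check_encoding($ascii, 'ISO-8859-1');  // true

// "é" as a single ISO-8859-1 byte is not valid UTF-8 ...
$latin1 = "\xE9";
$latin1IsUtf8 = mb_check_encoding($latin1, 'UTF-8');       // false

// ... but converted once, it becomes the two-byte UTF-8 sequence C3 A9.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

// Assumption: the string is already UTF-8 and gets converted again.
// The result is still valid UTF-8, but it now spells "Ã©", not "é".
$double = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');
$doubleIsUtf8 = mb_check_encoding($double, 'UTF-8');       // true

echo bin2hex($utf8), "\n";    // c3a9
echo bin2hex($double), "\n";  // c383c2a9
```

If that assumption holds, comparing `bin2hex()` of a known accented character before insertion and after retrieval is a more reliable check than encoding detection.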

  • What do you do before retrieving the data? Do you also run "SET NAMES utf8;"? If not, MySQL will silently convert the results to whatever encoding the connection specifies.
  • If that is not it either, what does SHOW FULL COLUMNS FROM table; show? A table having a default encoding does not mean every column uses it, i.e. this is valid:


 CREATE TABLE test ( `name` varchar(10) character set latin1 ) CHARSET=utf8 
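To check for this across a whole table, the rows returned by SHOW FULL COLUMNS can be scanned for text columns whose collation is not a utf8 one. A rough sketch, assuming the rows have been fetched as associative arrays with the `Field` and `Collation` keys that MySQL reports; `nonUtf8Columns` and the sample rows (including the `id` and `bio` columns) are hypothetical.

```php
<?php
/**
 * Returns the names of columns whose collation is not a utf8 one.
 * $rows is expected to look like the result of SHOW FULL COLUMNS,
 * i.e. arrays with at least "Field" and "Collation" keys.
 */
function nonUtf8Columns(array $rows): array
{
    $bad = [];
    foreach ($rows as $row) {
        $collation = $row['Collation'] ?? null;
        // Non-text columns (INT, DATE, ...) report a NULL collation.
        if ($collation !== null && strpos($collation, 'utf8') !== 0) {
            $bad[] = $row['Field'];
        }
    }
    return $bad;
}

// Hypothetical sample rows, modelled on the CREATE TABLE above.
$rows = [
    ['Field' => 'id',   'Type' => 'int(11)',     'Collation' => null],
    ['Field' => 'name', 'Type' => 'varchar(10)', 'Collation' => 'latin1_swedish_ci'],
    ['Field' => 'bio',  'Type' => 'text',        'Collation' => 'utf8_bin'],
];
print_r(nonUtf8Columns($rows));
```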

Source: https://habr.com/ru/post/1316571/

