Latin-1 / UTF-8 encoding in PHP

I have a database encoded in UTF-8 with some Latin-1 data mixed in. (I think this is the problem.)

Here's what the characters look like in the database.

Ä° (should be İ) è

When I set the header to

 <meta http-equiv="Content-Type" content="text/html;charset=UTF-8"> 

Then the characters come out as:

  İ

When I remove the header, they come out exactly as they are in the database. I want them to come out like this:

  İ è

I am looking for a way to fix this in PHP, if that is possible. At the moment I cannot fix the data itself, which would be the proper solution.

+1
4 answers

Your HTML output must be in a single encoding; there is no way around this. This means that content in any other encoding must first be converted to the encoding of the HTML page. Although this can be done with iconv or mb_convert_encoding, you have to solve two problems:

  • You need to know (or guess) the current encoding of the content
  • You need to do it manually everywhere

For example, a theoretical solution would be to choose UTF-8 as the HTML encoding, and then do this for every string you are going to output:

 $string = '...'; // from the database

 // If it is not already UTF-8, convert it
 if (mb_detect_encoding($string, 'utf-8', true) === false) {
     $string = mb_convert_encoding($string, 'utf-8', 'iso-8859-1');
 }

 echo $string;

The code above assumes that any content that is not valid UTF-8 is encoded in Latin-1, which is reasonable given your question.
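To avoid repeating that check at every echo, the conversion could be wrapped in a small helper. This is only a sketch: the name toUtf8 is made up for illustration, and it keeps the same assumption that anything which is not valid UTF-8 is Latin-1.

```php
<?php
// Hypothetical helper (not a PHP built-in): returns $string as UTF-8.
// Assumption: input that is not valid UTF-8 is ISO-8859-1 (Latin-1).
function toUtf8(string $string): string
{
    if (mb_check_encoding($string, 'UTF-8')) {
        return $string; // already valid UTF-8, leave it untouched
    }
    return mb_convert_encoding($string, 'UTF-8', 'ISO-8859-1');
}

// Usage: a Latin-1 "è" (byte 0xE8) and a UTF-8 "İ" (bytes 0xC4 0xB0)
$rows = ["\xE8", "\xC4\xB0"];
foreach ($rows as $row) {
    echo toUtf8($row), "\n";
}
```

Keep in mind that a Latin-1 string whose bytes happen to form valid UTF-8 will pass through unchanged, so this kind of detection is a heuristic, not a guarantee.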

+8

Maybe you should set utf8 as the connection character set, so that the characters are returned correctly. The default may not be right for the characters you need.

More details here: mysql_set_charset
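Note that the old mysql_* extension was removed in PHP 7, so on current PHP the same idea would go through mysqli. A minimal sketch, assuming the mysqli extension is available; the function name connectUtf8 and its parameters are placeholders, not something from the question:

```php
<?php
// Hypothetical helper: opens a mysqli connection and asks the server
// to exchange UTF-8 with this script.
function connectUtf8(string $host, string $user, string $pass, string $db): mysqli
{
    $link = new mysqli($host, $user, $pass, $db);
    if ($link->connect_errno) {
        throw new RuntimeException('Connect failed: ' . $link->connect_error);
    }
    // 'utf8mb4' is MySQL's full UTF-8; very old servers may only know 'utf8'.
    if (!$link->set_charset('utf8mb4')) {
        throw new RuntimeException('Could not set charset: ' . $link->error);
    }
    return $link;
}

// Usage (placeholder credentials):
// $link = connectUtf8('localhost', 'user', 'secret', 'mydb');
```

Setting the charset through the API rather than a raw SET NAMES query also tells the client library which encoding is in use, which matters for escaping.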

+2

In this case, you need to check three things. The character encoding of the database table contents does not matter by itself, because in MySQL you can set the character encoding used between the database server and your PHP script. See http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html. If you use SET NAMES / SET CHARACTER SET correctly, the connection will deliver UTF-8 characters.

You need to check the "physical" (byte-level) character encoding of your PHP script file. Set it to UTF-8 in whatever text editor or IDE you are using.
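If you would rather check the file from PHP than trust the editor, something like the following could do it. It is only a sketch; isUtf8File is a made-up name, and a pure ASCII file will also pass, since ASCII is a subset of UTF-8.

```php
<?php
// Hypothetical helper: true if the file's bytes form valid UTF-8.
function isUtf8File(string $path): bool
{
    $bytes = file_get_contents($path);
    if ($bytes === false) {
        throw new RuntimeException("Cannot read $path");
    }
    return mb_check_encoding($bytes, 'UTF-8');
}

// Usage: check the script that is currently running
var_dump(isUtf8File(__FILE__));
```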

You need to use the appropriate HTML header; you wrote it correctly above.

If all three match, the output should be fine.

The only remaining problem is text that was saved to the database table in the wrong character encoding.
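That last case looks like what the question shows: "Ä°" is what you get when the two UTF-8 bytes of "İ" are read as Latin-1 and stored as UTF-8 again. A rough sketch of undoing that at output time follows. The function name is invented, and the assumption is that the damage was done with plain Latin-1 (not Windows-1252) and only once.

```php
<?php
// Hypothetical helper: undoes one round of "UTF-8 read as Latin-1,
// then saved as UTF-8". Strings that do not look damaged are returned as-is.
function fixDoubleEncodedUtf8(string $s): string
{
    // Only valid UTF-8 whose code points all fit in Latin-1 can be a candidate.
    if (!mb_check_encoding($s, 'UTF-8') || preg_match('/[\x{100}-\x{10FFFF}]/u', $s)) {
        return $s;
    }
    // Map each code point back to a single byte.
    $candidate = mb_convert_encoding($s, 'ISO-8859-1', 'UTF-8');
    // Accept the result only if those bytes are themselves valid UTF-8.
    return mb_check_encoding($candidate, 'UTF-8') ? $candidate : $s;
}

// Usage: "\xC3\x84\xC2\xB0" is the double-encoded form of "İ"
echo fixDoubleEncodedUtf8("\xC3\x84\xC2\xB0"), "\n";
```

This is a heuristic: text that legitimately contains a sequence such as "Ã©" would be "repaired" as well, so fixing the stored data remains the safer route when that becomes possible.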

+1

I know this is an old post, but in case someone runs into this problem, here is what I did to solve it.

1) Export the table(s) to SQL.

2) Open the SQL file in Notepad++ or another editor.

3) Copy everything and paste it into a new file (or into Notepad, and save it as Unicode).

4) I have this in my exported file:

  /*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
  /*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
  /*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
  /*!40101 SET NAMES latin1 */;

in which I change SET NAMES from latin1 to utf8:

  /*!40101 SET NAMES utf8 */; 

If you do not have this line, just add it as a new line. Then, in

 CREATE TABLE IF NOT EXISTS `table_name` (
   -- column names....
 ) ENGINE=MyISAM AUTO_INCREMENT=301 DEFAULT CHARSET=latin1;

change

 DEFAULT CHARSET=latin1; 

to

 DEFAULT CHARSET=utf8; 

Delete the old tables (after backing them up, of course) and import this new file.

It worked for me. Hope this helps.
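Since the question asks about PHP, the two text replacements in step 4 could also be scripted instead of done by hand. A minimal sketch; the function name and the file names in the usage comment are made up, and it only rewrites the charset declarations, it does not re-encode the data in the dump.

```php
<?php
// Hypothetical helper: switches the charset declarations in a SQL dump
// from latin1 to utf8. The row data itself is left untouched.
function switchDumpToUtf8(string $sql): string
{
    return str_replace(
        ['SET NAMES latin1', 'DEFAULT CHARSET=latin1'],
        ['SET NAMES utf8', 'DEFAULT CHARSET=utf8'],
        $sql
    );
}

// Usage (placeholder file names):
// file_put_contents('dump_utf8.sql', switchDumpToUtf8(file_get_contents('dump.sql')));
```

A plain string replacement like this would also change those phrases if they occurred inside the data, so it is worth checking the result before importing.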

+1
