Take a look at this thread (and please use the search before posting a question).
In your case, I think you forgot to set the correct charset for your database connection (using the SET NAMES statement or mysql_set_charset()), but it's hard to say.
Here is a quote from chazomaticus, who gave an excellent answer in the linked thread, listing all the points you need to take care of:
Storage:
- Specify the utf8_unicode_ci (or equivalent) collation on all tables and text columns in your database. This makes MySQL physically store and retrieve values natively in UTF-8.
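As a rough sketch of the storage step, the snippet below builds the statement that converts an existing table to UTF-8. The helper name `utf8ConversionSql` and the table name `articles` are made up for illustration; only the `ALTER TABLE ... CONVERT TO CHARACTER SET` syntax comes from MySQL itself.

```php
<?php
// Hypothetical helper: builds the DDL that converts one table (and its
// text columns) to the utf8 character set with the utf8_unicode_ci collation.
function utf8ConversionSql(string $table): string
{
    // Identifiers cannot be bound as query parameters, so accept only
    // plain names made of letters, digits and underscores.
    if (!preg_match('/^[A-Za-z0-9_]+$/', $table)) {
        throw new InvalidArgumentException("Unexpected table name: $table");
    }
    return "ALTER TABLE `$table` CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci";
}

// 'articles' is an assumed table name.
echo utf8ConversionSql('articles'), PHP_EOL;
```

On newer MySQL versions utf8mb4 with a matching collation is the usual choice for full Unicode coverage; utf8 is kept here to match the text above.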
Retrieval:
- In PHP, in whatever DB wrapper you use, you need to set the connection charset to utf8. This way, MySQL does no conversion from its native UTF-8 when it hands data to PHP.
- Note: if you are not using a DB wrapper, you will probably have to issue a query telling MySQL to give you results in UTF-8: SET NAMES 'utf8' (as soon as you connect).
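One possible way to set the connection charset, assuming PDO with the MySQL driver is the wrapper in use. The helper names `utf8Dsn` and `connectUtf8`, and any host, database or credentials passed to them, are placeholders and not part of the quoted answer.

```php
<?php
// Hypothetical helper: builds a PDO DSN that requests a UTF-8 connection.
function utf8Dsn(string $host, string $dbname): string
{
    return "mysql:host={$host};dbname={$dbname};charset=utf8";
}

// Hypothetical helper: opens the connection with exceptions enabled.
function connectUtf8(string $host, string $dbname, string $user, string $pass): PDO
{
    $pdo = new PDO(utf8Dsn($host, $dbname), $user, $pass, [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);
    // The manual route from the text, issued right after connecting.
    // It is redundant when the charset in the DSN is honoured.
    $pdo->exec("SET NAMES 'utf8'");
    return $pdo;
}
```

With mysqli, the corresponding call is `$mysqli->set_charset('utf8')`. The mysql_set_charset() function mentioned above belongs to the old mysql extension, which was removed in PHP 7.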
Delivery:
- You must tell PHP to send the correct headers to the client, so the text is interpreted as UTF-8. In PHP, you can use the default_charset php.ini option, or issue the Content-Type header manually, which is more work but has the same effect.
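Both delivery options might look roughly like this. The snippet assumes an HTML response; either option alone should be enough.

```php
<?php
// Option 1: set the default charset at runtime, which has the same effect
// as default_charset = "UTF-8" in php.ini for this request.
ini_set('default_charset', 'UTF-8');

// Option 2: send the Content-Type header explicitly.
// Headers can only be sent before any output has been produced.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}
```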
Submission:
- You want all data sent to you by browsers to be in UTF-8. Unfortunately, the only way to do this reliably is to add the accept-charset attribute to all your <form> tags: <form ... accept-charset="UTF-8">.
- Note that the W3C HTML specification says that clients "should" by default send forms back to the server in whatever charset the server served, but this is apparently only a recommendation, hence the need to be explicit on every <form> tag.
- Even so, you still want to verify every received string as valid UTF-8 before trying to store it or use it anywhere. PHP's mb_check_encoding() does the trick, but you have to use it religiously.
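A minimal sketch of the validation step, assuming the mbstring extension is loaded. The helper name `validUtf8OrNull` is invented here; in a real application it would be applied to every value read from $_GET, $_POST and similar sources.

```php
<?php
// Hypothetical helper: returns the value unchanged if it is a valid
// UTF-8 string, and null for anything else (including non-strings).
function validUtf8OrNull($value): ?string
{
    if (!is_string($value)) {
        return null;
    }
    return mb_check_encoding($value, 'UTF-8') ? $value : null;
}

// "café" with é encoded as the two UTF-8 bytes C3 A9: accepted.
var_dump(validUtf8OrNull("caf\xC3\xA9") !== null); // bool(true)

// "café" with é as the single Latin-1 byte E9: rejected.
var_dump(validUtf8OrNull("caf\xE9") !== null);     // bool(false)
```

Whether to reject invalid input outright or try to convert it is a design choice; rejecting is the simpler option and avoids guessing the original encoding.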
Processing:
- This, unfortunately, is the hard part. You have to make sure that every time you process a UTF-8 string, you do it safely. The easiest way is to make extensive use of PHP's mbstring extension.
- PHP's string operations are NOT UTF-8 safe by default. There are some things you can safely do with normal PHP string operations (concatenation, for example), but for most things you should use the equivalent mbstring function.
- To know what you are doing (read: not mess it up), you really need to know UTF-8 and how it works at the lowest possible level. Check out any of the links from utf8.com for some resources to learn everything you need to know.
- In addition, I feel it should be said somewhere, although it may seem obvious: every PHP or HTML file you serve must be encoded in valid UTF-8.
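To illustrate the point about byte-oriented string functions, here is a small comparison, assuming the mbstring extension is available. The sample word is arbitrary.

```php
<?php
// "naïve": five characters, but ï takes two bytes in UTF-8.
$word = "na\xC3\xAFve";

echo strlen($word), PHP_EOL;             // 6 (bytes)
echo mb_strlen($word, 'UTF-8'), PHP_EOL; // 5 (characters)

// substr() works on bytes and cuts the two-byte ï in half.
$broken = substr($word, 0, 3);
// mb_substr() works on characters and keeps ï intact.
$safe = mb_substr($word, 0, 3, 'UTF-8');

var_dump(mb_check_encoding($broken, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($safe, 'UTF-8'));   // bool(true)
```

Concatenation is safe because joining two valid UTF-8 strings never splits a character; slicing, length checks and case conversion are where the mbstring functions matter.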
Note that you do not have to use UTF-8; the important part is to use the same encoding everywhere, whatever encoding that may be. But if you need to change something anyway, use UTF-8.