We have a lot of database data that someone entered manually. It contains many British pound signs (£). The original user was copying and pasting the pound sign from somewhere, though I'm not sure where (or whether that matters).
In any case, when printing data on a PHP page, the pound signs are displayed as a replacement character . There is <meta charset="utf-8"/>
on the page. In the browser, if you change the encoding to ISO-8859-1
, then the pound signs will appear correctly.
After some digging, I came to the conclusion that the original data entry person copied and pasted an ISO-8859-1 encoded pound sign into the database. Therefore, if a page is not displayed using ISO-8859-1, the pound sign will not display correctly.
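To illustrate the byte-level mismatch I think is happening, here is a small Python sketch (Python is used only for illustration; the page itself is PHP). It assumes the stored pound sign is the single ISO-8859-1 byte 0xA3, which is my guess and not something I have verified in the database.

```python
# Assumption: the database holds the pound sign as the single ISO-8859-1 byte 0xA3.
stored = b"\xa3" + b"100"

# Read as ISO-8859-1, the byte maps to U+00A3 (the pound sign).
as_latin1 = stored.decode("iso-8859-1")

# Read as UTF-8, a lone 0xA3 is an invalid sequence, so it becomes U+FFFD,
# the replacement character the browser shows.
as_utf8 = stored.decode("utf-8", errors="replace")

# A real UTF-8 pound sign is two bytes: 0xC2 0xA3.
utf8_pound = "\u00a3".encode("utf-8")

print(repr(as_latin1), repr(as_utf8), utf8_pound)
```

If that assumption holds, it would explain why the same bytes look right under ISO-8859-1 and wrong under UTF-8.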
Here is the header information from Chrome:
Request URL: http://www.mysite.com/test.php
Request Method: GET
Status Code: 200 OK

Request Headers
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Cache-Control: max-age=0
Connection: keep-alive
Cookie: X-Mapping-goahf....
Host: www.mysite.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.121 Safari/535.2

Response Headers
Connection: Keep-Alive
Content-Type: text/html; charset=UTF-8
Date: Wed, 07 Dec 2011 22:38:14 GMT
Server: Apache/2.2
Transfer-Encoding: chunked
The MySQL table also reports that it uses the latin1_swedish_ci collation, which was the default.
So how do I solve this problem? I don’t know much about how character encoding works and what happens when you copy / paste characters from one place to another.
I tried to go to this page:
http://www.fileformat.info/info/unicode/char/a3/browsertest.htm
I copied the pound symbol from it and pasted it into the database, thinking that would fix it, but it didn't seem to ... How do I make the pound sign in the database a UTF-8 pound sign instead of an ISO-8859-1 one?
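The conversion I am asking about might look something like the following Python sketch. The helper name `to_utf8` is hypothetical, and it assumes that any value which is not already valid UTF-8 is ISO-8859-1. It only shows the byte transformation and is not a tested fix for the actual database.

```python
def to_utf8(raw: bytes) -> bytes:
    """Return raw as UTF-8 bytes, assuming non-UTF-8 input is ISO-8859-1."""
    try:
        # If this succeeds, the value is already valid UTF-8; leave it alone.
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        # Assumption: anything else was stored as ISO-8859-1.
        return raw.decode("iso-8859-1").encode("utf-8")


print(to_utf8(b"\xa3100"))      # ISO-8859-1 input is re-encoded
print(to_utf8(b"\xc2\xa3100"))  # UTF-8 input passes through unchanged
```

Checking for valid UTF-8 first is meant to avoid double-encoding values that have already been converted.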