Replacing strange characters in MySQL

Some of my texts look weird, and I need to replace some characters in them. However, I am having problems with one specific character, shown below (JavaScript code to show the difference between the two characters):

<script> alert('–'.charCodeAt(0) + ':' + '-'.charCodeAt(0)); </script> 

In MySQL, I tried to apply the following query:

 UPDATE translation SET columnx = REPLACE(columnx, '–', '-'); 

But this affects 0 rows. So the question is: what is the correct query to replace these strange characters with the correct one?

UPDATE

The strange char is displayed as follows (square):

[Image: weird char, displayed as a square]

In JSON, it is encoded as \u0096 instead of -

2 answers

This does not look like an encoding problem, but a collation problem. The collation determines how MySQL treats "almost equal" characters when it sorts or compares them.

For example, the default iso-8859-15 collation treats ü = u

What you can do is compare your field using a binary collation. A binary collation does not treat similar characters as equal.

Choose the correct binary collation:
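As an analogy only, the difference between a loose and a binary comparison can be sketched in JavaScript. This uses `Intl.Collator`, which is not MySQL's collation engine, so the exact set of characters treated as equal will differ; the names `loose`, `looseEqual` and `binaryEqual` are made up for this illustration.

```javascript
// Analogy only: JavaScript's Intl.Collator, not MySQL's collation rules.
// "base" sensitivity ignores accents and case, like an accent-insensitive collation.
const loose = new Intl.Collator("en", { sensitivity: "base" });

// Loose comparison: "ü" and "u" share the same base letter.
const looseEqual = loose.compare("ü", "u") === 0;

// Strict comparison by code point, comparable to a binary collation.
const binaryEqual = "ü" === "u";

console.log("loose comparison equal:", looseEqual);
console.log("binary comparison equal:", binaryEqual);
```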

 SELECT CHARACTER_SET_NAME, COLLATION_NAME FROM information_schema.COLLATIONS WHERE COLLATION_NAME LIKE '%bin%'; 

Then run your update as follows:

 UPDATE translation SET columnx = REPLACE( columnx COLLATE latin1_bin, '–', '-' ); 

CORRECTION: REPLACE comparisons are always performed using a binary collation.

EDIT:

If you still get 0 rows, you are probably not replacing the correct character. Convert the string containing the character to hexadecimal and post the hexadecimal value so that we can find out which character we are talking about.

e.g.

 SELECT HEX( columnx ) FROM translation LIMIT 1; 

EDIT2:

You just said that you got \u0096, which is a control character called START OF GUARDED AREA. Encoded as UTF-8, this is 0xC2 0x96 in hexadecimal. In your sample query, you are replacing a different character, called EN DASH.
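The byte values can be checked outside the database. The sketch below is a minimal illustration, assuming the column is stored as UTF-8; `utf8Hex` is a hypothetical helper written for this example, not a MySQL or JavaScript built-in. It shows that the stored control character and the en dash from the original query are different byte sequences, which is why that REPLACE matched nothing. A likely explanation for how the control character got there is that byte 0x96 is the en dash in Windows-1252, and it was read as ISO-8859-1 at some point.

```javascript
// Hypothetical helper: UTF-8 bytes of a string as uppercase hex,
// roughly what HEX(columnx) returns for a UTF-8 column.
function utf8Hex(text) {
  const bytes = new TextEncoder().encode(text);
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0").toUpperCase()).join("");
}

const controlChar = "\u0096"; // the character actually stored in the column
const enDash = "\u2013";      // the character used in the original REPLACE
const hyphen = "-";           // the desired replacement

console.log("U+0096:", utf8Hex(controlChar));
console.log("U+2013:", utf8Hex(enDash));
console.log("hyphen:", utf8Hex(hyphen));
```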

It is hard to replace a control character just by pasting it into the query, because conversions along the way can break it. Instead, you can use UNHEX(hexval) to tell MySQL exactly which character you mean:

 UPDATE translation SET columnx = REPLACE( columnx, UNHEX( 'C296' ), '-' ); 

Or, to make it more understandable (or even more confusing :)), you can also pass the "normal" hyphen as a hexadecimal value:

 UPDATE translation SET columnx = REPLACE( columnx, UNHEX( 'C296' ), UNHEX( '2D' ) ); 
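Before running the UPDATE on real data, it can help to check the expected result on a sample value. The following is only a sketch of the equivalent replacement on the application side, assuming the value has already been decoded into a string; `replaceControlDash` and the sample text are made up for this illustration.

```javascript
// Hypothetical helper: replace every U+0096 control character with a plain hyphen.
function replaceControlDash(text) {
  return text.split("\u0096").join("-");
}

// Made-up sample value containing the control character.
const sample = "2010\u00962012";
const fixed = replaceControlDash(sample);

// JSON.stringify makes any remaining invisible characters easier to spot.
console.log(JSON.stringify(fixed));
```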

As Alvaro said, you really should try changing your database to the correct character set. Usually the utf-8 character set should be sufficient.

For more information, see here: http://dev.mysql.com/doc/refman/5.0/en/charset-applications.html

If you do not have the rights to do this, take a look at http://dev.mysql.com/doc/refman/5.1/de/charset-convert.html and also https://dba.stackexchange.com/questions/9944/mysql-transfer-iso-8859-1-to-utf-8
