Replace Unicode characters in PostgreSQL

Question


Is it possible to replace all occurrences of a given character (specified by its Unicode code point) with another character (also specified by its Unicode code point) in a varchar field in PostgreSQL?

I tried something like this:

UPDATE mytable SET myfield = regexp_replace(myfield, '\u0050', '\u0060', 'g')

But it seems to write the literal string '\u0060' into the field, not the character corresponding to that code.


sql-update replace postgresql unicode

user1923631 Mar 03 '13 at 19:51


2 answers

mvp · Answer 1 · 2013-03-03T19:58:26+0000

According to the PostgreSQL documentation on lexical structure, you should use the U& syntax:

 UPDATE mytable SET myfield = regexp_replace(myfield, U&'\0050', U&'\0060', 'g')

You can also use the PostgreSQL-specific escape string form E'\u0050'. This works on older versions than the Unicode escape form does, but the Unicode escape form is preferred on newer versions. This should show what happens:

 regress=> SELECT '\u0050', E'\u0050', U&'\0050';
  ?column? | ?column? | ?column?
 ----------+----------+----------
  \u0050   | P        | P
 (1 row)
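As a cross-check, here is a minimal sketch (not part of the original answer) that builds the same characters from their code points with chr() and reads a code point back with ascii(). It assumes a UTF8 database, where both functions work with Unicode code points; the table and column names are the ones from the question.

```sql
-- chr() turns a code point into a character; ascii() goes the other way.
-- Assumption: the server encoding is UTF8.
SELECT chr(80)          AS from_char,   -- decimal 80 = U+0050
       chr(96)          AS to_char,     -- decimal 96 = U+0060
       ascii(U&'\0050') AS code_point;  -- code point of the first character

-- The same replacement written with chr() instead of escape syntax.
-- strpos() > 0 restricts the update to rows that contain the character.
UPDATE mytable
SET    myfield = replace(myfield, chr(80), chr(96))
WHERE  strpos(myfield, chr(80)) > 0;
```

This form may help when a client or driver interferes with backslashes in string literals, since the statement contains none.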

Erwin Brandstetter · Answer 2 · 2013-03-03T20:22:02+0000

It should work with the actual characters corresponding to those codes, unless the client or some other layer in the chain mangles your code!

Alternatively, use translate() or replace() for this simple job. Both are much faster than regexp_replace(). translate() is also good for several simple replacements at a time.
And avoid empty updates with a WHERE clause. Without it, every row gets a new row version even when nothing changes. The WHERE clause is much faster and avoids table bloat and the extra cost of VACUUM.

 UPDATE mytable
 SET    myfield = translate(myfield, 'P', '`')  -- actual characters
 WHERE  myfield <> translate(myfield, 'P', '`');

If you keep running into problems, use the escape syntax provided by @mvp:

 UPDATE mytable
 SET    myfield = translate(myfield, U&'\0050', U&'\0060')
 WHERE  myfield <> translate(myfield, U&'\0050', U&'\0060');
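To illustrate the point about several replacements at a time: translate() maps characters positionally, so one call can handle multiple pairs. This is a minimal sketch; the second pair (U+00E9 to U+0065) is a hypothetical addition for illustration and assumes a server encoding such as UTF8 that can represent it.

```sql
-- Each character in the second argument is replaced by the character at the
-- same position in the third: U+0050 -> U+0060, U+00E9 -> U+0065.
-- The second pair is hypothetical, added only to show a multi-character map.
UPDATE mytable
SET    myfield = translate(myfield, U&'\0050\00E9', U&'\0060\0065')
WHERE  myfield <> translate(myfield, U&'\0050\00E9', U&'\0060\0065');
```

Characters in the second argument that have no counterpart in the third are deleted, so keep both strings the same length unless removal is intended.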
