PostgreSQL doesn't support character classes based on the Unicode Character Database the way .NET does. You do get the more standard [[:alpha:]] character class, but it is locale-dependent and probably won't cover everything you need.
You may be able to get away with just blacklisting the ASCII characters you don't want and allowing all non-ASCII characters, e.g. something like:
[^\s!"#$%&'()*+,\-./:;<=>?\[\\\]^_`~]+
(JavaScript doesn't have non-ASCII character classes either, or even [[:alpha:]].)
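As a rough illustration of the blacklist idea, the query below anchors the pattern above with ^ and $ and tests a few sample strings. It is only a sketch: the sample values and the column names word and allowed are made up for the example, it assumes a UTF-8 database with standard_conforming_strings on (the default), and the apostrophe inside the pattern is doubled because it sits in an SQL string literal.

```sql
-- Sample values are illustrative only, not from the original answer.
SELECT word,
       -- true only if every character is outside the ASCII blacklist
       word ~ '^[^\s!"#$%&''()*+,\-./:;<=>?\[\\\]^_`~]+$' AS allowed
FROM (VALUES ('naïve'), ('日本語'), ('foo.bar'), ('two words')) AS t(word);
```

The first two values should come back as allowed, since they contain only letters outside the blacklist, while 'foo.bar' and 'two words' should be rejected because of the dot and the space.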
For example, given v_text as a text variable to be sanitized:
-- Allow internationalized text characters and remove undesired characters
v_text = regexp_replace(
    lower(trim(v_text)),
    '[!"#$%&()*+,./:;<=>?\[\\\]\^_\|~]+',
    '',
    'g'
);
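If you want to try this outside a PL/pgSQL block, one option is to wrap the same expression in a plain SQL function. This is a minimal sketch, not part of the original answer: the function name sanitize_text is hypothetical, and it assumes PostgreSQL 9.2 or later (for named parameters in SQL functions) with standard_conforming_strings on, so the backslashes in the pattern reach the regex engine unchanged.

```sql
-- Hypothetical helper; the name sanitize_text is an assumption for illustration.
CREATE OR REPLACE FUNCTION sanitize_text(v_text text)
RETURNS text
LANGUAGE sql
IMMUTABLE
AS $func$
    -- Trim and lowercase first, then strip the blacklisted ASCII punctuation.
    SELECT regexp_replace(
        lower(trim(v_text)),
        '[!"#$%&()*+,./:;<=>?\[\\\]\^_\|~]+',
        '',
        'g'
    );
$func$;

-- Usage: punctuation such as ! ( ) # is removed, non-ASCII letters are kept.
SELECT sanitize_text('  Crème Brûlée! (日本語) #1  ');
```

Note that this pattern, unlike the one further up, leaves whitespace, apostrophes, hyphens and backticks alone, so add them to the bracket expression if they should go too.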
bobince