Google Cloud SQL Second Generation utf8mb4 Encoding

We use Google Cloud SQL Second Generation with our App Engine app. Today, however, we discovered a problem: we cannot insert emoji characters into our database, because we cannot change some server flags to the utf8mb4 character encoding.

We changed character_set_server to utf8mb4, but that wasn't enough.

We have to change the character_set_system, character_set_client, and collation_connection flags to utf8mb4 as well, but a Second Generation database does not allow the root user to change these flags. What can we do to solve this problem?

Does anyone know about this?

thanks

+7
google-app-engine utf8mb4 google-cloud-sql
5 answers

We had the same problem. Setting character_set_server to utf8mb4 was not enough. We could insert emojis through MySQL Workbench, but not through our application.

In our case, this problem disappeared after we launched a new instance of MySQL 5.7 instead of 5.6. Therefore, my hypothesis is that in 5.7, but not in 5.6, changing the character_set_server flag allows Google Cloud SQL to change those other flags that you mentioned, or some other relevant settings.

Of course, if you are already using 5.7, this does not apply to you.

+1

You need to set character_set_server to utf8mb4, change the columns you need to utf8mb4, and create a new Cloud SQL 2nd gen instance with the flag already set (!!). Basically, setting the flag on an existing instance and just restarting (tested on 5.7) is not enough (is this a bug? I did not find it in the documentation). Encoding-related connection parameters are not needed and should be removed. The collation will be the default collation for utf8mb4, which is perfect for me (and probably for most cases) without setting anything.

+1

Run SHOW CREATE TABLE; it will probably show that the column(s) are CHARACTER SET utf8. This must be fixed using

 ALTER TABLE tbl CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_520_ci;
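
When several tables need the same conversion, the statement can be generated per table. This is a minimal Python sketch, not part of the original answer: the helper name `build_convert_statements` and the example table names are hypothetical, and it only builds the SQL strings without running them.

```python
def build_convert_statements(tables, charset="utf8mb4",
                             collation="utf8mb4_unicode_520_ci"):
    """Return one ALTER TABLE ... CONVERT TO statement per table name."""
    statements = []
    for table in tables:
        # Identifiers are backtick-quoted; a name containing a backtick
        # is rejected instead of being escaped.
        if "`" in table:
            raise ValueError(f"unsafe table name: {table!r}")
        statements.append(
            f"ALTER TABLE `{table}` CONVERT TO CHARACTER SET {charset} "
            f"COLLATE {collation};"
        )
    return statements


# Hypothetical table names, for illustration only
statements = build_convert_statements(["comment", "post"])
```

Review the generated statements before executing them; converting a large table rewrites it and can take a while.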
0

For me, going to App Engine Console -> SQL, setting character_set_server to utf8mb4, and restarting the database worked!

0

I have an old Java project with a second generation database, and emoji work fine there without adding anything to the connection string. Just two things:

  • set the character_set_server flag to utf8mb4,
  • and create the database using utf8mb4.

(Skip to Finally if you do not want to read all of this.) Now I have this problem in Python and nothing works. I have to solve it, so I'll write down what I found. None of the steps below worked; this is just what I tried:

1. I removed the flag, restarted the instance, added the flag back, and restarted again.

2. I set ?charset=utf8 in the connection string, and the library returned an error: Invalid utf8 character string: 'F09F98'

3. I set ?charset=utf8mb4, and the library wrote the value to the database, but instead of the emoji there was ???. So if the library recognizes utf8mb4 and writes the value, the problem is not in the library's connection but in the database.

4. I ran:

 SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';

 'character_set_client', 'utf8'
 'character_set_connection', 'utf8'
 'character_set_database', 'utf8mb4'
 'character_set_filesystem', 'binary'
 'character_set_results', 'utf8'
 'character_set_server', 'utf8mb4'   -> this is set from the Google Console
 'character_set_system', 'utf8'
 'collation_connection', 'utf8_general_ci'
 'collation_database', 'utf8mb4_general_ci'
 'collation_server', 'utf8mb4_general_ci'

 UPDATE comment set body="😎" where id=1;
 Invalid utf8 character string: '\xF0\x9F\x98\x8E'   0,045 sec

 SET NAMES utf8mb4;
 SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';

 'character_set_client', 'utf8mb4'
 'character_set_connection', 'utf8mb4'
 'character_set_database', 'utf8mb4'
 'character_set_filesystem', 'binary'
 'character_set_results', 'utf8mb4'
 'character_set_server', 'utf8mb4'
 'character_set_system', 'utf8'
 'collation_connection', 'utf8mb4_general_ci'
 'collation_database', 'utf8mb4_general_ci'
 'collation_server', 'utf8mb4_general_ci'

 UPDATE comment set body="😎" where id=1;
 SUCCESS

So the problem is one of these flags.

5. I closed the current connection and reopened my client so that these variables were back at utf8. First, I changed character_set_results and character_set_client so that I could see the correct result in my client (MySQL Workbench). I ran the update statement again without success; the field still contained ???. After changing character_set_connection to utf8mb4 and updating the field again, this time I had the emoji in the table. But why character_set_connection? As the tests above show, the connection from the library is already utf8mb4. So at this point I do not understand where to set the charset so that the connection uses utf8mb4 and things start working.

6. I tried creating a new Cloud SQL instance with the charset flag, and created a database with utf8mb4 and a table with utf8mb4 (although tables are created with the database's default encoding anyway), and the insert statement failed again. So the only thing I could think of was that charset=utf8mb4 does not work in the connection string. But that was not the case: I tried removing the charset from the connection string and got the same error as before, when I used only utf8 in the connection string.

So I do not know what is left to try.

7. I tried using an instance with an HDD instead of an SSD.

8. I tried connecting through the Google Cloud Shell and inserting the row through the console:

 ERROR 1366 (HY000): Incorrect string value: '\xF0\x9F\x98\x8E' for column 'body' at row 1 

Interestingly, the Cloud Shell even shows in "show create table" that the default charset for this table is utf8mb4. So the Cloud Shell (lightbulb!), like MySQL Workbench, connects with utf8 by default.
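
The byte sequence in the error messages above explains the failure: this emoji takes four bytes in UTF-8, while MySQL's utf8 character set (an alias for utf8mb3) stores at most three bytes per character. A small Python sketch to check this; the helper name `needs_utf8mb4` is made up for illustration and is not from any library.

```python
# U+1F60E, smiling face with sunglasses: the character used in the tests above
EMOJI = "\U0001F60E"


def needs_utf8mb4(text):
    """Return True if any character in text needs 4 bytes in UTF-8,
    i.e. its code point lies above U+FFFF (outside the BMP)."""
    return any(ord(ch) > 0xFFFF for ch in text)


encoded = EMOJI.encode("utf-8")
# The same four bytes MySQL reports in the "Incorrect string value" error
assert encoded == b"\xf0\x9f\x98\x8e"
assert needs_utf8mb4("hello " + EMOJI)
# Accented Latin letters fit in 2 bytes and are fine with plain utf8
assert not needs_utf8mb4("caf\u00e9")
```

A check like this can be useful in application code to tell whether a failing value actually contains 4-byte characters or whether the problem lies elsewhere.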

Finally

What worked in the end was running db.session.execute("SET NAMES 'utf8mb4'") before inserting into the database (in Python), and using ?charset=utf8mb4 only locally. The real problem when testing something like this may be the method you use to check the result in the database. MySQL Workbench always connects with utf8 as the default encoding (you can verify this with the "SHOW ..." command above). So first of all, you need to switch the connection in MySQL Workbench (or in your client) using SET NAMES 'utf8mb4'. The tests above show that the Google Cloud Shell also connected with utf8 by default. I searched the Internet and found that they cannot use utf8mb4 by default because they are waiting for utf8mb4 to become the new standard connection encoding in MySQL, at which point it will be the one called "utf8". There is also no way to make MySQL Workbench use utf8mb4 automatically after connecting; you have to do it yourself. Could there also be a problem when reading from the database? I'm going to check that now.
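
The same fix can be sketched for any Python DB-API style connection. This is an illustration under assumptions, not the code from the answer: the helper names `use_utf8mb4` and `non_utf8mb4_variables` are hypothetical, and the connection object is assumed to offer the usual `cursor()` / `execute()` / `close()` methods. The second helper inspects rows such as those returned by the SHOW VARIABLES query above and reports the three session variables that SET NAMES controls.

```python
# Session variables that SET NAMES changes
CLIENT_VARIABLES = (
    "character_set_client",
    "character_set_connection",
    "character_set_results",
)


def use_utf8mb4(connection):
    """Switch the current session to utf8mb4 on a DB-API style connection."""
    cursor = connection.cursor()
    try:
        cursor.execute("SET NAMES 'utf8mb4'")
    finally:
        cursor.close()


def non_utf8mb4_variables(rows):
    """Given (name, value) rows from SHOW VARIABLES, return the
    client-side session variables that are not set to utf8mb4."""
    values = dict(rows)
    return {
        name: values.get(name)
        for name in CLIENT_VARIABLES
        if values.get(name) != "utf8mb4"
    }
```

Calling `use_utf8mb4(connection)` right after connecting, and then checking that `non_utf8mb4_variables(...)` returns an empty dict, mirrors the manual SET NAMES and SHOW VARIABLES steps in the tests above.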

0