Lost connection to MySQL server during query on random simple queries

FINAL UPDATE: We solved this problem by finding a way to achieve our goals without forking. Forking was the cause of the problem.

--- Original Post ---

I use Ruby on the Rails stack; our MySQL server is separate but hosted at the same site as our application servers. (We tried replacing it with another MySQL server with double the specs, but there was no improvement.)

During business hours we receive several errors like this, with no specific query in common:

ActiveRecord::StatementInvalid: Mysql2::Error: Lost connection to MySQL server during query 

Most of the failing queries are really simple, and there seems to be no pattern from one to the next. It all started when I upgraded from Rails 4.1 to 4.2.

I am at a loss as to what to try. Our database server uses less than 5% CPU during the day. I get error reports from users doing ordinary, random interactions, so these are not queries that have been running for hours or anything like that, and of course when they retry the same action, it works.

Our servers are configured on cloud66.

So, in short: our MySQL server is dropping connections for some reason, but not due to lack of resources. It is also a completely new server, since we migrated from another server while this problem was happening.

This also happens to me on localhost while developing features, so I don't believe it is a load problem.

We run the following:

  • ruby 2.2.5
  • rails 4.2.6
  • mysql2 0.4.8

UPDATE: per the first answer below, I increased our max_connections variable to 500 last night and confirmed the change with show global variables like 'max_connections';

Connections are still being dropped; the first one today was just a few minutes ago... ActiveRecord::StatementInvalid: Mysql2::Error: Lost connection to MySQL server during query

I ran select * from information_schema.processlist; and got 36 rows back. Does this mean there were 36 connections from my application servers at that moment, or can one process hold multiple connections?

UPDATE: I just set net_read_timeout = 60 (it was 30 before); I will see if this helps.

UPDATE: That did not help; I am still looking for a solution...

Here is my database.yml, with the credentials removed:

    production:
      adapter: mysql2
      encoding: utf8
      host: localhost
      database:
      username:
      password:
      port: 3306
      reconnect: true
8 answers

The connection to MySQL can be broken in several ways, but I would recommend reviewing Mario Carrion's answer, as it is a very sound one.

It seems likely that the connection is being broken because it is shared with other processes, causing communication protocol errors...

...this can easily happen if the connection pool belongs to the parent process, which I believe is the case in ActiveRecord, meaning the same connection can be "checked out" several times simultaneously by different processes.

The solution is to make sure database connections are only established AFTER the fork call on the application server.

I'm not sure which application server you are using, but if you are using a warmup feature, don't.

If you make any database calls before the first network request, don't.

Either of these can initialize the connection pool before the fork occurs, which results in the MySQL connections being shared between processes without the pool's locking protecting them.
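
For example, if you run Unicorn, the usual pattern looks roughly like this (a minimal sketch using Unicorn's before_fork / after_fork hooks; adapt it to whatever application server you actually use):

    # config/unicorn.rb (sketch)
    before_fork do |server, worker|
      # close the master's pool so workers do not inherit live MySQL sockets
      ActiveRecord::Base.connection_pool.disconnect! if defined?(ActiveRecord::Base)
    end

    after_fork do |server, worker|
      # each worker opens its own connections after the fork
      ActiveRecord::Base.establish_connection if defined?(ActiveRecord::Base)
    end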

I'm not saying this is the only possible cause of the problem; as pointed out by @sloth-jr, there are other options... but most of them seem less likely given your description.

Sidenote:

I ran select * from information_schema.processlist; and got 36 rows back. Does this mean there were 36 connections from my application servers at that moment, or can one process hold multiple connections?

Each process can hold several connections. In your case, that means up to 500 × 36 connections (see the edit below).

In general, the number of connections in the pool is often the same as the number of threads in each process (it should not be less than the number of threads, or contention will slow you down). Sometimes it's useful to add a few more, depending on your application.
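
For example (numbers assumed purely for illustration): if each application process runs 16 threads, a pool of 16 gives every thread its own connection, and a pool of 20 leaves a little headroom.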

EDIT:

I apologize for missing the fact that the process count refers to MySQL's data, not the application's.

The process list you showed is MySQL server data; MySQL apparently uses a thread-per-connection I/O scheme, so the process list actually counts active connections rather than actual processes or threads (although it should also translate to the number of threads).

This means that out of the possible 500 connections for each application process (i.e. if you run 8 processes for your application, that would be 8 × 500 = 4000 allowed connections), your application only opened 36.


This indicates a timeout, which is usually a shared-resource or connection problem.

I would check the maximum number of connections your MySQL server allows, in the MySQL console:

 show global variables like 'max_connections'; 

And make sure the number of pooled connections configured in Rails' database.yml is lower:

 pool: 10 

Note that database.yml reflects the number of connections that will be pooled in a single Rails process. If you have multiple processes, or other servers such as Sidekiq, you need to add them all together.
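
For example (all numbers assumed for illustration): two app servers each running 4 Rails processes with pool: 10 can open 2 × 4 × 10 = 80 connections, and a single Sidekiq process with a pool of 25 brings the worst case to 105, which would already exceed a max_connections of 100.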

If necessary, increase max_connections in your MySQL server configuration (my.cnf), assuming the machine can handle it.

 [mysqld] max_connections = 100 

Note that other limits can also get in the way, for example open files, but checking connections is a good starting point.

You can also monitor the active queries:

 select * from information_schema.processlist; 

as well as check the MySQL slow query log.

One possible culprit is a long-running update. If you have a slow statement that touches many records (for example, an entire table), it can block even the simplest queries. That means you may see a seemingly random query time out, but when you check MySQL's status, the real cause is a different, long-running query.
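
For example, a quick way to spot a long-running statement that might be blocking others (just a sketch; the columns come from information_schema.processlist):

    SELECT id, user, command, time, state, info
    FROM information_schema.processlist
    WHERE command <> 'Sleep'
    ORDER BY time DESC;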


Things you didn't mention but should take a look at:

  • Do you use Unicorn? If so, do you reconnect and disconnect in your after_fork and before_fork hooks?
  • Is reconnect: true in your database.yml configuration?

Well, at first glance it sounds like your web server is keeping MySQL sessions (persistent connections) open, and sometimes one of them hits a timeout. Try disabling persistent MySQL connections. It will use a bit more resources, but you are only at 5% anyway...

Other tips:

  • Turn on the MySQL slow query log and take a look at it.

  • Write a short script that pulls and logs the MySQL process list every minute, and cross-check that log against your timeouts (see the sketch after this list).

  • Look at the pool size in your database configuration, or set it explicitly (http://guides.rubyonrails.org/configuring.html#database-pooling); it should line up with MySQL's max_connections.
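
A rough sketch of the polling script suggested above (the host and credentials are placeholders; substitute your own):

    require "mysql2"
    require "time"

    # read-only client pointed at information_schema (credentials assumed)
    client = Mysql2::Client.new(
      host:     "your-db-host",
      username: "monitor_user",
      password: "secret",
      database: "information_schema"
    )

    loop do
      rows = client.query("SELECT id, user, command, time, state, info FROM processlist")
      File.open("mysql_processlist.log", "a") do |log|
        log.puts "#{Time.now.utc.iso8601} connections=#{rows.count}"
        rows.each { |row| log.puts row.inspect }
      end
      sleep 60
    end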

Good luck


Find out whether your database is limited in the number of connections it allows; a SQL database should normally be able to handle more than one active connection. (Contact your provider.)


Could you post some of your queries? The MySQL documentation covers this error: https://dev.mysql.com/doc/refman/5.7/en/error-lost-connection.html TL;DR:

  • Network problems: are any of your boxes periodically renewing leases (DHCP, for example) or hitting other network connection errors (check with netstat / ss), firewall timeouts, etc.? I don't know how cloud66 hosts things...
  • Query timeouts. These can happen if you have queries backing up behind locking statements (for example, writes or locks piling up on MyISAM tables). How simple are your queries? No Cartesian products in play? Running EXPLAIN on them can help.
  • Exceeding the maximum packet size (the max_allowed_packet server variable). Do you store images, video content, etc.? (See the check right after this list.)
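
If you want to verify that limit, you can query it the same way as max_connections above:

    show global variables like 'max_allowed_packet';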

There are many possibilities here, and without more information it will be difficult to pin this down.

Look at mysql_error.log first, and then work your way from the database server back to your application.


UPDATE: this did not work.

Here is the solution. Special thanks to @Myst for pointing out that forking might cause problems; I had no idea to look at this particular code. The errors seemed random because we fork this way in several places.

It turns out that when I forked processes, Rails was using the same database connection for all of the forked processes. This created a situation where, when one of the processes (the parent process?) terminated its database connection, the remaining processes had their connection interrupted too.

The solution was to change this code:

  def recalculate_completion
    Process.fork do
      if self.course
        self.course.user_groups.includes(user: [:events]).each do |ug|
          ug.recalculate_completion
        end
      end
    end
  end

to this code:

  def recalculate_completion
    ActiveRecord::Base.remove_connection
    Process.fork do
      ActiveRecord::Base.establish_connection
      if self.course
        self.course.user_groups.includes(user: [:events]).each do |ug|
          ug.recalculate_completion
        end
      end
      ActiveRecord::Base.remove_connection
    end
    ActiveRecord::Base.establish_connection
  end

Making this change stopped the errors on our servers, and everything is now working fine. If anyone has more insight into why this works, I would be happy to hear it, as I would like a deeper understanding.

Edit: it turns out that didn't work either... we still had dropped connections, just not as many.


If you have the query cache enabled, reset it and it should work:

    RESET QUERY CACHE;

