SQL query optimization

I have an SQL query that takes up 100% of my VM processor while it is running. I want to know how to optimize it:

    SELECT g.name AS hostgroup,
           h.name AS hostname,
           a.host_id,
           s.display_name AS servicename,
           a.service_id,
           a.entry_time AS ack_time,
           (SELECT ctime
            FROM logs
            WHERE logs.host_id = a.host_id
              AND logs.service_id = a.service_id
              AND logs.ctime < a.entry_time
              AND logs.status IN (1, 2, 3)
              AND logs.type = 1
            ORDER BY logs.log_id DESC
            LIMIT 1) AS start_time,
           ar.acl_res_name AS timeperiod,
           a.state AS state,
           a.author,
           a.acknowledgement_id AS ack_id
    FROM centstorage.acknowledgements a
    LEFT JOIN centstorage.hosts h ON a.host_id = h.host_id
    LEFT JOIN centstorage.services s ON a.service_id = s.service_id
    LEFT JOIN centstorage.hosts_hostgroups p ON a.host_id = p.host_id
    LEFT JOIN centstorage.hostgroups g ON g.hostgroup_id = p.hostgroup_id
    LEFT JOIN centreon.hostgroup_relation hg ON a.host_id = hg.host_host_id
    LEFT JOIN centreon.acl_resources_hg_relations hh ON hg.hostgroup_hg_id = hh.hg_hg_id
    LEFT JOIN centreon.acl_resources ar ON hh.acl_res_id = ar.acl_res_id
    WHERE ar.acl_res_name != 'All Resources'
      AND YEAR(FROM_UNIXTIME(a.entry_time)) = YEAR(CURDATE())
      AND MONTH(FROM_UNIXTIME(a.entry_time)) = MONTH(CURDATE())
      AND a.service_id IS NOT NULL
    ORDER BY a.acknowledgement_id ASC

The problem is in this part:

 (SELECT ctime FROM logs WHERE logs.host_id = a.host_id AND logs.service_id = a.service_id AND logs.ctime < a.entry_time AND logs.status IN (1, 2, 3) AND logs.type = 1 ORDER BY logs.log_id DESC LIMIT 1) AS start_time 

The logs table is really huge. Some friends told me to use a spool table / database, but I don't know much about that, and I don't know how to do it.

EXPLAIN EXTENDED output for the query: Here!

It seems that it will only consider 2 rows of table logs, so why does it take so long? (There are 560,000 rows in the table logs).

Here are all the indices of these tables:

centstorage.updates: [image]
centstorage.hosts: [image]
centstorage.services: [image]
centstorage.hosts_hostgroups: [image]
centstorage.hostgroups: [image]
centreon.hostgroup_relation: [image]
centreon.acl_resources_hg_relations: [image]
centreon.acl_resources: [image]

5 answers

In SQL Server, you can limit the maximum degree of parallelism of a query using MAXDOP.

For example, you can specify this at the end of your query:

 option (maxdop 2) 

There may be an equivalent in MySQL, although as far as I know MySQL runs a single query on one thread.

You can try this approach if the running time does not matter.

0
source
  • Create a temporary table from acknowledgements, applying the WHERE condition. Its schema should have every column required in the final result, plus those used in the JOINs with all 7 of your tables.

     -- Empty strings are placeholders to be filled in by the UPDATE statements.
     CREATE TEMPORARY TABLE __tempacknowledgements AS
     SELECT '' AS hostgroup,
            '' AS hostname,
            a.host_id,
            '' AS servicename,
            a.service_id,
            a.entry_time AS ack_time,
            '' AS start_time,
            '' AS timeperiod,
            a.state AS state,
            a.author,
            a.acknowledgement_id AS ack_id
     FROM centstorage.acknowledgements a
     WHERE YEAR(FROM_UNIXTIME(a.entry_time)) = YEAR(CURDATE())
       AND MONTH(FROM_UNIXTIME(a.entry_time)) = MONTH(CURDATE())
       AND a.service_id IS NOT NULL
     ORDER BY a.acknowledgement_id ASC;

Or create the table with explicit column definitions, so that the placeholder columns get usable types.

  1. To fill in the fields that came from the LEFT JOINed tables, you can use an inner JOIN in an UPDATE. You have to write 7 different UPDATE statements. Here are 2 examples.

     UPDATE __tempacknowledgements a
     JOIN centstorage.hosts h USING (host_id)
     SET a.hostname = h.name;

     UPDATE __tempacknowledgements a
     JOIN centstorage.services s USING (service_id)
     SET a.servicename = s.display_name;
  2. Similarly, fill in start_time with ctime from logs using a JOIN with logs. This is the 8th UPDATE statement.

  3. Run the final SELECT from the temp table.
  4. Drop the temporary table.

A stored procedure can be written for this.
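
Steps 2 to 4 could look roughly like the sketch below. It is only an illustration under some assumptions: the temporary table was created with explicit column definitions so that start_time has a numeric type able to hold ctime, logs lives in the current default database (as in the question's query), and a correlated subquery is used inside the UPDATE instead of a plain JOIN, because only the latest matching log row is wanted.

```sql
-- Step 2: fill start_time with the ctime of the latest matching log row
-- (assumes start_time is a numeric column, not the '' placeholder type).
UPDATE __tempacknowledgements t
SET t.start_time = (
    SELECT l.ctime
    FROM logs l
    WHERE l.host_id = t.host_id
      AND l.service_id = t.service_id
      AND l.ctime < t.ack_time
      AND l.status IN (1, 2, 3)
      AND l.type = 1
    ORDER BY l.log_id DESC
    LIMIT 1
);

-- Step 3: read the final result.
SELECT *
FROM __tempacknowledgements
ORDER BY ack_id ASC;

-- Step 4: clean up.
DROP TEMPORARY TABLE __tempacknowledgements;
```

The subquery still runs once per row, but only for the rows of the current month that made it into the temporary table.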

0
source

Turn each LEFT JOIN into a JOIN if you don't have a real need for LEFT.

 AND YEAR(FROM_UNIXTIME( a.entry_time )) = YEAR(CURDATE()) AND MONTH(FROM_UNIXTIME( a.entry_time )) = MONTH(CURDATE()) AND a.service_id is not null 

Do you have any rows where a.service_id IS NULL? If not, get rid of that condition.

As already mentioned, this date comparison cannot use an index. Here's what to use instead:

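    -- entry_time holds a Unix timestamp, so compare it against Unix timestamps
    AND a.entry_time >= UNIX_TIMESTAMP(CONCAT(LEFT(CURDATE(), 7), '-01'))
    AND a.entry_time <  UNIX_TIMESTAMP(CONCAT(LEFT(CURDATE(), 7), '-01') + INTERVAL 1 MONTH)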

And add one of these indexes (depending on the answer to my question above):

    INDEX(entry_time)
    INDEX(service_id, entry_time)

A correlated subquery is difficult to optimize. This index (on logs) can help:

 INDEX(type, host_id, service_id, status) 
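
As a rough sketch, these indexes could be added with ALTER TABLE. The index names (idx_...) are invented for illustration, and logs is left unqualified as in the question, so it is assumed to be in the current default database.

```sql
-- Pick ONE of the two acknowledgements indexes, depending on whether
-- the service_id condition stays in the WHERE clause.
ALTER TABLE centstorage.acknowledgements
    ADD INDEX idx_ack_entry_time (entry_time);
-- ALTER TABLE centstorage.acknowledgements
--     ADD INDEX idx_ack_service_entry (service_id, entry_time);

-- Index for the correlated subquery on logs.
ALTER TABLE logs
    ADD INDEX idx_logs_lookup (type, host_id, service_id, status);
```

Rerun EXPLAIN afterwards to see whether the optimizer actually picks them up.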
0
source

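WHERE ... IN can be a time killer! Instead of logs.status IN (1, 2, 3), try logs.status = 1 OR logs.status = 2 OR logs.status = 3 and compare the EXPLAIN output. For a short list of constants, MySQL usually plans both forms the same way, so measure before relying on this.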
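
One way to check whether the two forms really differ on your data is to compare their plans. This is a sketch only: the literal host_id and service_id values are placeholders, to be replaced with a pair that exists in your tables.

```sql
-- Form 1: IN list
EXPLAIN
SELECT ctime
FROM logs
WHERE host_id = 1            -- placeholder value
  AND service_id = 1         -- placeholder value
  AND status IN (1, 2, 3)
  AND type = 1
ORDER BY log_id DESC
LIMIT 1;

-- Form 2: OR chain
EXPLAIN
SELECT ctime
FROM logs
WHERE host_id = 1            -- placeholder value
  AND service_id = 1         -- placeholder value
  AND (status = 1 OR status = 2 OR status = 3)
  AND type = 1
ORDER BY log_id DESC
LIMIT 1;
```

If the key, rows and Extra columns match, the rewrite will not change anything.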

0
source

I reformatted the query for my own readability, to better see the relationships between the tables; otherwise, ignore this part.

    SELECT g.name AS hostgroup,
           h.name AS hostname,
           a.host_id,
           s.display_name AS servicename,
           a.service_id,
           a.entry_time AS ack_time,
           (SELECT ctime
            FROM logs
            WHERE logs.host_id = a.host_id
              AND logs.service_id = a.service_id
              AND logs.ctime < a.entry_time
              AND logs.status IN (1, 2, 3)
              AND logs.type = 1
            ORDER BY logs.log_id DESC
            LIMIT 1) AS start_time,
           ar.acl_res_name AS timeperiod,
           a.state AS state,
           a.author,
           a.acknowledgement_id AS ack_id
    FROM centstorage.acknowledgements a
    LEFT JOIN centstorage.hosts h ON a.host_id = h.host_id
    LEFT JOIN centstorage.services s ON a.service_id = s.service_id
    LEFT JOIN centstorage.hosts_hostgroups p ON a.host_id = p.host_id
    LEFT JOIN centstorage.hostgroups g ON p.hostgroup_id = g.hostgroup_id
    LEFT JOIN centreon.hostgroup_relation hg ON a.host_id = hg.host_host_id
    LEFT JOIN centreon.acl_resources_hg_relations hh ON hg.hostgroup_hg_id = hh.hg_hg_id
    LEFT JOIN centreon.acl_resources ar ON hh.acl_res_id = ar.acl_res_id
    WHERE ar.acl_res_name != 'All Resources'
      AND YEAR(FROM_UNIXTIME(a.entry_time)) = YEAR(CURDATE())
      AND MONTH(FROM_UNIXTIME(a.entry_time)) = MONTH(CURDATE())
      AND a.service_id IS NOT NULL
    ORDER BY a.acknowledgement_id ASC

First, I recommend starting with your "acknowledgements" table and having an index on at least (entry_time, acknowledgement_id). Then update the WHERE clause. Since you use a function to convert the Unix timestamp to a date and then extract the YEAR (and MONTH), I don't think it uses an index, because it has to compute this for each row. To improve this, remember that a Unix timestamp is nothing more than a number representing the seconds since a specific point in time. If you are looking for a specific month, precalculate the starting and ending Unix times and query for that range. Something like:

    AND a.entry_time >= UNIX_TIMESTAMP('2015-10-01')
    AND a.entry_time <  UNIX_TIMESTAMP('2015-11-01')

This covers every second of the month, up to 23:59:59 on October 31, just before November 1.
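
To avoid hard-coding the month, the two bounds can be computed once from CURDATE(). The following is a minimal sketch, not a drop-in replacement: the user variables @month_start and @month_end are names made up for this example, and the SELECT is trimmed to the acknowledgements table only.

```sql
-- First second of the current month, as a Unix timestamp.
SET @month_start = UNIX_TIMESTAMP(DATE_FORMAT(CURDATE(), '%Y-%m-01'));
-- First second of the next month (exclusive upper bound).
SET @month_end   = UNIX_TIMESTAMP(DATE_FORMAT(CURDATE(), '%Y-%m-01') + INTERVAL 1 MONTH);

SELECT a.acknowledgement_id,
       a.entry_time
FROM centstorage.acknowledgements a
WHERE a.entry_time >= @month_start   -- bare column on the left, so an index on entry_time is usable
  AND a.entry_time <  @month_end
  AND a.service_id IS NOT NULL
ORDER BY a.acknowledgement_id ASC;
```

Because the column is compared directly against two constants, the optimizer can treat this as a range scan instead of evaluating YEAR() and MONTH() for every row.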

Then, since I can't make out the images clearly without my glasses and I'm short on time this morning, I would make sure that you have at least the following indexes on each table:

    table                        index
    logs                         ( host_id, service_id, type, status, ctime, log_id )
    acknowledgements             ( entry_time, acknowledgement_id, host_id, service_id )
    hosts                        ( host_id, name )
    services                     ( service_id, display_name )
    hosts_hostgroups             ( host_id, hostgroup_id )
    hostgroups                   ( hostgroup_id, name )
    hostgroup_relation           ( host_host_id, hostgroup_hg_id )
    acl_resources_hg_relations   ( hg_hg_id, acl_res_id )
    acl_resources                ( acl_res_id, acl_res_name )
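
As a sketch, the first few of these could be created as shown below. The index names are invented for the example, and logs is assumed to be in the current default database. Some of these columns may already be covered by a primary key, so it is worth listing the existing indexes first and skipping anything redundant.

```sql
-- See what already exists before adding anything.
SHOW INDEX FROM logs;
SHOW INDEX FROM centstorage.acknowledgements;

-- Composite index matching the correlated subquery on logs.
CREATE INDEX idx_logs_cover
    ON logs (host_id, service_id, type, status, ctime, log_id);

-- Composite index for the month filter and the ORDER BY on acknowledgements.
CREATE INDEX idx_ack_cover
    ON centstorage.acknowledgements (entry_time, acknowledgement_id, host_id, service_id);

-- Lookup tables: join key first, then the selected column.
CREATE INDEX idx_hosts_cover
    ON centstorage.hosts (host_id, name);
CREATE INDEX idx_services_cover
    ON centstorage.services (service_id, display_name);
```

The remaining tables follow the same pattern: join column first, then the column being read.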

Finally, your correlated subquery column will be a killer, as it is executed for each row, but hopefully the index suggestions above will help performance.

0
source
