MySQL 500 million table rows in select query with join

I am worried about the performance of the query below when the tables are full. While it is being developed and works well with dummy data.

The table "adress_zoo" will contain about 500 million records after full completion. The table "adress_zoo" looks like this:

CREATE TABLE `adress_zoo` ( `adress_id` int(11) NOT NULL, `zoo_id` int(11) NOT NULL, UNIQUE KEY `pk` (`adress_id`,`zoo_id`), KEY `adress_id` (`adress_id`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1; 

Other tables will contain a maximum of 500 records.

The full query looks like this:

  SELECT a.* FROM jos_zoo_item AS a JOIN jos_zoo_search_index AS zsi2 ON zsi2.item_id = a.id WHERE a.id IN ( SELECT r.id FROM ( SELECT zi.id AS id, Max(zi.priority) as prio FROM jos_zoo_item AS zi JOIN jos_zoo_search_index AS zsi ON zsi.item_id = zi.id LEFT JOIN jos_zoo_tag AS zt ON zt.item_id = zi.id JOIN jos_zoo_category_item AS zci ON zci.item_id = zi.id **JOIN adress_zoo AS az ON az.zoo_id = zi.id** WHERE 1=1 AND ( (zci.category_id != 0 AND ( zt.name != 'prolong' OR zt.name is NULL)) OR (zci.category_id = 0 AND zt.name = 'prolong') ) AND zi.type = 'telefoni' AND zsi.element_id = '44d3b1fd-40f6-4fd7-9444-7e11643e2cef' AND zsi.value = 'Small' AND zci.category_id > 15 **AND az.adress_id = 5** GROUP BY zci.category_id ) AS r ) AND a.application_id = 6 AND a.access IN (1,1) AND a.state = 1 AND (a.publish_up = '0000-00-00 00:00:00' OR a.publish_up <= '2012-06-07 07:51:26') AND (a.publish_down = '0000-00-00 00:00:00' OR a.publish_down >= '2012-06-07 07:51:26') AND zsi2.element_id = '1c3cd26e-666d-4f8f-a465-b74fffb4cb14' GROUP BY a.id ORDER BY zsi2.value ASC 

A query usually returns about 25 records.

Based on my experience, will this request be acceptable (answer in 3 seconds)? What can I do to optimize this?

As @Jack advised, I ran the query using EXPLAIN and got the following:

MySQL EXPLAIN output

+4
source share
2 answers

This part is an important limiter:

 az.adress_id = 5 

MySQL limits the table to only those records where adress_id matches before joining it with the rest of the statement, so this will depend on how likely it is that the result may be.

Btw, you have a UNIQUE(adress_id, zoo_id) and a separate INDEX . Is there a special reason? Because the first part of the spanning key can be used by MySQL to select with.

It is also important to use EXPLAIN to understand how MySQL "attacks" your query and returns results. See Also: http://dev.mysql.com/doc/refman/5.5/en/execution-plan-information.html

+2
source

To avoid a subquery, you can try rewriting your query as:

 SELECT a.* FROM jos_zoo_item AS a JOIN jos_zoo_search_index AS zsi2 ON zsi2.item_id = a.id INNER JOIN ( SELECT ** distinct ** r.id FROM ( SELECT zi.id AS id, Max(zi.priority) as prio FROM jos_zoo_item AS zi JOIN jos_zoo_search_index AS zsi ON zsi.item_id = zi.id LEFT JOIN jos_zoo_tag AS zt ON zt.item_id = zi.id JOIN jos_zoo_category_item AS zci ON zci.item_id = zi.id **JOIN adress_zoo AS az ON az.zoo_id = zi.id** WHERE 1=1 AND ( (zci.category_id != 0 AND ( zt.name != 'prolong' OR zt.name is NULL)) OR (zci.category_id = 0 AND zt.name = 'prolong') ) AND zi.type = 'telefoni' AND zsi.element_id = '44d3b1fd-40f6-4fd7-9444-7e11643e2cef' AND zsi.value = 'Small' AND zci.category_id > 15 **AND az.adress_id = 5** GROUP BY zci.category_id ) AS r ) T on a.id = T.id where AND a.application_id = 6 AND a.access IN (1,1) AND a.state = 1 AND (a.publish_up = '0000-00-00 00:00:00' OR a.publish_up <= '2012-06-07 07:51:26') AND (a.publish_down = '0000-00-00 00:00:00' OR a.publish_down >= '2012-06-07 07:51:26') AND zsi2.element_id = '1c3cd26e-666d-4f8f-a465-b74fffb4cb14' GROUP BY a.id ORDER BY zsi2.value ASC 

This approach does not perform a subquery for each row of the candidate. Productivity can only be increased if T calculated in a few milliseconds.

0
source

Source: https://habr.com/ru/post/1416524/


All Articles