Optimizing a MySQL query with two joins and a GROUP BY clause

I have a query that takes 10-20 seconds. I'm sure it can be optimized, but I'm not skilled enough to do it myself. I would like some help and an explanation, so that I can apply the same approach to similar queries. Here is my query:

    SELECT `store_formats`.`Store Nbr`,
           `store_formats`.`Store Name`,
           `store_formats`.`Format Name`,
           `eds_sales`.`Date`,
           sum(`eds_sales`.`EPOS Sales`) AS Sales,
           sum(`eds_sales`.`EPOS Quantity`) AS Quantity
    FROM `eds_sales`
    INNER JOIN `item_codes`
            ON `eds_sales`.`Prime Item Nbr` = `item_codes`.`Customer Item`
    INNER JOIN `store_formats`
            ON `eds_sales`.`Store Nbr` = `store_formats`.`Store Nbr`
    WHERE `eds_sales`.`Store Nbr` IN ($storenbr)
      AND `eds_sales`.`Date` BETWEEN '$startdate' AND '$enddate'
      AND `eds_sales`.`Client` = '$customer'
      AND `eds_sales`.`Retailer` IN ($retailer)
      AND `store_formats`.`Format Name` IN ($storeformat)
      AND `item_codes`.`Item Number` IN ($products)
    GROUP BY `store_formats`.`Store Name`,
             `store_formats`.`Store Nbr`,
             `store_formats`.`Format Name`,
             `eds_sales`.`Date`

Here is the EXPLAIN output: [image not available]

As you can see there, I tried creating several indexes on various columns, with little success. I think the main delay is caused by copying to a temporary table.

These are the tables used:

store_formats:

    CREATE TABLE `store_formats` (
      `id` int(12) NOT NULL,
      `Store Nbr` smallint(5) UNSIGNED DEFAULT NULL,
      `Store Name` varchar(27) DEFAULT NULL,
      `City` varchar(19) DEFAULT NULL,
      `Post Code` varchar(9) DEFAULT NULL,
      `Region #` int(2) DEFAULT NULL,
      `Region Name` varchar(10) DEFAULT NULL,
      `Distr #` int(3) DEFAULT NULL,
      `Dist Name` varchar(26) DEFAULT NULL,
      `Square Footage` varchar(7) DEFAULT NULL,
      `Format` int(1) DEFAULT NULL,
      `Format Name` varchar(23) DEFAULT NULL,
      `Store Type` varchar(20) DEFAULT NULL,
      `TV Region` varchar(12) DEFAULT NULL,
      `Pharmacy` varchar(3) DEFAULT NULL,
      `Optician` varchar(3) DEFAULT NULL,
      `Home Shopping` varchar(3) DEFAULT NULL,
      `Retailer` varchar(15) DEFAULT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8;

    ALTER TABLE `store_formats`
      ADD PRIMARY KEY (`id`),
      ADD UNIQUE KEY `uniqness` (`Store Nbr`,`Store Name`,`Format`),
      ADD KEY `Store Nbr_2` (`Store Nbr`,`Format Name`,`Store Name`);

eds_sales:

    CREATE TABLE `eds_sales` (
      `id` int(12) UNSIGNED NOT NULL,
      `Prime Item Nbr` mediumint(7) NOT NULL,
      `Prime Item Desc` varchar(255) NOT NULL,
      `Prime Size Desc` varchar(255) NOT NULL,
      `Variety` varchar(255) NOT NULL,
      `WHPK Qty` int(5) NOT NULL,
      `SUPPK Qty` int(5) NOT NULL,
      `Depot Nbr` int(5) NOT NULL,
      `Depot Name` varchar(50) NOT NULL,
      `Store Nbr` smallint(5) UNSIGNED NOT NULL,
      `Store Name` varchar(255) NOT NULL,
      `EPOS Quantity` smallint(3) NOT NULL,
      `EPOS Sales` decimal(13,2) NOT NULL,
      `Date` date NOT NULL,
      `Client` varchar(10) NOT NULL,
      `Retailer` varchar(50) NOT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1;

    ALTER TABLE `eds_sales`
      ADD UNIQUE KEY `uniqness` (`Prime Item Nbr`,`Prime Item Desc`,`Prime Size Desc`,`Variety`,`WHPK Qty`,`SUPPK Qty`,`Depot Nbr`,`Depot Name`,`Store Nbr`,`Store Name`,`Date`,`Client`) USING BTREE,
      ADD KEY `Store Nbr` (`Store Nbr`),
      ADD KEY `Prime Item Nbr_2` (`Prime Item Nbr`,`Date`),
      ADD KEY `id` (`id`) USING BTREE,
      ADD KEY `Store Nbr_2` (`Prime Item Nbr`,`Store Nbr`,`Date`,`Client`,`Retailer`) USING BTREE,
      ADD KEY `Client` (`Client`,`Store Nbr`,`Date`),
      ADD KEY `Date` (`Date`,`Client`,`Retailer`);

item_codes:

    CREATE TABLE `item_codes` (
      `id` int(12) NOT NULL,
      `Item Number` varchar(30) CHARACTER SET latin1 NOT NULL,
      `Customer Item` mediumint(7) NOT NULL,
      `Description` varchar(255) CHARACTER SET latin1 NOT NULL,
      `Status` varchar(15) CHARACTER SET latin1 NOT NULL,
      `Customer` varchar(30) CHARACTER SET latin1 NOT NULL,
      `Sort Name` varchar(255) CHARACTER SET latin1 NOT NULL,
      `EquidataCustomer` varchar(30) CHARACTER SET latin1 NOT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8;

    ALTER TABLE `item_codes`
      ADD PRIMARY KEY (`id`),
      ADD UNIQUE KEY `uniq` (`Item Number`,`Customer Item`,`Customer`,`EquidataCustomer`),
      ADD KEY `Item Number_2` (`Item Number`,`Sort Name`,`EquidataCustomer`),
      ADD KEY `Customer Item` (`Customer Item`,`Item Number`,`Sort Name`,`EquidataCustomer`),
      ADD KEY `Customer Item_2` (`Customer Item`,`Item Number`,`EquidataCustomer`);

As you can see, I am joining 3 tables and looking for sales by date and store format. I tried different join orders, for example starting from store_formats and joining the other tables to it instead of joining eds_sales to item_codes and store_formats, but with the same results. I also pass some arrays of values using IN, because they come from select boxes in the application.

So my questions are:

  • What is the best way to join these tables?
  • What are the best indexes for these tables?
  • Why do I get temporary tables? Is it because of the GROUP BY? Is there a workaround?
  • If temporary tables are needed, is there a way to speed up their creation? (I already have the data folder on an 8-disk RAID, but it is still slow.)
  • Of course, any suggested alternatives are welcome.

UPDATE: I updated my tables with some suggestions from the comments.

UPDATE: Modifying my.cnf as shown below improved performance (the machine has 8 GB of RAM and 2 cores; /data/tmp is on the 8-disk RAID, as is the data):

    tmpdir = /dev/shm/:/data/tmp:/tmp
    lc-messages-dir = /usr/share/mysql
    skip-external-locking
    expire_logs_days = 10
    max_binlog_size = 100M
    innodb_buffer_pool_size = 6G
    innodb_buffer_pool_instances = 6
    query_cache_type=1
+5
3 answers

(This is too long for a comment, so please excuse me for posting it as an answer.)

If you have INDEX(a) and INDEX(a,b), the first is redundant and should be dropped. I see about 5 such cases.
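As a minimal sketch of that rule, consider a hypothetical table `t` (not one of the tables in the question) with two hypothetical indexes, `idx_a` and `idx_a_b`. MySQL can use the leftmost prefix of a composite index, so `idx_a_b` can serve any lookup that `idx_a` would:

```sql
-- Hypothetical table and index names, for illustration only.
CREATE TABLE t (
  a INT NOT NULL,
  b INT NOT NULL,
  KEY idx_a (a),        -- redundant: (a) is the leftmost prefix of (a, b)
  KEY idx_a_b (a, b)
);

-- Drop the shorter index; lookups on `a` alone can still use idx_a_b.
ALTER TABLE t DROP INDEX idx_a;

-- List the remaining indexes to confirm the change.
SHOW INDEX FROM t;
```

The same check can be applied to each table in the question: compare the column lists shown by SHOW CREATE TABLE, and drop an index only when its full column list is a leftmost prefix of another index on the same table.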

Does each store_nbr have exactly one store_name? If so, it is redundant to have store_name in multiple tables. I do not know the purpose of store_formats, but I think it is the one table where store_name belongs. Note that the data types of the two store_name columns are of inconsistent sizes, and so are those of the two store_nbr columns!

It seems that each store should have a unique number. If so, ADD UNIQUE KEY uniqness (Store Nbr, Store Name) should probably be turned into PRIMARY KEY(store_nbr). (Sorry, I refuse to put spaces in column names.)
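A rough sketch of that change against the store_formats definition posted in the question might look like the following. It assumes that every store number appears exactly once and is never NULL, which the question does not confirm, so the first statement checks for violations before anything is altered:

```sql
-- Check the assumption first: any row returned here (a duplicated or
-- NULL store number) would make the ALTER below fail.
SELECT `Store Nbr`, COUNT(*) AS n
FROM `store_formats`
GROUP BY `Store Nbr`
HAVING n > 1 OR `Store Nbr` IS NULL;

-- Replace the surrogate primary key on `id` with the store number.
-- A primary key column must be NOT NULL, hence the MODIFY.
ALTER TABLE `store_formats`
  DROP PRIMARY KEY,
  MODIFY `Store Nbr` SMALLINT UNSIGNED NOT NULL,
  ADD PRIMARY KEY (`Store Nbr`);
```

Whether to keep the `id` column afterwards, and whether the existing `uniqness` key is still needed, are separate decisions that depend on how the rest of the application uses the table.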

It is rarely useful to start an index with a date, so get rid of KEY Date_2 (Date, Client). Instead, add INDEX(Client, store_nbr, Date), which should have a direct impact on the speed of the query. You will probably see a change in EXPLAIN SELECT ...
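One way to see the effect is to run EXPLAIN on a cut-down version of the query, as sketched below. The literal values are made up for illustration, and the query is simplified to the eds_sales filters only. The column order in the suggested index follows the usual guideline of putting equality conditions first and the range condition last; the `Client` key in the schema posted in the question appears to have this shape already.

```sql
-- Literal values are hypothetical; substitute real ones from the application.
EXPLAIN
SELECT `Date`, SUM(`EPOS Sales`) AS Sales
FROM `eds_sales`
WHERE `Client` = 'ACME'                              -- equality: first index column
  AND `Store Nbr` IN (101, 102)                      -- IN list: second index column
  AND `Date` BETWEEN '2017-01-01' AND '2017-01-31'   -- range: last index column
GROUP BY `Date`;
```

In the output, the `key` column shows which index the optimizer picked, and the `Extra` column shows whether a temporary table or filesort is still involved.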

int(4): did you perhaps mean SMALLINT UNSIGNED?

Having a Date in a UNIQUE (or PRIMARY) key is usually "wrong". What happens if a "Client" makes two purchases of the same thing on the same day?

After you make these changes, let's talk again.

To make the table definitions easier to compare, please provide the output of SHOW CREATE TABLE.

Avoid this construct:

 FROM ( SELECT ... ) JOIN ( SELECT ... ) ON ... 

This is inefficient because neither subquery has an index that would make the JOIN efficient.

+2

I moved the filters into subqueries to minimize the number of rows being joined. I believe MySQL may already do this for you; I would check the execution plan (EXPLAIN) to confirm.

    SELECT stores.nbr, stores.name, stores.format, epos.date,
           sum(epos.sales) AS Sales,
           sum(epos.qty) AS Quantity
    FROM (SELECT `Date` as `date`,
                 `EPOS Sales` as sales,
                 `EPOS Quantity` as qty,
                 `Prime Item Nbr` as item_number,
                 `Store Nbr` as store_number
          FROM `eds_sales`
          WHERE `eds_sales`.`Store Nbr` IN ($storenbr)
            AND `eds_sales`.`Date` BETWEEN '$startdate' AND '$enddate'
            AND `eds_sales`.`Client` = '$customer'
            AND `eds_sales`.`Retailer` IN ($retailer)) as epos
    INNER JOIN (SELECT `Customer Item` as custItem
                FROM `item_codes`
                WHERE `item_codes`.`Item Number` IN ($products)) as items
            ON epos.item_number = items.custItem
    INNER JOIN (SELECT `Store Nbr` as nbr, `Store Name` as name, `Format Name` as format
                FROM `store_formats`
                WHERE `store_formats`.`Format Name` IN ($storeformat)) as stores
            ON epos.store_number = stores.nbr
    GROUP BY stores.name, stores.nbr, stores.format, epos.date
+1

Move the conditions on the joined tables from the WHERE clause into the JOIN ... ON clauses:

    SELECT `store_formats`.`Store Nbr`,
           `store_formats`.`Store Name`,
           `store_formats`.`Format Name`,
           `eds_sales`.`Date`,
           sum(`eds_sales`.`EPOS Sales`) AS Sales,
           sum(`eds_sales`.`EPOS Quantity`) AS Quantity
    FROM `eds_sales`
    JOIN `item_codes`
      ON `eds_sales`.`Prime Item Nbr` = `item_codes`.`Customer Item`
     AND `item_codes`.`Item Number` IN ($products)
    JOIN `store_formats`
      ON `eds_sales`.`Store Nbr` = `store_formats`.`Store Nbr`
     AND `store_formats`.`Format Name` IN ($storeformat)
    WHERE `eds_sales`.`Store Nbr` IN ($storenbr)
      AND `eds_sales`.`Date` BETWEEN '$startdate' AND '$enddate'
      AND `eds_sales`.`Client` = '$customer'
      AND `eds_sales`.`Retailer` IN ($retailer)
    GROUP BY `store_formats`.`Store Name`,
             `store_formats`.`Store Nbr`,
             `store_formats`.`Format Name`,
             `eds_sales`.`Date`

Create the following indexes:

    CREATE INDEX IDX001 ON eds_sales (Client,`Store Nbr`,`Retailer`,`Date`);
    CREATE INDEX IDX002 ON store_formats (`Store Nbr`,`Format Name`);

If this works, let me know and I will explain why.

0