Looking at a query's EXPLAIN output, how can I determine which optimizations are best made?
I appreciate that one of the first things to check is whether good indexes are being used, but beyond that I'm at a bit of a dead end. Through trial and error in the past, I have sometimes found that the order in which the joins are carried out can be a good source of improvement, but how can one determine that from looking at the execution plan?
While I would really like to get a good general idea of how to optimize queries (suggested reading is very much appreciated!), I also realize that it is often easier to discuss concrete cases than to talk in the abstract. Since I'm currently banging my head against the wall with this one, your thoughts would be very much appreciated:
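+----+-------------+-------+--------+----------------+---------+---------+----------------------+------+------------------------------------+
| id | select_type | table | type   | possible_keys  | key     | key_len | ref                  | rows | Extra                              |
+----+-------------+-------+--------+----------------+---------+---------+----------------------+------+------------------------------------+
|  1 | SIMPLE      | S     | const  | PRIMARY,l,p,f4 | PRIMARY | 2       | const                |    1 | Using temporary                    |
|  1 | SIMPLE      | Q     | ref    | PRIMARY,S      | S       | 2       | const                |  204 | Using index                        |
|  1 | SIMPLE      | V     | ref    | PRIMARY,n,Q    | Q       | 5       | const,db.Q.QID       |    6 | Using where; Using index; Distinct |
|  1 | SIMPLE      | R1    | ref    | PRIMARY,L      | L       | 154     | const,db.V.VID       |  447 | Using index; Distinct              |
|  1 | SIMPLE      | W     | eq_ref | PRIMARY,w      | PRIMARY | 5       | const,db.R.RID,const |    1 | Using where; Distinct              |
|  1 | SIMPLE      | R2    | eq_ref | PRIMARY,L      | PRIMARY | 156     | const,db.W.RID,const |    1 | Using where; Distinct              |
+----+-------------+-------+--------+----------------+---------+---------+----------------------+------+------------------------------------+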
Am I correct to interpret the final row of the execution plan as follows:
- since it is fully matched on its primary key, only one row of R2 needs to be fetched per output row;
- however, such output rows are then filtered based on some criteria that apply to R2?
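As a sanity check on that reading, the key_len values reported for R can be reproduced from the column sizes in its table definition (given further down). This is only a sketch under two assumptions: that utf8 reserves up to 3 bytes per character, and that EXPLAIN counts 2 length bytes for a VARCHAR key part (plus 1 byte for a nullable column, which does not apply to these NOT NULL columns).

```sql
-- Assumed sizes: smallint = 2 bytes; varchar(50) in utf8 = 50 * 3 bytes + 2 length bytes.
SELECT
    2 + (50 * 3 + 2)     AS key_len_L_sid_vid,     -- SID + VID       -> 154 (row R1, key L)
    2 + 2 + (50 * 3 + 2) AS key_len_primary_full;  -- SID + RID + VID -> 156 (row R2, key PRIMARY)
```

If those assumptions hold, 156 corresponds to all three columns of the primary key being used for R2, while 154 corresponds to only the first two columns of key L being used for R1.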
If so, my problem is the filtering that occurs at this final stage. If the condition filters nothing out (for example, WHERE `Col_1_to_3` IN (1,2,3) ), the query executes very quickly (~50 ms); however, if the condition restricts the rows selected ( WHERE `Col_1_to_3` IN (1,2) ), the query takes significantly longer (~5 s). If the restriction is to a single value ( WHERE `Col_1_to_3` IN (1) ), the optimizer chooses a completely different execution plan (which performs slightly better than 5 s, but still much worse than 50 ms). There doesn't appear to be a better index that could be used on this table (given that it already makes full use of the primary key to return one row per result?).
How should one interpret all this information? Am I right to guess that, because this filtering happens on the last table to be joined, considerable effort is wasted compared with joining that table earlier and filtering out such rows sooner? If so, how does one determine where in the execution plan R2 ought to be joined?
While I have resisted including the query and schema until now (since I would really like to learn what to look for, rather than just be told the answer), I understand that they are needed to take the discussion further:
SELECT DISTINCT `Q`.`QID`
FROM `S`
  NATURAL JOIN `Q`
  NATURAL JOIN `V`
  NATURAL JOIN `R` AS `R1`
  NATURAL JOIN `W`
  JOIN `R` AS `R2` ON (
        `R2`.`SID` = `S`.`SID`
    AND `R2`.`RID` = `R1`.`RID`
    AND `R2`.`VID` = `S`.`V_id`
    AND `R2`.`Col_1_to_3` IN (1,2)
  )
Definition of table R:
CREATE TABLE `R` (
  `SID` smallint(6) unsigned NOT NULL,
  `RID` smallint(6) unsigned NOT NULL,
  `VID` varchar(50) NOT NULL DEFAULT '',
  `Col_1_to_3` smallint(1) DEFAULT NULL,
  `T` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`SID`,`RID`,`VID`),
  KEY `L` (`SID`,`VID`,`Col_1_to_3`),
  CONSTRAINT `R_f1` FOREIGN KEY (`SID`) REFERENCES `S` (`SID`),
  CONSTRAINT `R_f2` FOREIGN KEY (`SID`, `VID`) REFERENCES `V` (`SID`, `VID`),
  CONSTRAINT `R_f3` FOREIGN KEY (`SID`, `VID`, `Col_1_to_3`) REFERENCES `L` (`SID`, `VID`, `LID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
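
One experiment this definition suggests, since key `L` contains `Col_1_to_3` while the primary key does not, is to ask the optimizer to read R2 through `L` and compare the resulting plan and timing with the ones above. The following is only a sketch: FORCE INDEX is MySQL's index-hint syntax, but whether the hint changes the plan or helps at all here is an assumption that would have to be checked.

```sql
-- Same query as above, with an index hint on R2 (experimental; may not help).
EXPLAIN
SELECT DISTINCT `Q`.`QID`
FROM `S`
  NATURAL JOIN `Q`
  NATURAL JOIN `V`
  NATURAL JOIN `R` AS `R1`
  NATURAL JOIN `W`
  JOIN `R` AS `R2` FORCE INDEX (`L`) ON (
        `R2`.`SID` = `S`.`SID`
    AND `R2`.`RID` = `R1`.`RID`
    AND `R2`.`VID` = `S`.`V_id`
    AND `R2`.`Col_1_to_3` IN (1,2)
  );
```

Comparing the key, key_len and Extra columns for the R2 row with and without the hint should show whether the `Col_1_to_3` condition can be applied from the index rather than after the primary-key lookup.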