We have two tables resembling a simple tag-record structure, as follows (in reality it is much more complex, but this is the essence of the problem):
tag (A.a) | recordId (A.b)
1         | 1
2         | 1
2         | 2
3         | 2
....
and
recordId (B.b) | recordData (B.c)
1              | 123
2              | 666
3              | 1246
The problem is fetching ordered records that have a specific tag. The obvious way to do this is with a simple join and indexes on (PK) (A.a, A.b), (A.b), (PK) (B.b), (B.b, B.c), as such:
select A.a, A.b, B.c from A join B on A.b = B.b where a = 44 order by c;
However, this gives the unpleasant result of a filesort:
+----+-------------+-------+------+---------------+---------+---------+-----------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key     | key_len | ref       | rows | Extra                                        |
+----+-------------+-------+------+---------------+---------+---------+-----------+------+----------------------------------------------+
|  1 | SIMPLE      | A     | ref  | PRIMARY,b     | PRIMARY | 4       | const     |   94 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | B     | ref  | PRIMARY,b     | b       | 4       | booli.A.b |    1 | Using index                                  |
+----+-------------+-------+------+---------------+---------+---------+-----------+------+----------------------------------------------+
Using a huge and extremely redundant "materialized view", we can get pretty decent performance, but at the cost of complicating the business logic, which we would like to avoid, especially since tables A and B are already MVs (and are needed for other queries, and in fact for the same queries using a UNION).
create temporary table C engine=innodb as (select A.a, A.b, B.c from A join B on A.b = B.b);
explain select a, b, c from C where a = 44 order by c;
A further complication is that we have conditions on the B table, such as range filters:
select A.a, A.b, B.c from A join B on A.b = B.b where a = 44 AND B.c > 678 order by c;
But we are confident that we can deal with this if the filesort problem goes away.
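For comparison, the same range filter against the materialized table C might look roughly like this. It is only a sketch: it assumes C has been built and carries the (a, c) index from our test listing, and we have not verified here that the index serves both the filter and the sort.

```sql
-- Sketch only: assumes temporary table C exists with INDEX idx_C_a_c (a, c).
EXPLAIN
SELECT a, b, c
FROM C
WHERE a = 44   -- equality on the leading index column
  AND c > 678  -- range on the next index column
ORDER BY c;    -- same column as the range, so index order may be reusable
```

This is the redundant-MV route we would prefer to avoid; it is shown only to make clear what plan we hope to get from the plain join.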
Does anyone know why the simple join in code block 3 above will not use an index for sorting, and whether we can somehow get around the problem without creating a new MV?
The following is a complete list of SQL that we use for testing.
DROP TABLE IF EXISTS A;
DROP TABLE IF EXISTS B;
DROP TABLE IF EXISTS C;

CREATE TEMPORARY TABLE A (a INT NOT NULL, b INT NOT NULL, PRIMARY KEY(a, b), INDEX idx_A_b (b)) ENGINE=INNODB;
CREATE TEMPORARY TABLE B (b INT NOT NULL, c INT NOT NULL, d VARCHAR(5000) NOT NULL DEFAULT '', PRIMARY KEY(b), INDEX idx_B_c (c), INDEX idx_B_b (b, c)) ENGINE=INNODB;

-- Fill A and B with random rows; INSERT IGNORE skips duplicate keys
DELIMITER $$
CREATE PROCEDURE prc_filler(cnt INT)
BEGIN
    DECLARE _cnt INT;
    SET _cnt = 1;
    WHILE _cnt <= cnt DO
        INSERT IGNORE INTO A SELECT RAND()*100, RAND()*10000;
        INSERT IGNORE INTO B SELECT RAND()*10000, RAND()*1000, '';
        SET _cnt = _cnt + 1;
    END WHILE;
END
$$
DELIMITER ;

START TRANSACTION;
CALL prc_filler(100000);
COMMIT;
DROP PROCEDURE prc_filler;

-- The redundant "materialized view"
CREATE TEMPORARY TABLE C ENGINE=INNODB AS (SELECT A.a, A.b, B.c FROM A JOIN B ON A.b = B.b);
ALTER TABLE C ADD (PRIMARY KEY(a, b), INDEX idx_C_a_c (a, c));

EXPLAIN EXTENDED SELECT A.a, A.b, B.c FROM A JOIN B ON A.b = B.b WHERE A.a = 44;
EXPLAIN EXTENDED SELECT A.a, A.b, B.c FROM A JOIN B ON A.b = B.b WHERE 1 ORDER BY B.c;
EXPLAIN EXTENDED SELECT A.a, A.b, B.c FROM A JOIN B ON A.b = B.b WHERE A.a = 44 ORDER BY B.c;
EXPLAIN EXTENDED SELECT a, b, c FROM C WHERE a = 44 ORDER BY c;

-- Added after Quassnoi's comments
EXPLAIN EXTENDED SELECT A.a, A.b, B.c FROM B FORCE INDEX (idx_B_c) JOIN A ON A.b = B.b WHERE A.a = 44 ORDER BY B.c;
EXPLAIN EXTENDED SELECT A.a, A.b, B.c FROM A JOIN B ON A.b = B.b WHERE A.a = 44 ORDER BY B.c LIMIT 10;
EXPLAIN EXTENDED SELECT A.a, A.b, B.c FROM B FORCE INDEX (idx_B_c) JOIN A ON A.b = B.b WHERE A.a = 44 ORDER BY B.c LIMIT 10;