I have a table with the temperature for each day (a huge table) and a table with the start and end dates of periods (a small table). I want the average temperature for each period, but the query takes a long time. Can it be improved?
NOTE: the long response time disappears after upgrading to version 5.6.19-1~exp1ubuntu2, so it may be caused by a bug in MySQL versions up to 5.6.8 (see Quassnoi's comment).
To recreate the days and periods tables with random data:
create table days (
day int not null auto_increment primary key,
temperature float not null);
insert into days values(null,rand()),(null,rand()),
(null,rand()),(null,rand()),(null,rand()),(null,rand()),
(null,rand()),(null,rand());
insert into days select null, d1.temperature
from days d1, days d2, days d3, days d4,
days d5, days d6, days d7;
create table periods(id int not null auto_increment primary key,
first int not null,
last int not null,
index(first) using btree,
index(last) using btree,
index(first,last) using btree);
insert into periods(first,last)
select floor(rand(day)*2000000), floor(rand(day)*2000000 + rand()*10)
from days limit 10;
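A quick sanity check on the generated data, as a sketch. The first insert adds 8 rows and the seven-way self-join adds 8^7 = 2,097,152 more, so days should hold 2,097,160 rows (the row figure reported by EXPLAIN is only an estimate, so it will differ slightly). Both rand(day) calls are seeded with the same value, so each period should span between 0 and 10 days. The alias span is introduced here only for illustration.

```sql
-- Expect 8 + 8^7 = 2097160 rows.
select count(*) from days;

-- Each period should cover only a handful of days
-- (last - first between 0 and 10), so the join touches few rows.
select id, first, last, last - first as span from periods;
```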
Listing all daily temperatures for each period is not a problem (returns in 1 ms):
select id, temperature
from periods join days on day >= first and day <= last;
With GROUP BY added, the query becomes quite slow (~1750 ms):
select id, avg(temperature)
from periods join days on day >= first and day <= last group by id;
Replacing >= and <= with BETWEEN speeds it up slightly (~1600 ms):
select id, avg(temperature)
from periods join days on day between first and last group by id;
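One further experiment that might be worth trying, as an untested sketch: the EXPLAIN output below shows days being read with type ALL even though PRIMARY is listed as a possible key. MySQL's FORCE INDEX hint tells the optimizer to treat a table scan as very expensive, which might steer it toward the primary key. Whether this actually changes the plan on 5.5 is an assumption, so compare the EXPLAIN output with and without the hint.

```sql
-- Assumption: hinting the primary key on days may avoid the full scan.
-- Not verified on the versions discussed here.
explain select id, avg(temperature)
from periods join days force index (primary)
on day between first and last group by id;
```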
With a correlated subquery restricted to a single period, it is fast (1 ms):
select id, (select avg(temperature)
from days where day >= first and day <= last) from periods
where id=1;
Without the WHERE clause it takes 4200 ms, which is 420 ms per period!
select id,
(select avg(temperature) from days where day >= first and day <= last)
from periods;
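A single period takes 1 ms and there are only 10 periods, so the whole query should take about 10 ms. Why doesn't it? Can MySQL be made to do that?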
EDIT: MySQL version and EXPLAIN output:
mysql> select @@version;
+-------------------------+
| @@version |
+-------------------------+
| 5.5.41-0ubuntu0.14.04.1 |
+-------------------------+
mysql> explain select id, avg(temperature) from periods join days on day >= first and day <= last group by id;
+----+-------------+---------+-------+--------------------+---------+---------+------+---------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+-------+--------------------+---------+---------+------+---------+----------------------------------------------+
| 1 | SIMPLE | periods | index | first,last,first_2 | first_2 | 8 | NULL | 10 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | days | ALL | PRIMARY | NULL | NULL | NULL | 2097596 | Using where; Using join buffer |
+----+-------------+---------+-------+--------------------+---------+---------+------+---------+----------------------------------------------+
# ALT1 without GROUP BY
mysql> explain select id, temperature from periods join days on day >= first and day <= last;
+----+-------------+---------+-------+--------------------+---------+---------+------+---------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+-------+--------------------+---------+---------+------+---------+------------------------------------------------+
| 1 | SIMPLE | periods | index | first,last,first_2 | first_2 | 8 | NULL | 10 | Using index |
| 1 | SIMPLE | days | ALL | PRIMARY | NULL | NULL | NULL | 2097596 | Range checked for each record (index map: 0x1) |
+----+-------------+---------+-------+--------------------+---------+---------+------+---------+------------------------------------------------+
mysql> explain select id, avg(temperature) from periods join days on day between first and last group by id;
+----+-------------+---------+-------+--------------------+---------+---------+------+---------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+-------+--------------------+---------+---------+------+---------+----------------------------------------------+
| 1 | SIMPLE | periods | index | first,last,first_2 | first_2 | 8 | NULL | 10 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | days | ALL | PRIMARY | NULL | NULL | NULL | 2097596 | Using where; Using join buffer |
+----+-------------+---------+-------+--------------------+---------+---------+------+---------+----------------------------------------------+
# ALT3
mysql> explain select id, (select avg(temperature) from days where day >= first and day <= last) from periods;
+----+--------------------+---------+-------+---------------+---------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+---------+-------+---------------+---------+---------+------+---------+-------------+
| 1 | PRIMARY | periods | index | NULL | first_2 | 8 | NULL | 10 | Using index |
| 2 | DEPENDENT SUBQUERY | days | ALL | PRIMARY | NULL | NULL | NULL | 2097596 | Using where |
+----+--------------------+---------+-------+---------------+---------+---------+------+---------+-------------+
mysql> explain select id, (select avg(temperature) from days where day >= first and day <= last) from periods where id = 1;
+----+--------------------+---------+-------+---------------+---------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+---------+-------+---------------+---------+---------+-------+------+-------------+
| 1 | PRIMARY | periods | const | PRIMARY | PRIMARY | 4 | const | 1 | |
| 2 | DEPENDENT SUBQUERY | days | range | PRIMARY | PRIMARY | 4 | NULL | 10 | Using where |
+----+--------------------+---------+-------+---------------+---------+---------+-------+------+-------------+
EDIT2: With a subquery in the FROM clause, the query is fast (3 ms):
mysql> explain select id,avg(temperature) from (select id,temperature from periods join days on day between first and last) as t group by id;
+----+-------------+------------+-------+--------------------+---------+---------+------+----------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+--------------------+---------+---------+------+----------+------------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 50 | Using temporary; Using filesort |
| 2 | DERIVED | periods | index | first,last,first_2 | first_2 | 8 | NULL | 10 | Using index |
| 2 | DERIVED | days | range | PRIMARY,day | PRIMARY | 4 | NULL | 5 | Range checked for each record (index map: 0x3) |
+----+-------------+------------+-------+--------------------+---------+---------+------+----------+------------------------------------------------+