I have the following tables:
- Personas
- ImpressionsPersonas [join table - Personas ManyToMany Impressions]
- Impression
My query looks like this: EXPLAIN results are attached below:
SELECT DISTINCT (Personas.id), Personas.parent_id, Personas.persona, Personas.subpersonas_count, Personas.is_subpersona, Personas.impressions_count, Personas.created, Personas.modified FROM personas as Personas INNER JOIN impressions_personas ImpressionsPersonas ON ( Personas.id = ImpressionsPersonas.persona_id ) inner JOIN impressions Impressions ON (Impressions.id = ImpressionsPersonas.impression_id AND Impressions.timestamp >= "2016-06-01 00:00:00" AND Impressions.timestamp <= "2016-07-31 00:00:00")
EXPLAIN
+----+-------------+---------------------+--------+-----------------------------------------------------------------------+-------------+---------+---------------------------------------------+------+----------+-----------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------------------+--------+-----------------------------------------------------------------------+-------------+---------+---------------------------------------------+------+----------+-----------------------+ | 1 | SIMPLE | Personas | ALL | PRIMARY | NULL | NULL | NULL | 159 | 100.00 | Using temporary | | 1 | SIMPLE | ImpressionsPersonas | ref | impression_idx,persona_idx,comp_imp_persona,comp_imp_pri,comp_per_pri | persona_idx | 8 | gen1_d2go.Personas.id | 396 | 100.00 | Distinct | | 1 | SIMPLE | Impressions | eq_ref | PRIMARY,timestamp,timestamp_id | PRIMARY | 8 | gen1_d2go.ImpressionsPersonas.impression_id | 1 | 100.00 | Using where; Distinct | +----+-------------+---------------------+--------+-----------------------------------------------------------------------+-------------+---------+---------------------------------------------+------+----------+-----------------------+ 3 rows in set, 1 warning (0.00 sec)
CREATE STATEMENT FOR STAFF
CREATE TABLE `personas` ( `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT, `parent_id` bigint(20) unsigned DEFAULT NULL, `persona` varchar(150) NOT NULL, `subpersonas_count` int(10) unsigned DEFAULT '0', `is_subpersona` tinyint(1) unsigned DEFAULT '0', `impressions_count` bigint(20) unsigned DEFAULT '0', `created` datetime DEFAULT NULL, `modified` datetime DEFAULT NULL, PRIMARY KEY (`id`), KEY `lookup` (`parent_id`,`persona`), KEY `parent_index` (`parent_id`), KEY `persona` (`persona`), KEY `persona_a_id` (`id`,`persona`), CONSTRAINT `self_referential_join_to_self` FOREIGN KEY (`parent_id`) REFERENCES `personas` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION ) ENGINE=InnoDB AUTO_INCREMENT=1049 DEFAULT CHARSET=utf8;
CREATE AN APPLICATION FOR IMPRESSIONS_PERSONAS
CREATE TABLE `impressions_personas` ( `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT, `impression_id` bigint(20) unsigned NOT NULL, `persona_id` bigint(20) unsigned NOT NULL, `created` datetime DEFAULT NULL, `modified` datetime DEFAULT NULL, PRIMARY KEY (`id`), KEY `impression_idx` (`impression_id`), KEY `persona_idx` (`persona_id`), KEY `comp_imp_persona` (`impression_id`,`persona_id`), KEY `comp_imp_pri` (`impression_id`,`id`), KEY `comp_per_pri` (`persona_id`,`id`), CONSTRAINT `impression` FOREIGN KEY (`impression_id`) REFERENCES `impressions` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION, CONSTRAINT `persona` FOREIGN KEY (`persona_id`) REFERENCES `personas` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION ) ENGINE=InnoDB AUTO_INCREMENT=19387839 DEFAULT CHARSET=utf8;
CREATE AN APPLICATION FOR IMPRESSION
CREATE TABLE `impressions` ( `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT, `device_id` bigint(20) unsigned NOT NULL, `beacon_id` bigint(20) unsigned NOT NULL, `zone_id` bigint(20) unsigned NOT NULL, `application_id` bigint(20) unsigned DEFAULT NULL, `timestamp` datetime NOT NULL, `google_place_id` bigint(20) unsigned DEFAULT NULL, `name` varchar(60) DEFAULT NULL, `lat` decimal(15,10) DEFAULT NULL, `lng` decimal(15,10) DEFAULT NULL, `personas_count` int(10) unsigned DEFAULT '0', `created` datetime DEFAULT NULL, `modified` datetime DEFAULT NULL, PRIMARY KEY (`id`), KEY `device_idx` (`device_id`), KEY `zone_idx` (`zone_id`), KEY `beacon_id_idx2` (`beacon_id`), KEY `timestamp` (`timestamp`), KEY `appid_fk_idx_idx` (`application_id`), KEY `comp_lookup` (`device_id`,`beacon_id`,`timestamp`), KEY `timestamp_id` (`timestamp`,`id`), CONSTRAINT `appid_fk_idx` FOREIGN KEY (`application_id`) REFERENCES `applications` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION, CONSTRAINT `beacon_id` FOREIGN KEY (`beacon_id`) REFERENCES `beacons` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION, CONSTRAINT `device2` FOREIGN KEY (`device_id`) REFERENCES `devices` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION, CONSTRAINT `zone_FK` FOREIGN KEY (`zone_id`) REFERENCES `zones` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION ) ENGINE=InnoDB AUTO_INCREMENT=1582724 DEFAULT CHARSET=utf8;
Now - when I run the query without DISTINCT and using COUNT(*) , it retrieves about 17,000,000 records. Running with DISTINCT gives 112 entries. I'm not sure why so many entries appear when the explanation shows only 159 and 396.
Some information about the tables:
The Personas table contains 159 entries. The ImpressionsPersonas table contains about 12.6 million, and Impressions contains about 920,000 records.
We make a selection of the Personas table and join the show through the ImpressionsPersonas connection table. There are filters that apply to the Impressions table (date in this case).
Note. Removing the date filter had a slight effect on runtime, which ranges around 120 seconds. Is there a way to filter these entries to reduce the execution time of this query?