I have the following data:
customer +---------------------------------------------------------------------------------------------+ | id | email | firstname | lastname | no_pub | customer_address | |---------------------------------------------------------------------------------------------| | 1 | martin@domain.com | Martin | Scorcese | 0 | 4 | | 2 | robert@domain.com | Robert | De Niro | 0 | 7 | | 3 | bruce@domain ;com | Bruce | Willis | 0 | 10 | +---------------------------------------------------------------------------------------------+ address +------------------------------------------+ | id | city | zipcode | |------------------------------------------| | 4 | Paris | 75001 | | 7 | Marseille | 13000 | | 10 | Bordeaux | 33000 | +------------------------------------------+ sf_geo_cities +------------------------------------------+ | id | region_id | zipcode | |------------------------------------------| | 1 | 1 | 75001 | | 2 | 2 | 13000 | | 2 | 3 | 33000 | +------------------------------------------+ sf_geo_regions +------------------------------------------+ | id | name | zipcode | |------------------------------------------| | 1 | Ile-de-France | 75001 | | 2 | Cote d'Azur | 13000 | | 2 | Gironde | 33000 | +------------------------------------------+
This is an example, it is obvious that in these tables there is much more data (300k + in the client, 400k + in the address) In the end I would like to get the following data:
+---------------------------------------------------------------------------------------------+ | id | email | firstname | lastname | city | region | |---------------------------------------------------------------------------------------------| | 1 | martin@domain.com | Martin | Scorcese | Paris | Ile-de-France | | 2 | robert@domain.com | Robert | De Niro | Marseille | Cote d'Azur | | 3 | bruce@domain ;com | Bruce | Willis | NULL | NULL | +---------------------------------------------------------------------------------------------+
I tried the following SQL query:
SELECT c.id, c.email, c.firstname, c.lastname, gc.name, gr.name FROM customer c LEFT JOIN address ad ON ad.id = c.customer_address JOIN sf_geo_cities gc ON gc.zipcode = ad.zipcode JOIN sf_geo_regions gr ON gr.id = gc.region_id WHERE no_pub = 0
But it takes a lot of time, I have to disable the mysql service. There is probably a problem with concatenating rows with multiplication.
Is there a simple query to fetch expected data?
UPDATE # 1: Here is an EXPLAIN example in the comment below:
id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE c ref PRIMARY,UNIQ_81398E097D3656A4,UNIQ_81398E09E7927C7... no_pub 1 const 136220 Using where 1 SIMPLE a eq_ref PRIMARY PRIMARY 8 evotest.c.account 1 Using index 1 SIMPLE ad eq_ref PRIMARY PRIMARY 8 evotest.c.customer_address 1 NULL 1 SIMPLE gc ALL zipcode NULL NULL NULL 38194 Range checked for each record (index map: 0x8) 1 SIMPLE gr eq_ref PRIMARY PRIMARY 4 evotest.gc.region_id 1 NULL
UPDATE # 2: Example Database Here is my database with minimal sample data.
-- -- Structure de la table `account` -- CREATE TABLE IF NOT EXISTS `account` ( `id` bigint(20) unsigned NOT NULL, `identifier` varchar(255) COLLATE utf8_bin NOT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin AUTO_INCREMENT=315688 ; -- -- Structure de la table `customer` -- CREATE TABLE IF NOT EXISTS `customer` ( `id` bigint(20) unsigned NOT NULL, `account` bigint(20) unsigned NOT NULL, `customer_address` bigint(20) unsigned DEFAULT NULL, `email` varchar(255) COLLATE utf8_bin NOT NULL, `lastname` varchar(255) COLLATE utf8_bin NOT NULL, `firstname` varchar(255) COLLATE utf8_bin NOT NULL, `no_pub` tinyint(1) NOT NULL DEFAULT '0' ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin AUTO_INCREMENT=315224 ; -- -- Structure de la table `address` -- CREATE TABLE IF NOT EXISTS `address` ( `id` bigint(20) unsigned NOT NULL, `city` varchar(64) COLLATE utf8_bin DEFAULT NULL, `street` varchar(255) COLLATE utf8_bin DEFAULT NULL, `complement` varchar(128) COLLATE utf8_bin DEFAULT NULL, `zipcode` varchar(16) COLLATE utf8_bin DEFAULT NULL, `country_id` int(11) DEFAULT NULL, `cedex` tinyint(1) NOT NULL DEFAULT '0', `abroad` tinyint(1) NOT NULL DEFAULT '0' ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin AUTO_INCREMENT=442743 ; -- -- Structure de la table `sf_geo_cities` -- CREATE TABLE IF NOT EXISTS `sf_geo_cities` ( `id` int(11) NOT NULL, `region_id` int(11) NOT NULL, `name` varchar(255) NOT NULL, `slug` varchar(255) NOT NULL, `zipcode` varchar(5) NOT NULL, `insee_code` int(11) NOT NULL, `latitude` float NOT NULL, `longitude` float NOT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=38106 ; CREATE TABLE IF NOT EXISTS `sf_geo_regions` ( `id` int(11) NOT NULL, `name` varchar(255) NOT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=25 ; CREATE TABLE IF NOT EXISTS `sf_geo_countries` ( `id` int(11) NOT NULL, `code` int(11) NOT NULL, `alpha2` varchar(2) NOT NULL, `alpha3` varchar(3) NOT NULL, `name_en` varchar(45) NOT NULL, `name_fr` varchar(45) NOT NULL, `is_default` tinyint(1) NOT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=242 ; CREATE TABLE IF NOT EXISTS `sf_user_data` ( `id` int(11) NOT NULL, `user_id` bigint(20) unsigned NOT NULL, `main_activity_type_id` int(11) DEFAULT NULL, `main_activity_id` int(11) DEFAULT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=19001 ; CREATE TABLE IF NOT EXISTS `sf_activity_types` ( `id` int(11) NOT NULL, `name` varchar(255) COLLATE utf8_unicode_ci NOT NULL, `identifier` varchar(255) COLLATE utf8_unicode_ci NOT NULL COMMENT 'Ne doit pas être modifié, il s agit de la clé dans le tableau de constantes $constants stockant les taux dans l entité Calculator' ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=3 ; CREATE TABLE IF NOT EXISTS `sf_activities` ( `id` int(11) NOT NULL, `name` varchar(255) COLLATE utf8_unicode_ci NOT NULL, `asks_for_custom` tinyint(1) NOT NULL DEFAULT '0' ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=27 ; -- -- Index pour les tables exportées -- ALTER TABLE `account` ADD PRIMARY KEY (`id`), ADD UNIQUE KEY `identifier_UNIQUE` (`identifier`); ALTER TABLE `customer` ADD PRIMARY KEY (`id`), ADD UNIQUE KEY `UNIQ_81398E097D3656A4` (`account`), ADD UNIQUE KEY `UNIQ_81398E09E7927C74` (`email`), ADD UNIQUE KEY `UNIQ_81398E091193CB3F` (`customer_address`), ADD KEY `no_pub` (`no_pub`); ALTER TABLE `address` ADD PRIMARY KEY (`id`), ADD KEY `IDX_D4E6F81F92F3E70` (`country_id`), ADD KEY `zipcode` (`zipcode`); ALTER TABLE `sf_geo_cities` ADD PRIMARY KEY (`id`), ADD KEY `IDX_B56556A198260155` (`region_id`), ADD KEY `zipcode` (`zipcode`); ALTER TABLE `sf_geo_regions` ADD PRIMARY KEY (`id`); ALTER TABLE `sf_geo_countries` ADD PRIMARY KEY (`id`), ADD KEY `IDX_F86325E277153098` (`code`), ADD KEY `IDX_F86325E2B762D672` (`alpha2`), ADD KEY `IDX_F86325E2C065E6E4` (`alpha3`); ALTER TABLE `sf_user_data` ADD PRIMARY KEY (`id`), ADD UNIQUE KEY `UNIQ_E904BFD1A76ED395` (`user_id`), ADD KEY `IDX_E904BFD12E864BE8` (`main_activity_type_id`), ADD KEY `IDX_E904BFD15543A800` (`main_activity_id`); ALTER TABLE `sf_activity_types` ADD PRIMARY KEY (`id`); ALTER TABLE `sf_activities` ADD PRIMARY KEY (`id`); -- -- Contraintes pour les tables exportées -- ALTER TABLE `customer` ADD CONSTRAINT `FK_81398E091193CB3F` FOREIGN KEY (`customer_address`) REFERENCES `address` (`id`), ADD CONSTRAINT `FK_81398E097D3656A4` FOREIGN KEY (`account`) REFERENCES `account` (`id`); ALTER TABLE `address` ADD CONSTRAINT `FK_D4E6F81F92F3E70` FOREIGN KEY (`country_id`) REFERENCES `sf_geo_countries` (`id`); ALTER TABLE `sf_geo_cities` ADD CONSTRAINT `FK_B56556A198260155` FOREIGN KEY (`region_id`) REFERENCES `sf_geo_regions` (`id`); ALTER TABLE `sf_user_data` ADD CONSTRAINT `FK_E904BFD12E864BE8` FOREIGN KEY (`main_activity_type_id`) REFERENCES `sf_activity_types` (`id`), ADD CONSTRAINT `FK_E904BFD15543A800` FOREIGN KEY (`main_activity_id`) REFERENCES `sf_activities` (`id`), ADD CONSTRAINT `FK_E904BFD1A76ED395` FOREIGN KEY (`user_id`) REFERENCES `account` (`id`); INSERT INTO `account` (`id`, `identifier`) VALUES ('1', ' martin@domain.com '), ('2', ' robert@domain.com '), ('3', ' bruce@domain.com '); INSERT INTO `sf_geo_countries` (`id`, `code`, `alpha2`, `alpha3`, `name_en`, `name_fr`, `is_default`) VALUES ('1', '1', 'FR', 'FRA', 'France', 'France', '1'); INSERT INTO `address` (`id`, `city`, `street`, `complement`, `zipcode`, `country_id`, `cedex`, `abroad`) VALUES ('1', 'Paris', '1 rue de Paris', NULL, '75001', '1', '0', '0'), ('2', 'Marseille', '1 rue de Marseille', NULL, '13000', '1', '0', '0'); INSERT INTO `customer` (`id`, `account`, `customer_address`, `email`, `lastname`, `firstname`, `no_pub`) VALUES ('1', '1', '1', ' martin@domain.com ', 'Scorcese', 'Martin', '0'), ('2', '2', '2', ' robert@domain.com ', 'De Niro', 'Robert', '0'), ('3', '3', NULL, ' bruce@domain.com ', 'Willis', 'Bruce', '0'); INSERT INTO `sf_activities` (`id`, `name`, `asks_for_custom`) VALUES ('1', 'Activity #1', '0'), ('2', 'Activity #2', '0'); INSERT INTO `sf_activity_types` (`id`, `name`, `identifier`) VALUES ('1', 'Activity Type #1', 'activity-type-1'), ('2', 'Activity Type #2', 'activity-type-2'); INSERT INTO `sf_geo_regions` (`id`, `name`) VALUES ('1', 'Ile-de-France'), ('2', 'Cote d''Azur'); INSERT INTO `sf_geo_cities` (`id`, `region_id`, `name`, `slug`, `zipcode`, `insee_code`, `latitude`, `longitude`) VALUES ('1', '1', 'Paris', 'paris', '75001', '1', '0', '0'), ('2', '2', 'Marseille', 'marseille', '13000', '2', '0', '0'); INSERT INTO `sf_user_data` (`id`, `user_id`, `main_activity_type_id`, `main_activity_id`) VALUES ('1', '1', '1', '1'), ('2', '3', '2', '2');
With this minimum amount of data, I can run the following query, which returns to me everything that I want. But this query has been going on for centuries in my real database with 300k + clients and 400k + addresses.
SELECT c.id, c.email, c.firstname, c.lastname, acttypes.name AS activity_type, act.name AS activity, gc.name AS city, gr.name AS region FROM customer c JOIN account a ON a.id = c.account LEFT JOIN sf_user_data ud ON ud.user_id = a.id LEFT JOIN sf_activity_types acttypes ON acttypes.id = ud.main_activity_type_id LEFT JOIN sf_activities act ON act.id = ud.main_activity_id LEFT JOIN address ad ON ad.id = c.customer_address LEFT JOIN sf_geo_cities gc ON gc.zipcode = ad.zipcode LEFT JOIN sf_geo_regions gr ON gr.id = gc.region_id WHERE no_pub = 0
UPDATE # 3: tracking link issue
This query is fast:
SELECT c.id ,c.email ,c.firstname ,c.lastname ,acttypes.name AS activity_type ,act.name AS activity ,ad.zipcode AS address_zipcode
This query also works fast:
SELECT c.id ,c.email ,c.firstname ,c.lastname -- ,acttypes.name AS activity_type -- ,act.name AS activity ,ad.zipcode AS address_zipcode ,gc.name AS city -- ,gr.name AS region FROM customer c JOIN account a ON a.id = c.account -- LEFT JOIN sf_user_data ud ON ud.user_id = a.id -- LEFT JOIN sf_activity_types acttypes ON acttypes.id = ud.main_activity_type_id -- LEFT JOIN sf_activities act ON act.id = ud.main_activity_id LEFT JOIN address ad ON ad.id = c.customer_address LEFT JOIN sf_geo_cities gc ON gc.zipcode = ad.zipcode -- LEFT JOIN sf_geo_regions gr ON gr.id = gc.region_id WHERE no_pub = 0
This query runs forever:
SELECT c.id ,c.email ,c.firstname ,c.lastname ,acttypes.name AS activity_type ,act.name AS activity ,ad.zipcode AS address_zipcode ,gc.name AS city
Matches on actions are OK, joining addresses is OK, but trying to use all these connections together doesn't work.