Cannot make connections work correctly

I have the following data:

customer +---------------------------------------------------------------------------------------------+ | id | email | firstname | lastname | no_pub | customer_address | |---------------------------------------------------------------------------------------------| | 1 | martin@domain.com | Martin | Scorcese | 0 | 4 | | 2 | robert@domain.com | Robert | De Niro | 0 | 7 | | 3 | bruce@domain ;com | Bruce | Willis | 0 | 10 | +---------------------------------------------------------------------------------------------+ address +------------------------------------------+ | id | city | zipcode | |------------------------------------------| | 4 | Paris | 75001 | | 7 | Marseille | 13000 | | 10 | Bordeaux | 33000 | +------------------------------------------+ sf_geo_cities +------------------------------------------+ | id | region_id | zipcode | |------------------------------------------| | 1 | 1 | 75001 | | 2 | 2 | 13000 | | 2 | 3 | 33000 | +------------------------------------------+ sf_geo_regions +------------------------------------------+ | id | name | zipcode | |------------------------------------------| | 1 | Ile-de-France | 75001 | | 2 | Cote d'Azur | 13000 | | 2 | Gironde | 33000 | +------------------------------------------+ 

This is an example, it is obvious that in these tables there is much more data (300k + in the client, 400k + in the address) In the end I would like to get the following data:

 +---------------------------------------------------------------------------------------------+ | id | email | firstname | lastname | city | region | |---------------------------------------------------------------------------------------------| | 1 | martin@domain.com | Martin | Scorcese | Paris | Ile-de-France | | 2 | robert@domain.com | Robert | De Niro | Marseille | Cote d'Azur | | 3 | bruce@domain ;com | Bruce | Willis | NULL | NULL | +---------------------------------------------------------------------------------------------+ 

I tried the following SQL query:

 SELECT c.id, c.email, c.firstname, c.lastname, gc.name, gr.name FROM customer c LEFT JOIN address ad ON ad.id = c.customer_address JOIN sf_geo_cities gc ON gc.zipcode = ad.zipcode JOIN sf_geo_regions gr ON gr.id = gc.region_id WHERE no_pub = 0 

But it takes a lot of time, I have to disable the mysql service. There is probably a problem with concatenating rows with multiplication.

Is there a simple query to fetch expected data?

UPDATE # 1: Here is an EXPLAIN example in the comment below:

 id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE c ref PRIMARY,UNIQ_81398E097D3656A4,UNIQ_81398E09E7927C7... no_pub 1 const 136220 Using where 1 SIMPLE a eq_ref PRIMARY PRIMARY 8 evotest.c.account 1 Using index 1 SIMPLE ad eq_ref PRIMARY PRIMARY 8 evotest.c.customer_address 1 NULL 1 SIMPLE gc ALL zipcode NULL NULL NULL 38194 Range checked for each record (index map: 0x8) 1 SIMPLE gr eq_ref PRIMARY PRIMARY 4 evotest.gc.region_id 1 NULL 

UPDATE # 2: Example Database Here is my database with minimal sample data.

 -- -- Structure de la table `account` -- CREATE TABLE IF NOT EXISTS `account` ( `id` bigint(20) unsigned NOT NULL, `identifier` varchar(255) COLLATE utf8_bin NOT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin AUTO_INCREMENT=315688 ; -- -- Structure de la table `customer` -- CREATE TABLE IF NOT EXISTS `customer` ( `id` bigint(20) unsigned NOT NULL, `account` bigint(20) unsigned NOT NULL, `customer_address` bigint(20) unsigned DEFAULT NULL, `email` varchar(255) COLLATE utf8_bin NOT NULL, `lastname` varchar(255) COLLATE utf8_bin NOT NULL, `firstname` varchar(255) COLLATE utf8_bin NOT NULL, `no_pub` tinyint(1) NOT NULL DEFAULT '0' ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin AUTO_INCREMENT=315224 ; -- -- Structure de la table `address` -- CREATE TABLE IF NOT EXISTS `address` ( `id` bigint(20) unsigned NOT NULL, `city` varchar(64) COLLATE utf8_bin DEFAULT NULL, `street` varchar(255) COLLATE utf8_bin DEFAULT NULL, `complement` varchar(128) COLLATE utf8_bin DEFAULT NULL, `zipcode` varchar(16) COLLATE utf8_bin DEFAULT NULL, `country_id` int(11) DEFAULT NULL, `cedex` tinyint(1) NOT NULL DEFAULT '0', `abroad` tinyint(1) NOT NULL DEFAULT '0' ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin AUTO_INCREMENT=442743 ; -- -- Structure de la table `sf_geo_cities` -- CREATE TABLE IF NOT EXISTS `sf_geo_cities` ( `id` int(11) NOT NULL, `region_id` int(11) NOT NULL, `name` varchar(255) NOT NULL, `slug` varchar(255) NOT NULL, `zipcode` varchar(5) NOT NULL, `insee_code` int(11) NOT NULL, `latitude` float NOT NULL, `longitude` float NOT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=38106 ; CREATE TABLE IF NOT EXISTS `sf_geo_regions` ( `id` int(11) NOT NULL, `name` varchar(255) NOT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=25 ; CREATE TABLE IF NOT EXISTS `sf_geo_countries` ( `id` int(11) NOT NULL, `code` int(11) NOT NULL, `alpha2` varchar(2) NOT NULL, `alpha3` varchar(3) NOT NULL, `name_en` varchar(45) NOT NULL, `name_fr` varchar(45) NOT NULL, `is_default` tinyint(1) NOT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=242 ; CREATE TABLE IF NOT EXISTS `sf_user_data` ( `id` int(11) NOT NULL, `user_id` bigint(20) unsigned NOT NULL, `main_activity_type_id` int(11) DEFAULT NULL, `main_activity_id` int(11) DEFAULT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=19001 ; CREATE TABLE IF NOT EXISTS `sf_activity_types` ( `id` int(11) NOT NULL, `name` varchar(255) COLLATE utf8_unicode_ci NOT NULL, `identifier` varchar(255) COLLATE utf8_unicode_ci NOT NULL COMMENT 'Ne doit pas être modifié, il s agit de la clé dans le tableau de constantes $constants stockant les taux dans l entité Calculator' ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=3 ; CREATE TABLE IF NOT EXISTS `sf_activities` ( `id` int(11) NOT NULL, `name` varchar(255) COLLATE utf8_unicode_ci NOT NULL, `asks_for_custom` tinyint(1) NOT NULL DEFAULT '0' ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=27 ; -- -- Index pour les tables exportées -- ALTER TABLE `account` ADD PRIMARY KEY (`id`), ADD UNIQUE KEY `identifier_UNIQUE` (`identifier`); ALTER TABLE `customer` ADD PRIMARY KEY (`id`), ADD UNIQUE KEY `UNIQ_81398E097D3656A4` (`account`), ADD UNIQUE KEY `UNIQ_81398E09E7927C74` (`email`), ADD UNIQUE KEY `UNIQ_81398E091193CB3F` (`customer_address`), ADD KEY `no_pub` (`no_pub`); ALTER TABLE `address` ADD PRIMARY KEY (`id`), ADD KEY `IDX_D4E6F81F92F3E70` (`country_id`), ADD KEY `zipcode` (`zipcode`); ALTER TABLE `sf_geo_cities` ADD PRIMARY KEY (`id`), ADD KEY `IDX_B56556A198260155` (`region_id`), ADD KEY `zipcode` (`zipcode`); ALTER TABLE `sf_geo_regions` ADD PRIMARY KEY (`id`); ALTER TABLE `sf_geo_countries` ADD PRIMARY KEY (`id`), ADD KEY `IDX_F86325E277153098` (`code`), ADD KEY `IDX_F86325E2B762D672` (`alpha2`), ADD KEY `IDX_F86325E2C065E6E4` (`alpha3`); ALTER TABLE `sf_user_data` ADD PRIMARY KEY (`id`), ADD UNIQUE KEY `UNIQ_E904BFD1A76ED395` (`user_id`), ADD KEY `IDX_E904BFD12E864BE8` (`main_activity_type_id`), ADD KEY `IDX_E904BFD15543A800` (`main_activity_id`); ALTER TABLE `sf_activity_types` ADD PRIMARY KEY (`id`); ALTER TABLE `sf_activities` ADD PRIMARY KEY (`id`); -- -- Contraintes pour les tables exportées -- ALTER TABLE `customer` ADD CONSTRAINT `FK_81398E091193CB3F` FOREIGN KEY (`customer_address`) REFERENCES `address` (`id`), ADD CONSTRAINT `FK_81398E097D3656A4` FOREIGN KEY (`account`) REFERENCES `account` (`id`); ALTER TABLE `address` ADD CONSTRAINT `FK_D4E6F81F92F3E70` FOREIGN KEY (`country_id`) REFERENCES `sf_geo_countries` (`id`); ALTER TABLE `sf_geo_cities` ADD CONSTRAINT `FK_B56556A198260155` FOREIGN KEY (`region_id`) REFERENCES `sf_geo_regions` (`id`); ALTER TABLE `sf_user_data` ADD CONSTRAINT `FK_E904BFD12E864BE8` FOREIGN KEY (`main_activity_type_id`) REFERENCES `sf_activity_types` (`id`), ADD CONSTRAINT `FK_E904BFD15543A800` FOREIGN KEY (`main_activity_id`) REFERENCES `sf_activities` (`id`), ADD CONSTRAINT `FK_E904BFD1A76ED395` FOREIGN KEY (`user_id`) REFERENCES `account` (`id`); INSERT INTO `account` (`id`, `identifier`) VALUES ('1', ' martin@domain.com '), ('2', ' robert@domain.com '), ('3', ' bruce@domain.com '); INSERT INTO `sf_geo_countries` (`id`, `code`, `alpha2`, `alpha3`, `name_en`, `name_fr`, `is_default`) VALUES ('1', '1', 'FR', 'FRA', 'France', 'France', '1'); INSERT INTO `address` (`id`, `city`, `street`, `complement`, `zipcode`, `country_id`, `cedex`, `abroad`) VALUES ('1', 'Paris', '1 rue de Paris', NULL, '75001', '1', '0', '0'), ('2', 'Marseille', '1 rue de Marseille', NULL, '13000', '1', '0', '0'); INSERT INTO `customer` (`id`, `account`, `customer_address`, `email`, `lastname`, `firstname`, `no_pub`) VALUES ('1', '1', '1', ' martin@domain.com ', 'Scorcese', 'Martin', '0'), ('2', '2', '2', ' robert@domain.com ', 'De Niro', 'Robert', '0'), ('3', '3', NULL, ' bruce@domain.com ', 'Willis', 'Bruce', '0'); INSERT INTO `sf_activities` (`id`, `name`, `asks_for_custom`) VALUES ('1', 'Activity #1', '0'), ('2', 'Activity #2', '0'); INSERT INTO `sf_activity_types` (`id`, `name`, `identifier`) VALUES ('1', 'Activity Type #1', 'activity-type-1'), ('2', 'Activity Type #2', 'activity-type-2'); INSERT INTO `sf_geo_regions` (`id`, `name`) VALUES ('1', 'Ile-de-France'), ('2', 'Cote d''Azur'); INSERT INTO `sf_geo_cities` (`id`, `region_id`, `name`, `slug`, `zipcode`, `insee_code`, `latitude`, `longitude`) VALUES ('1', '1', 'Paris', 'paris', '75001', '1', '0', '0'), ('2', '2', 'Marseille', 'marseille', '13000', '2', '0', '0'); INSERT INTO `sf_user_data` (`id`, `user_id`, `main_activity_type_id`, `main_activity_id`) VALUES ('1', '1', '1', '1'), ('2', '3', '2', '2'); 

With this minimum amount of data, I can run the following query, which returns to me everything that I want. But this query has been going on for centuries in my real database with 300k + clients and 400k + addresses.

 SELECT c.id, c.email, c.firstname, c.lastname, acttypes.name AS activity_type, act.name AS activity, gc.name AS city, gr.name AS region FROM customer c JOIN account a ON a.id = c.account LEFT JOIN sf_user_data ud ON ud.user_id = a.id LEFT JOIN sf_activity_types acttypes ON acttypes.id = ud.main_activity_type_id LEFT JOIN sf_activities act ON act.id = ud.main_activity_id LEFT JOIN address ad ON ad.id = c.customer_address LEFT JOIN sf_geo_cities gc ON gc.zipcode = ad.zipcode LEFT JOIN sf_geo_regions gr ON gr.id = gc.region_id WHERE no_pub = 0 

UPDATE # 3: tracking link issue

This query is fast:

 SELECT c.id ,c.email ,c.firstname ,c.lastname ,acttypes.name AS activity_type ,act.name AS activity ,ad.zipcode AS address_zipcode -- ,gc.name AS city -- ,gr.name AS region FROM customer c JOIN account a ON a.id = c.account LEFT JOIN sf_user_data ud ON ud.user_id = a.id LEFT JOIN sf_activity_types acttypes ON acttypes.id = ud.main_activity_type_id LEFT JOIN sf_activities act ON act.id = ud.main_activity_id LEFT JOIN address ad ON ad.id = c.customer_address -- LEFT JOIN sf_geo_cities gc ON gc.zipcode = ad.zipcode -- LEFT JOIN sf_geo_regions gr ON gr.id = gc.region_id WHERE no_pub = 0 

This query also works fast:

 SELECT c.id ,c.email ,c.firstname ,c.lastname -- ,acttypes.name AS activity_type -- ,act.name AS activity ,ad.zipcode AS address_zipcode ,gc.name AS city -- ,gr.name AS region FROM customer c JOIN account a ON a.id = c.account -- LEFT JOIN sf_user_data ud ON ud.user_id = a.id -- LEFT JOIN sf_activity_types acttypes ON acttypes.id = ud.main_activity_type_id -- LEFT JOIN sf_activities act ON act.id = ud.main_activity_id LEFT JOIN address ad ON ad.id = c.customer_address LEFT JOIN sf_geo_cities gc ON gc.zipcode = ad.zipcode -- LEFT JOIN sf_geo_regions gr ON gr.id = gc.region_id WHERE no_pub = 0 

This query runs forever:

 SELECT c.id ,c.email ,c.firstname ,c.lastname ,acttypes.name AS activity_type ,act.name AS activity ,ad.zipcode AS address_zipcode ,gc.name AS city -- ,gr.name AS region FROM customer c JOIN account a ON a.id = c.account LEFT JOIN sf_user_data ud ON ud.user_id = a.id LEFT JOIN sf_activity_types acttypes ON acttypes.id = ud.main_activity_type_id LEFT JOIN sf_activities act ON act.id = ud.main_activity_id LEFT JOIN address ad ON ad.id = c.customer_address LEFT JOIN sf_geo_cities gc ON gc.zipcode = ad.zipcode -- LEFT JOIN sf_geo_regions gr ON gr.id = gc.region_id WHERE no_pub = 0 

Matches on actions are OK, joining addresses is OK, but trying to use all these connections together doesn't work.

+4
source share
1 answer

As mentioned in my comment, it is important to indicate how many records are returned. Too many records in one of the statements can cause a problem, and you might be better off posting the results by selecting n records at a time.

To troubleshoot a query, create one for a connection to try to determine when performance falls on a rock. Start with:

 SELECT c.id ,c.email ,c.firstname ,c.lastname -- ,acttypes.name AS activity_type -- ,act.name AS activity -- ,gc.name AS city -- ,gr.name AS region FROM customer c -- JOIN account a ON a.id = c.account -- LEFT JOIN sf_user_data ud ON ud.user_id = a.id -- LEFT JOIN sf_activity_types acttypes ON acttypes.id = ud.main_activity_type_id -- LEFT JOIN sf_activities act ON act.id = ud.main_activity_id -- LEFT JOIN address ad ON ad.id = c.customer_address -- LEFT JOIN sf_geo_cities gc ON gc.zipcode = ad.zipcode -- LEFT JOIN sf_geo_regions gr ON gr.id = gc.region_id WHERE no_pub = 0 

Then do not comment one connection at a time until your request is very slow. Analyze this query and see if it can be improved with indexes. Then continue not to comment on the connections until you get the full request.

+1
source

All Articles