Removing duplicates from multiple join result in tables with different columns in MySQL

Question

I am trying to write a single statement that pulls data from three related tables (they all share a common key column). My problem is that MySQL returns the product of two of the tables, which makes the result much larger than I want. Each table has a different number of columns, and I would prefer not to use UNION in any case, because the data in each table is separate.

Here is an example:

Table X is the main table and has fields A, B.

Table Y has fields A, C, D.

Table Z has fields A, E, F, G.

My ideal result would be of the form:

 A1 B1 C1 D1 E1 F1 G1
 A1 B2 C2 D2 00 00 00
 A2 B3 C3 D3 E2 F2 G2
 A2 B4 00 00 E3 F3 G3

etc...

Here is the simplest SQL I tried that shows my problem (i.e. it returns the product Y * Z for each value of A):

 SELECT DISTINCT * FROM X LEFT JOIN Y USING (A) LEFT JOIN Z USING (A)

I tried adding a GROUP BY clause on the fields in Y and Z. But if I group by only one column, it returns only the first result matched with each unique value in that column (that is: A1 C1 E1, A1 C2 E1, A1 C3 E1). And if I group by two columns, it returns the product of the two tables again.

I also tried putting several SELECT subqueries inside the query and joining the resulting tables, but again I got the product of those tables.

Basically, I want to combine the results of three SELECT statements into one result without getting every combination of the data. If I have to, I can resort to running several queries. However, since the tables all share a common index, I believe there should be a way to do this in one query that I am missing.

Thanks for any help.

sql join mysql distinct combinations

Alfred Apr 24 '11 at 1:34

5 answers

Andrew Lazarus · Answer 1 · 2011-04-24T01:44:09+0000

I don't know if I understand your problem, but why are you using LEFT JOIN? This sounds more like a case for INNER JOIN. Nothing here requires a UNION.

[Edit] Well, I think I see now what you want. I have never tried what I am suggesting, and what's more, some DBs do not support it (yet), but I think you want a window function.

 WITH Y2 AS (SELECT Y.*, ROW_NUMBER() OVER (PARTITION BY A) AS YROW FROM Y),
      Z2 AS (SELECT Z.*, ROW_NUMBER() OVER (PARTITION BY A) AS ZROW FROM Z)
 SELECT COALESCE(Y2.A, Z2.A) AS A, Y2.C, Y2.D, Z2.E, Z2.F, Z2.G
 FROM Y2
 FULL OUTER JOIN Z2 ON Y2.A = Z2.A AND YROW = ZROW;

The idea is to print a list with as few rows as possible, right? So if A1 has 10 entries in Y and 7 in Z, we get 10 rows, 3 of them with NULL in the Z fields. This works in Postgres. I do not believe that this syntax is available in MySQL.

Y:

  a | d | c
 ---+---+----
  1 | 1 | -1
  1 | 2 | -1
  2 | 0 | -1

Z:

  a | f | g | e
 ---+---+---+---
  1 | 9 | 9 | 0
  2 | 1 | 1 | 0
  3 | 0 | 1 | 0

The output of the expression above:

  a | c  | d | e | f | g
 ---+----+---+---+---+---
  1 | -1 | 1 | 0 | 9 | 9
  1 | -1 | 2 |   |   |
  2 | -1 | 0 | 0 | 1 | 1
  3 |    |   | 0 | 0 | 1
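MySQL has no FULL OUTER JOIN, so the query above cannot be run there as written, although MySQL 8.0 and later do accept CTEs and ROW_NUMBER(). One possible workaround, sketched here rather than taken from the answer, is to collect every (a, row number) slot from both tables and LEFT JOIN each table onto that list. The sketch runs against SQLite through Python's sqlite3 module only so that it is self-contained. The names `slots` and `rn` are made up for illustration, and the ORDER BY inside each window is an added assumption to make the pairing deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE y (a INT, d INT, c INT);
    CREATE TABLE z (a INT, f INT, g INT, e INT);
    INSERT INTO y VALUES (1, 1, -1), (1, 2, -1), (2, 0, -1);
    INSERT INTO z VALUES (1, 9, 9, 0), (2, 1, 1, 0), (3, 0, 1, 0);
""")

# Number the rows of each table within every value of a, then pair the
# n-th y row with the n-th z row for the same a.
PAIR_ROWS_SQL = """
WITH y2 AS (SELECT y.*, ROW_NUMBER() OVER (PARTITION BY a ORDER BY d) AS rn FROM y),
     z2 AS (SELECT z.*, ROW_NUMBER() OVER (PARTITION BY a ORDER BY f) AS rn FROM z),
     -- every (a, rn) slot present on either side; stands in for FULL OUTER JOIN
     slots AS (SELECT a, rn FROM y2 UNION SELECT a, rn FROM z2)
SELECT slots.a, y2.c, y2.d, z2.e, z2.f, z2.g
FROM slots
LEFT JOIN y2 ON y2.a = slots.a AND y2.rn = slots.rn
LEFT JOIN z2 ON z2.a = slots.a AND z2.rn = slots.rn
ORDER BY slots.a, slots.rn
"""

paired_rows = conn.execute(PAIR_ROWS_SQL).fetchall()
for row in paired_rows:
    print(row)
```

On the sample data this yields the same four rows as the Postgres output shown above, with None where that output has blanks.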

Christo · Answer 2 · 2011-04-24T02:00:17+0000

Yes, UNION is not the answer.

I think you want:

 SELECT * FROM x JOIN y ON x.a = y.a JOIN z ON x.a = z.a GROUP BY x.a;

HEARTBEAT · Answer 3 · 2011-07-27T04:57:43+0000

I found another way by adapting the query from this post; it can be used to join two tables on their unique identifiers.
Try the following:

 create table y ( a int, d int, c int );
 create table z ( a int, f int, g int, e int );
 go

 insert into y values (1, 1, -1);
 insert into y values (1, 2, -1);
 insert into y values (2, 0, -1);
 insert into z values (1, 9, 9, 0);
 insert into z values (2, 1, 1, 0);
 insert into z values (3, 0, 1, 0);
 go

 select * from y;
 select * from z;

 -- the statement before WITH must end with a semicolon
 WITH Y2 AS (SELECT Y.*, ROW_NUMBER() OVER (ORDER BY A) AS YROW FROM Y where A = 3),
      Z2 AS (SELECT Z.*, ROW_NUMBER() OVER (ORDER BY A) AS ZROW FROM Z where A = 3)
 SELECT COALESCE(Y2.A, Z2.A) AS A, Y2.C, Y2.D, Z2.E, Z2.F, Z2.G
 FROM Y2
 FULL OUTER JOIN Z2 ON Y2.A = Z2.A AND YROW = ZROW;

Morg. · Answer 4 · 2011-09-22T16:03:52+0000

PostgreSQL is always the right answer to most MySQL problems, but your problem could be solved as follows:

The problem you ran into was that you had two left joins chained together (in the examples below, A is the main table and X and Y are the two detail tables), i.e.

A LEFT JOIN X LEFT JOIN Y, which inevitably gives you A x X x Y where you wanted (A x X) x (A x Y)

A simple solution could be:

 select x.A, x.B, x.C, x.D, y.E, y.F, y.G
 from (SELECT A.A, A.B, X.C, X.D FROM A LEFT JOIN X ON A.A = X.A) x
 INNER JOIN (SELECT A.A, Y.E, Y.F, Y.G FROM A LEFT JOIN Y ON A.A = Y.A) y
   ON x.A = y.A

Test Details:

 CREATE TABLE A (A varchar(3), B varchar(3));
 CREATE TABLE X (A varchar(3), C varchar(3), D varchar(3));
 CREATE TABLE Y (A varchar(3), E varchar(3), F varchar(3), G varchar(3));

 INSERT INTO A(A,B) VALUES ('A1','B1'), ('A2','B2'), ('A3','B3'), ('A4','B4');
 INSERT INTO X(A,C,D) VALUES ('A1','C1','D1'), ('A3','C3','D3'), ('A4','C4','D4');
 INSERT INTO Y(A,E,F,G) VALUES ('A1','E1','F1','G1'), ('A2','E2','F2','G2'), ('A4','E4','F4','G4');

 select x.A, x.B, x.C, x.D, y.E, y.F, y.G
 from (SELECT A.A, A.B, X.C, X.D FROM A LEFT JOIN X ON A.A = X.A) x
 INNER JOIN (SELECT A.A, Y.E, Y.F, Y.G FROM A LEFT JOIN Y ON A.A = Y.A) y
   ON x.A = y.A

In conclusion: yes, MySQL has many, many problems, but this is not one of them. Most of its problems concern more advanced features.

ypercubeᵀᴹ · Answer 5 · 2011-09-22T16:18:23+0000

If I understand correctly, table X has a 1:n relationship with tables Y and Z. So, the behavior you see is expected. The result you get is a kind of cross-product.

If X contains person data, Y has address data for these persons, and Z has phone data for these persons, then naturally your query shows all combinations of addresses and phones for each person. If someone has 3 addresses and 4 phones in your tables, then the query returns 12 rows for that person.
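The 3 x 4 = 12 arithmetic can be checked with a small sketch. It assumes one person with 3 addresses and 4 phones; the column names `name`, `address` and `phone` are made up for illustration, and SQLite (through Python's sqlite3 module) stands in for MySQL only to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE x (a INTEGER, name TEXT);     -- persons (hypothetical columns)
    CREATE TABLE y (a INTEGER, address TEXT);  -- addresses
    CREATE TABLE z (a INTEGER, phone TEXT);    -- phones
    INSERT INTO x VALUES (1, 'person 1');
    INSERT INTO y VALUES (1, 'address 1'), (1, 'address 2'), (1, 'address 3');
    INSERT INTO z VALUES (1, 'phone 1'), (1, 'phone 2'), (1, 'phone 3'), (1, 'phone 4');
""")

# Same double LEFT JOIN shape as in the question, with and without DISTINCT.
JOIN_SQL = """
SELECT {distinct} x.a, x.name, y.address, z.phone
FROM x
LEFT JOIN y ON y.a = x.a
LEFT JOIN z ON z.a = x.a
"""

all_rows = conn.execute(JOIN_SQL.format(distinct="")).fetchall()
distinct_rows = conn.execute(JOIN_SQL.format(distinct="DISTINCT")).fetchall()
print(len(all_rows), len(distinct_rows))  # prints: 12 12
```

DISTINCT does not shrink the result because each of the 12 address/phone combinations is a different row, so there are no duplicates for it to remove.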

You could avoid this by using a UNION query or by issuing two queries:

 SELECT X.* , Y.* FROM X LEFT JOIN Y ON Y.A = X.A

and

 SELECT X.* , Z.* FROM X LEFT JOIN Z ON Z.A = X.A
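The answer does not spell out the UNION variant. One possible shape, offered only as a sketch, stacks the two queries with UNION ALL and pads the columns that belong to the other table with NULL. The table layout follows the question (X has A, B; Y has A, C, D; Z has A, E, F, G), the sample rows are invented, and SQLite through Python's sqlite3 module is used only to keep the example runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE x (a TEXT, b TEXT);
    CREATE TABLE y (a TEXT, c TEXT, d TEXT);
    CREATE TABLE z (a TEXT, e TEXT, f TEXT, g TEXT);
    INSERT INTO x VALUES ('A1', 'B1');
    INSERT INTO y VALUES ('A1', 'C1', 'D1'), ('A1', 'C2', 'D2');
    INSERT INTO z VALUES ('A1', 'E1', 'F1', 'G1'), ('A1', 'E2', 'F2', 'G2'),
                         ('A1', 'E3', 'F3', 'G3');
""")

# Each half is one of the two separate queries; NULLs pad the columns that
# belong to the other table so both halves have the same column list.
STACKED_SQL = """
SELECT x.a, x.b, y.c, y.d, NULL AS e, NULL AS f, NULL AS g
FROM x LEFT JOIN y ON y.a = x.a
UNION ALL
SELECT x.a, x.b, NULL, NULL, z.e, z.f, z.g
FROM x LEFT JOIN z ON z.a = x.a
"""

stacked_rows = conn.execute(STACKED_SQL).fetchall()
for row in stacked_rows:
    print(row)
```

With 2 rows in Y and 3 in Z for the same key, this returns 2 + 3 = 5 rows instead of the 2 x 3 = 6 combinations produced by joining all three tables at once. The Y and Z rows stay on separate lines rather than being paired side by side.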
