You need to create an expression in the outer join that returns only one row

Question

I am building a really complex dynamic SQL query. It should return one row per user, but now I need to join in one more table. I am using an outer join to make sure I get at least one row back (and I can check for NULL to see whether there is data in that table), but I also have to make sure I get only one row from the outer part of the join when the second table has several rows for the user. So far I have come up with this (Sybase):

select a.user_id from table1 a, table2 b where a.user_id = b.user_id and a.sub_id = (select min(c.sub_id) from table2 c where b.sub_id = c.sub_id)

The subquery finds the minimum value in the one-to-many table for this particular user.

This works, but I fear the nastiness that comes with correlated subqueries when tables 1 and 2 get very large. Is there a better way? I have been trying to come up with a way to do this with joins, but I do not see it. Also, saying “where rowcount = 1” or “top 1” does not help me, because I am not trying to fix the query above; I am ADDING it to an already complex query.

sql subquery

stu Oct 27 '08 at 13:52

6 answers

Dónal · Answer 1 · 2008-10-27T14:37:29+0000

In MySQL, you can guarantee that a query returns no more than X rows with

 select * from foo where bar = 1 limit X;

Unfortunately, I'm fairly sure this is a MySQL-specific extension to SQL. However, a Google search for something like "mysql sybase limit" might turn up an equivalent for Sybase.

Tom H · Answer 2 · 2008-10-27T15:11:37+0000

A few quick points:

You need to have clear business rules. If the query returns more than one row, then you need to think about why (beyond just “it is a 1:many relationship”; WHY is it a 1:many relationship?). You should come up with a business solution rather than just using “min” because it gives you one row. The business solution might simply be “take the first one”, in which case min might be the answer, but you need to make sure that is a conscious decision.
You should really use ANSI syntax for joins. Not just because it is standard, but because the syntax you have does not actually do what you think it does (it is not an outer join), and some things are simply impossible to do with the syntax you have.

Assuming you end up using the MIN solution, here is one possible solution without a subquery. You should test it against the various other solutions to make sure they give equivalent results and to see which one performs best.

 SELECT
   a.user_id,
   b.*
 FROM
   dbo.Table_1 a
     LEFT OUTER JOIN dbo.Table_2 b ON b.user_id = a.user_id AND b.sub_id = a.sub_id
     LEFT OUTER JOIN dbo.Table_2 c ON c.user_id = a.user_id AND c.sub_id < b.sub_id
 WHERE
   c.user_id IS NULL

You will need to test this to see whether it really gives you what you want, and you may need to tweak it, but the basic idea is to use the second LEFT OUTER JOIN to make sure that no rows exist with a lower sub_id than the one found in the first LEFT OUTER JOIN (if one is found). You can adjust the criteria in the second LEFT OUTER JOIN depending on the final business rules.
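To see the anti-join idea in action, here is a minimal sketch that runs a simplified version of the query against made-up data. It assumes Python's built-in sqlite3 module as a stand-in for Sybase, the sample rows are hypothetical, and the b.sub_id = a.sub_id condition is dropped because the sample table1 has no sub_id column.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for Sybase in this sketch.
# User 1 has two table2 rows, user 2 has one, user 3 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (user_id INTEGER PRIMARY KEY);
    CREATE TABLE table2 (user_id INTEGER, sub_id INTEGER);
    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES (1, 10), (1, 20), (2, 30);
""")

# The second outer join looks for a table2 row with a lower sub_id.
# Keeping only rows where no such row exists (c.user_id IS NULL)
# leaves the minimum sub_id per user, or a NULL-extended row.
anti_join_rows = conn.execute("""
    SELECT a.user_id, b.sub_id
    FROM table1 a
    LEFT OUTER JOIN table2 b ON b.user_id = a.user_id
    LEFT OUTER JOIN table2 c ON c.user_id = a.user_id AND c.sub_id < b.sub_id
    WHERE c.user_id IS NULL
    ORDER BY a.user_id
""").fetchall()

print(anti_join_rows)  # [(1, 10), (2, 30), (3, None)]
```

Every user comes back exactly once, and the user without table2 rows still appears with a NULL sub_id. Performance on large tables would still need to be measured on the real server.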

Ady · Answer 3 · 2008-10-27T13:55:53+0000

Your example may be too simplified, but I would use a GROUP BY:

  SELECT
   a.user_id 
 FROM 
   table1 a
     LEFT OUTER JOIN table2 b ON (a.user_id = b.user_id)
 GROUP BY
   a.user_id

I am afraid the only way would then be to use a nested query:

The difference between this query and your example is that the “helper table” is generated only once, whereas in your example a “helper table” is created for each row in table1 (though this may depend on the compiler, so you may want to use a query analyzer to check performance).

 SELECT
   a.user_id,
   b.sub_id
 FROM
   table1 a
     LEFT OUTER JOIN (
       SELECT
         user_id,
         min(sub_id) AS sub_id
       FROM
         table2
       GROUP BY
         user_id
     ) b ON (a.user_id = b.user_id)
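A quick way to check the nested-query approach is to run it on a few rows. The sketch below assumes Python's built-in sqlite3 module as a stand-in for Sybase and uses hypothetical sample data; it is only meant to show the shape of the result, not how the query performs.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for Sybase in this sketch.
# User 1 has two table2 rows, user 2 has one, user 3 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (user_id INTEGER PRIMARY KEY);
    CREATE TABLE table2 (user_id INTEGER, sub_id INTEGER);
    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES (1, 10), (1, 20), (2, 30);
""")

# The derived table collapses table2 to one row per user first,
# so the outer join can never multiply the rows of table1.
derived_rows = conn.execute("""
    SELECT a.user_id, b.sub_id
    FROM table1 a
    LEFT OUTER JOIN (
        SELECT user_id, MIN(sub_id) AS sub_id
        FROM table2
        GROUP BY user_id
    ) b ON a.user_id = b.user_id
    ORDER BY a.user_id
""").fetchall()

print(derived_rows)  # [(1, 10), (2, 30), (3, None)]
```

Because the grouping happens inside the derived table, the outer query stays one row per user even when table2 holds several rows for that user.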

Also, if your query becomes quite complex, I would use temporary tables to simplify the code. It may cost a little more in processing, but it will simplify your queries.

Example Temp Table:

 -- Step 1: the users of interest
 SELECT
   user_id
 INTO
   #table1
 FROM
   table1
 WHERE
   .....

 -- Step 2: one row per user with the lowest sub_id
 SELECT
   a.user_id,
   min(b.sub_id) AS sub_id
 INTO
   #table2
 FROM
   #table1 a
     INNER JOIN table2 b ON (a.user_id = b.user_id)
 GROUP BY
   a.user_id

 -- Step 3: outer join the two temp tables
 SELECT
   a.*,
   b.sub_id
 FROM
   #table1 a
     LEFT OUTER JOIN #table2 b ON (a.user_id = b.user_id)

Tony Andrews · Answer 4 · 2008-10-27T14:21:11+0000

What about:

 select a.user_id from table1 a where exists (select null from table2 b where a.user_id = b.user_id )

James Curran · Answer 5 · 2008-10-27T14:51:34+0000

First of all, I believe the query you were trying to write in your example is the following:

 select a.user_id from table1 a, table2 b where a.user_id = b.user_id and b.sub_id = (select min(c.sub_id) from table2 c where b.user_id = c.user_id)

Except that you wanted an outer join (I think someone edited out the Oracle syntax).

 select a.user_id from table1 a left outer join table2 b on a.user_id = b.user_id where b.sub_id = (select min(c.sub_id) from table2 c where b.user_id = c.user_id)
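One thing worth checking with the outer-join version: a WHERE condition on b.sub_id is not true for the NULL-extended rows, so users with no table2 rows are filtered out again. The sketch below illustrates this on hypothetical sample data, assuming Python's built-in sqlite3 module as a stand-in for Sybase. The added b.user_id IS NULL test is one possible adjustment, not part of the original answer.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for Sybase in this sketch.
# User 1 has two table2 rows, user 2 has one, user 3 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (user_id INTEGER PRIMARY KEY);
    CREATE TABLE table2 (user_id INTEGER, sub_id INTEGER);
    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES (1, 10), (1, 20), (2, 30);
""")

# The query as written in the answer: the WHERE clause compares
# b.sub_id, which is NULL for users without a table2 row.
where_rows = conn.execute("""
    SELECT a.user_id, b.sub_id
    FROM table1 a
    LEFT OUTER JOIN table2 b ON a.user_id = b.user_id
    WHERE b.sub_id = (SELECT MIN(c.sub_id) FROM table2 c
                      WHERE b.user_id = c.user_id)
    ORDER BY a.user_id
""").fetchall()

# Assumed adjustment: explicitly let the NULL-extended rows through.
null_safe_rows = conn.execute("""
    SELECT a.user_id, b.sub_id
    FROM table1 a
    LEFT OUTER JOIN table2 b ON a.user_id = b.user_id
    WHERE b.user_id IS NULL
       OR b.sub_id = (SELECT MIN(c.sub_id) FROM table2 c
                      WHERE b.user_id = c.user_id)
    ORDER BY a.user_id
""").fetchall()

print(where_rows)      # [(1, 10), (2, 30)]
print(null_safe_rows)  # [(1, 10), (2, 30), (3, None)]
```

Both variants still use the correlated subquery, so they do not address the performance concern from the question; they only differ in whether users without table2 rows are kept.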

neonski · Answer 6 · 2008-10-27T15:16:10+0000

Well, you already have a query that works. If you are concerned about speed, you could either:

Add a field to table2 that marks which sub_id is the "first" one, or
Track the primary key of table2 in table1 or in another table
