Inner join a table to itself

I have a table that uses two identifying columns; let's call them id and userid. The id is unique in every record, and the userid is unique to the user but appears in many records.

What I need to do is get a record for the user by userid, and then join that record to the first record we have for that user. The logic of the query is as follows:

SELECT v1.id, MIN(v2.id) AS entryid, v1.userid FROM views v1 INNER JOIN views v2 ON v1.userid = v2.userid 

I am hoping that I do not need to join the table to a subquery that handles the MIN() part of the code, as that seems rather slow.

3 answers

I assume (it is not entirely clear) that you want to find, for each user, the row in the table that has the minimum id, so one row per user.

In that case, you can use a subquery (a derived table) and join it to the table:

 SELECT v.* FROM views AS v JOIN ( SELECT userid, MIN(id) AS entryid FROM views GROUP BY userid ) AS vm ON vm.userid = v.userid AND vm.entryid = v.id ; 

The above can also be written with a Common Table Expression (CTE), if you prefer them:

 ; WITH vm AS ( SELECT userid, MIN(id) AS entryid FROM views GROUP BY userid ) SELECT v.* FROM views AS v JOIN vm ON vm.userid = v.userid AND vm.entryid = v.id ; 

Both will be quite efficient with an index on (userid, id).
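As a rough sketch of the derived-table query together with the suggested index, the following uses Python's built-in sqlite3 module. The sample rows and the index name ix_views_userid_id are assumptions made up for illustration, and SQLite only stands in for whatever database you actually run.

```python
import sqlite3

# In-memory database with hypothetical sample data (not from the question).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE views (id INTEGER PRIMARY KEY, userid INTEGER NOT NULL)")
conn.executemany(
    "INSERT INTO views (id, userid) VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 10), (4, 20), (5, 30)],
)
# Composite index on (userid, id); the index name is an assumption.
conn.execute("CREATE INDEX ix_views_userid_id ON views (userid, id)")

# Join each row to the per-user minimum id, keeping only the first row per user.
first_rows = conn.execute(
    """
    SELECT v.id, v.userid
    FROM views AS v
    JOIN (SELECT userid, MIN(id) AS entryid
          FROM views
          GROUP BY userid) AS vm
      ON vm.userid = v.userid AND vm.entryid = v.id
    ORDER BY v.userid
    """
).fetchall()
print(first_rows)
```

With this sample data, each userid comes back exactly once, paired with its lowest id.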

In SQL Server, you can also write this using the ROW_NUMBER() window function:

 ; WITH viewsRN AS ( SELECT * , ROW_NUMBER() OVER (PARTITION BY userid ORDER BY id) AS rn FROM views ) SELECT * /* replace * with an explicit column list to skip the "rn" column */ FROM viewsRN WHERE rn = 1 ; 
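The same window-function idea can be tried outside SQL Server. The sketch below runs it through Python's sqlite3 module, assuming the bundled SQLite is version 3.25 or newer (the first release with window functions); the sample rows are invented for illustration.

```python
import sqlite3

# In-memory database with hypothetical sample data (not from the question).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE views (id INTEGER PRIMARY KEY, userid INTEGER NOT NULL)")
conn.executemany(
    "INSERT INTO views (id, userid) VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 10), (4, 20), (5, 30)],
)

# Number the rows of each user by ascending id and keep only row number 1.
# The outer SELECT lists columns explicitly so that "rn" is not returned.
first_by_rn = conn.execute(
    """
    WITH viewsRN AS (
        SELECT id, userid,
               ROW_NUMBER() OVER (PARTITION BY userid ORDER BY id) AS rn
        FROM views
    )
    SELECT id, userid
    FROM viewsRN
    WHERE rn = 1
    ORDER BY userid
    """
).fetchall()
print(first_by_rn)
```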

Well, to use the MIN function alongside non-aggregated columns, you need a GROUP BY clause. That is possible with the query you have ... (EDIT based on more info)

 SELECT MIN(v2.id) AS entryid, v1.id, v1.userid FROM views v1 INNER JOIN views v2 ON v1.userid = v2.userid GROUP BY v1.id, v1.userid 

... however, if this is just a simplified example and you want to pull more data with this query, it will quickly become an unworkable solution.

What you seem to want is a list of all user data in this view, with a link on each row leading back to the first record that exists for the same user. The query above will give you that, but there are much simpler ways to determine the first entry for each user:
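To see what the grouped self-join returns, here is a minimal sketch using Python's sqlite3 module with invented sample rows (the data is an assumption, not taken from the question). Every row of the table is kept, and each one carries the id of the first record for its user.

```python
import sqlite3

# In-memory database with hypothetical sample data (not from the question).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE views (id INTEGER PRIMARY KEY, userid INTEGER NOT NULL)")
conn.executemany(
    "INSERT INTO views (id, userid) VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 10), (4, 20), (5, 30)],
)

# Self-join on userid, then group by the v1 row so MIN(v2.id) is computed per row.
rows_with_entry = conn.execute(
    """
    SELECT MIN(v2.id) AS entryid, v1.id, v1.userid
    FROM views v1
    INNER JOIN views v2 ON v1.userid = v2.userid
    GROUP BY v1.id, v1.userid
    ORDER BY v1.id
    """
).fetchall()
for entryid, row_id, userid in rows_with_entry:
    print(entryid, row_id, userid)
```

Every additional non-aggregated column you select from v1 also has to appear in the GROUP BY list, which is why this approach gets unwieldy as the query grows.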

 SELECT v1.id, v1.userid FROM views v1 ORDER BY v1.userid, v1.id 

The first row for each distinct userid is your entry point. I think I understand why you want to do it the way you described, and the first query I gave will be quite efficient, but you will have to consider whether the ORDER BY clause alone is enough to get the answer you need.


Edit 1: as pointed out in the comments, this solution also uses a subquery. However, it does not use aggregate functions, which (depending on the database) can have a huge impact on performance.

It can be done without a subquery (see below). Obviously, an index on views.userid matters for performance.

 SELECT v1.* FROM views v1 WHERE v1.id = ( SELECT TOP 1 v2.id FROM views v2 WHERE v2.userid = v1.userid ORDER BY v2.id ASC ) 
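A hedged sketch of the same correlated-subquery pattern, run through Python's sqlite3 module. SQLite has no TOP, so LIMIT 1 is swapped in here as the equivalent; the sample rows and the index name ix_views_userid are assumptions for illustration.

```python
import sqlite3

# In-memory database with hypothetical sample data (not from the question).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE views (id INTEGER PRIMARY KEY, userid INTEGER NOT NULL)")
conn.executemany(
    "INSERT INTO views (id, userid) VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 10), (4, 20), (5, 30)],
)
# Index on userid, as the answer recommends; the index name is an assumption.
conn.execute("CREATE INDEX ix_views_userid ON views (userid)")

# For each row, look up the lowest id of the same user and keep the row if it matches.
# LIMIT 1 plays the role of SQL Server's TOP 1.
first_rows = conn.execute(
    """
    SELECT v1.id, v1.userid
    FROM views v1
    WHERE v1.id = (SELECT v2.id
                   FROM views v2
                   WHERE v2.userid = v1.userid
                   ORDER BY v2.id ASC
                   LIMIT 1)
    ORDER BY v1.userid
    """
).fetchall()
print(first_rows)
```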