SQL query: a join that returns only the first two records of the joined table

I have two tables:

Patient

  • pkPatientId
  • Firstname
  • Surname

PatientStatus

  • pkPatientStatusId
  • fkPatientId
  • Statuscode
  • Startdate
  • EndDate

Patient → PatientStatus: a one-to-many relationship.

I am wondering whether it is possible in SQL to write a join that returns only the first two PatientStatus records for each patient. If a patient has only one PatientStatus record, that patient should not appear in the results.

The plain join for my query:

    SELECT *
    FROM Patient p
    INNER JOIN PatientStatus ps ON p.pkPatientId = ps.fkPatientId
    ORDER BY ps.fkPatientId, ps.StartDate
sql join sql-server
8 answers

A CTE is probably best if you are on SQL Server 2005 or later, but if you need something more portable across platforms, this should work:

    SELECT
        P.pkPatientID,
        P.FirstName,
        P.Surname,
        PS1.StatusCode AS FirstStatusCode,
        PS1.StartDate AS FirstStatusStartDate,
        PS1.EndDate AS FirstStatusEndDate,
        PS2.StatusCode AS SecondStatusCode,
        PS2.StartDate AS SecondStatusStartDate,
        PS2.EndDate AS SecondStatusEndDate
    FROM Patient P
    INNER JOIN PatientStatus PS1
        ON PS1.fkPatientID = P.pkPatientID
    INNER JOIN PatientStatus PS2
        ON PS2.fkPatientID = P.pkPatientID
        AND PS2.StartDate > PS1.StartDate
    -- PS3 finds any status earlier than PS1; none may exist
    LEFT OUTER JOIN PatientStatus PS3
        ON PS3.fkPatientID = P.pkPatientID
        AND PS3.StartDate < PS1.StartDate
    -- PS4 finds any status between PS1 and PS2; none may exist
    LEFT OUTER JOIN PatientStatus PS4
        ON PS4.fkPatientID = P.pkPatientID
        AND PS4.StartDate > PS1.StartDate
        AND PS4.StartDate < PS2.StartDate
    WHERE PS3.pkPatientStatusID IS NULL
      AND PS4.pkPatientStatusID IS NULL

It seems a little strange to me that you want the first two statuses rather than the last two, but I assume you know what you need.

You can also use WHERE NOT EXISTS instead of the PS3 and PS4 joins if that gives you better performance.


Here is my attempt. Because it uses a common table expression, it should work on SQL Server 2005 and SQL Server 2008 (tested on SQL Server 2008):

    WITH CTE AS (
        SELECT
            fkPatientId,
            StatusCode,
            -- add more columns here
            -- number each patient's statuses from the earliest StartDate
            ROW_NUMBER() OVER (PARTITION BY fkPatientId ORDER BY StartDate) AS [Row_Number]
        FROM PatientStatus
        WHERE fkPatientId IN (
            -- only patients with at least two status records
            SELECT fkPatientId
            FROM PatientStatus
            GROUP BY fkPatientId
            HAVING COUNT(*) >= 2
        )
    )
    SELECT p.pkPatientId, p.FirstName, CTE.StatusCode
    FROM [Patient] AS p
    INNER JOIN CTE ON p.[pkPatientId] = CTE.fkPatientId
    WHERE CTE.[Row_Number] = 1 OR CTE.[Row_Number] = 2
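To try the row-numbering idea without a SQL Server instance, here is a minimal sketch that runs it against an in-memory SQLite database through Python's sqlite3 module. It is a stand-in under stated assumptions, not the answer's exact query: the sample rows and the helper names `build_sample_db` and `first_two_statuses` are made up for illustration, `COUNT(*) OVER` replaces the `GROUP BY ... HAVING` subquery, "first" is assumed to mean earliest StartDate (with the primary key as a tie-breaker), and SQLite 3.25 or later is assumed for window functions.

```python
import sqlite3


def build_sample_db():
    """Create an in-memory database with made-up sample rows (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Patient (
            pkPatientId INTEGER PRIMARY KEY,
            Firstname   TEXT,
            Surname     TEXT
        );
        CREATE TABLE PatientStatus (
            pkPatientStatusId INTEGER PRIMARY KEY,
            fkPatientId       INTEGER REFERENCES Patient (pkPatientId),
            StatusCode        TEXT,
            StartDate         TEXT,
            EndDate           TEXT
        );
        INSERT INTO Patient VALUES
            (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray'), (3, 'Cy', 'Orr');
        INSERT INTO PatientStatus VALUES
            (1, 1, 'A', '2020-01-01', '2020-01-31'),
            (2, 1, 'B', '2020-02-01', '2020-02-29'),
            (3, 1, 'C', '2020-03-01', NULL),
            (4, 2, 'A', '2020-05-01', NULL),
            (5, 3, 'X', '2021-01-01', '2021-02-28'),
            (6, 3, 'Y', '2021-03-01', NULL);
        """
    )
    return conn


def first_two_statuses(conn):
    """Return the two earliest status rows of every patient that has at least two."""
    query = """
        WITH ranked AS (
            SELECT fkPatientId, StatusCode, StartDate,
                   -- position of the row within its patient, earliest first
                   ROW_NUMBER() OVER (PARTITION BY fkPatientId
                                      ORDER BY StartDate, pkPatientStatusId) AS rn,
                   -- total number of status rows for the patient
                   COUNT(*) OVER (PARTITION BY fkPatientId) AS cnt
            FROM PatientStatus
        )
        SELECT p.pkPatientId, p.Firstname, r.StatusCode, r.StartDate
        FROM Patient AS p
        INNER JOIN ranked AS r ON r.fkPatientId = p.pkPatientId
        WHERE r.rn <= 2 AND r.cnt >= 2
        ORDER BY p.pkPatientId, r.rn
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in first_two_statuses(build_sample_db()):
        print(row)
```

On this sample data, patients 1 and 3 each contribute their two earliest statuses, and patient 2, who has a single status, is left out.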

EDIT: Both of the following solutions require PatientStatus.StartDate to be unique within each patient.

The traditional way (compatible with SQL Server 2000):

    SELECT
        p.pkPatientId,
        p.FirstName,
        p.Surname,
        ps.StatusCode,
        ps.StartDate,
        ps.EndDate
    FROM Patient p
    INNER JOIN PatientStatus ps
        ON p.pkPatientId = ps.fkPatientId
        AND ps.StartDate IN (
            SELECT TOP 2 StartDate
            FROM PatientStatus
            WHERE fkPatientId = ps.fkPatientId
            ORDER BY StartDate /* DESC (to switch between first/last records) */
        )
    WHERE EXISTS (
        SELECT 1
        FROM PatientStatus
        WHERE fkPatientId = p.pkPatientId
        GROUP BY fkPatientId
        HAVING COUNT(*) >= 2
    )
    ORDER BY ps.fkPatientId, ps.StartDate

A more interesting alternative (you would need to test how well it performs by comparison):

    SELECT
        p.pkPatientId,
        p.FirstName,
        p.Surname,
        ps.StatusCode,
        ps.StartDate,
        ps.EndDate
    FROM Patient p
    INNER JOIN PatientStatus ps ON p.pkPatientId = ps.fkPatientId
    WHERE
        /* the "2" is the maximum number of rows returned */
        2 > (
            SELECT COUNT(*)
            FROM Patient p_i
            INNER JOIN PatientStatus ps_i ON p_i.pkPatientId = ps_i.fkPatientId
            WHERE ps_i.fkPatientId = ps.fkPatientId
              AND ps_i.StartDate < ps.StartDate
              /* switch between "<" and ">" to get the first/last rows */
        )
        AND EXISTS (
            SELECT 1
            FROM PatientStatus
            WHERE fkPatientId = p.pkPatientId
            GROUP BY fkPatientId
            HAVING COUNT(*) >= 2
        )
    ORDER BY ps.fkPatientId, ps.StartDate

Note: for MySQL, the last query may be the only option, as long as LIMIT is not supported in subqueries.

EDIT: I added a condition that excludes patients with only one PatientStatus. (Thanks for the tip, Ryan!)
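The counting condition in the second query can be read as "a row is among the first two if fewer than two rows of the same patient start earlier". The sketch below exercises that idea on SQLite through Python's sqlite3 module. It is a simplified illustration, not the query above verbatim: the helper name `first_two_by_counting`, the three-column table, and the sample rows are assumptions, and the "at least two records" check is written as a second count instead of EXISTS.

```python
import sqlite3

COUNTING_QUERY = """
    SELECT ps.fkPatientId, ps.StatusCode, ps.StartDate
    FROM PatientStatus AS ps
    WHERE 2 > (                 -- fewer than two earlier rows for this patient
            SELECT COUNT(*)
            FROM PatientStatus AS ps_i
            WHERE ps_i.fkPatientId = ps.fkPatientId
              AND ps_i.StartDate < ps.StartDate
          )
      AND 2 <= (                -- the patient has at least two status rows
            SELECT COUNT(*)
            FROM PatientStatus AS ps_c
            WHERE ps_c.fkPatientId = ps.fkPatientId
          )
    ORDER BY ps.fkPatientId, ps.StartDate
"""


def first_two_by_counting(rows):
    """Load (fkPatientId, StatusCode, StartDate) tuples and run the counting query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE PatientStatus ("
        "fkPatientId INTEGER, StatusCode TEXT, StartDate TEXT)"
    )
    conn.executemany("INSERT INTO PatientStatus VALUES (?, ?, ?)", rows)
    return conn.execute(COUNTING_QUERY).fetchall()


if __name__ == "__main__":
    sample = [  # hypothetical data: patient 1 has three statuses, patient 2 has one
        (1, "A", "2020-01-01"),
        (1, "B", "2020-02-01"),
        (1, "C", "2020-03-01"),
        (2, "A", "2020-05-01"),
    ]
    for row in first_two_by_counting(sample):
        print(row)
```

With distinct start dates this yields two rows for patient 1 and none for patient 2. If two statuses of a patient share a StartDate, the count of earlier rows is the same for both, so more than two rows can come back, which is why StartDate has to be unique within each patient.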


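I have not tried this, but it might work:

    SELECT *
    FROM (
        SELECT
            p.pkPatientId, /* your other select columns here */
            -- restart the numbering for every patient
            ROW_NUMBER() OVER (PARTITION BY ps.fkPatientId ORDER BY ps.StartDate) AS rownumber
        FROM Patient p
        INNER JOIN PatientStatus ps ON p.pkPatientId = ps.fkPatientId
    ) AS numbered
    -- the alias can only be filtered on from the outer query
    WHERE rownumber BETWEEN 1 AND 2

If this does not work, see this link.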


Adding this WHERE clause to the outer query of Tomalak's first solution will exclude patients with fewer than two status records. You can also "AND" it into the WHERE clause of the second query for the same result.

    WHERE pkPatientId IN (
        SELECT pkPatientID
        FROM Patient
        JOIN PatientStatus ON pkPatientId = fkPatientId
        GROUP BY pkPatientID
        HAVING COUNT(*) >= 2
    )

Check whether your server supports window functions together with the QUALIFY clause (Teradata does; SQL Server does not):

    SELECT *
    FROM Patient p
    LEFT JOIN PatientStatus ps ON p.pkPatientId = ps.fkPatientId
    QUALIFY ROW_NUMBER() OVER (PARTITION BY ps.fkPatientId ORDER BY ps.StartDate) < 3

Another option that should work on SQL Server 2005:

    SELECT *
    FROM Patient p
    LEFT JOIN (
        SELECT
            *,
            ROW_NUMBER() OVER (PARTITION BY fkPatientId ORDER BY StartDate) AS rn
        FROM PatientStatus
    ) ps
        ON p.pkPatientId = ps.fkPatientID
        AND ps.rn < 3

Here is how I would approach this:

    -- Patients with at least 2 status records
    with PatientsWithEnoughRecords as (
        select fkPatientId
        from PatientStatus as ps
        group by fkPatientId
        having count(*) >= 2
    )
    select ps.*
    from PatientsWithEnoughRecords as er
    -- CROSS APPLY makes "top 2" apply per patient, not to the whole result
    cross apply (
        select top 2 *
        from PatientStatus
        where fkPatientId = er.fkPatientId
        order by StartDate asc
    ) as ps

I'm not sure what defines the "first" two status records in your case, so I assumed that you want the two earliest StartDates. Change the final order by clause to get the records you are interested in.

Edit: SQL Server 2000 does not support CTEs, so this solution only works on SQL Server 2005 and later.


Ugly, but it does not depend on StartDate being unique, and it works on SQL Server 2000:

    select *
    from Patient p
    join PatientStatus ps on p.pkPatientId = ps.fkPatientId
    where pkPatientStatusId in (
        select top 2 pkPatientStatusId
        from PatientStatus
        where fkPatientId = ps.fkPatientId
        order by StartDate
    )
    and pkPatientId in (
        select fkPatientId
        from PatientStatus
        group by fkPatientId
        having count(*) >= 2
    )
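To see why selecting by primary key tolerates duplicate start dates, here is a small sketch of the same pattern on SQLite via Python's sqlite3 module. It rests on several assumptions: SQLite's `LIMIT 2` stands in for SQL Server's `TOP 2`, the primary key is added to the ORDER BY as a tie-breaker so the result is deterministic, and the helper name `first_two_by_key`, the reduced table, and the sample rows are invented for the example.

```python
import sqlite3

KEY_BASED_QUERY = """
    SELECT ps.fkPatientId, ps.pkPatientStatusId, ps.StartDate
    FROM PatientStatus AS ps
    WHERE ps.pkPatientStatusId IN (
            SELECT pkPatientStatusId
            FROM PatientStatus
            WHERE fkPatientId = ps.fkPatientId
            ORDER BY StartDate, pkPatientStatusId  -- key as tie-breaker (an addition)
            LIMIT 2                                -- SQLite counterpart of TOP 2
          )
      AND ps.fkPatientId IN (                      -- patients with two or more rows
            SELECT fkPatientId
            FROM PatientStatus
            GROUP BY fkPatientId
            HAVING COUNT(*) >= 2
          )
    ORDER BY ps.fkPatientId, ps.StartDate, ps.pkPatientStatusId
"""


def first_two_by_key(rows):
    """Load (pkPatientStatusId, fkPatientId, StartDate) tuples and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE PatientStatus ("
        "pkPatientStatusId INTEGER PRIMARY KEY, "
        "fkPatientId INTEGER, StartDate TEXT)"
    )
    conn.executemany("INSERT INTO PatientStatus VALUES (?, ?, ?)", rows)
    return conn.execute(KEY_BASED_QUERY).fetchall()


if __name__ == "__main__":
    sample = [  # hypothetical data: all three statuses of patient 1 share a date
        (1, 1, "2020-01-01"),
        (2, 1, "2020-01-01"),
        (3, 1, "2020-01-01"),
        (4, 2, "2020-05-01"),
    ]
    for row in first_two_by_key(sample):
        print(row)
```

Because the IN list holds at most two key values per patient, at most two rows per patient survive even when every StartDate is identical.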