How do I group records with a time difference of more than an hour?

I am new to this site, so please bear with me.

I am trying to GROUP BY some data using SQL Server.

Here is the data:

    Computer   VisitDate
    ComputerA  2012-04-28 09:00:00
    ComputerA  2012-04-28 09:05:00
    ComputerA  2012-04-28 09:10:00
    ComputerB  2012-04-28 09:30:00
    ComputerB  2012-04-28 09:32:00
    ComputerB  2012-04-28 09:44:00
    ComputerB  2012-04-28 09:56:00
    ComputerB  2012-04-28 10:25:00
    ComputerA  2012-04-28 12:25:00
    ComputerC  2012-04-28 12:30:00
    ComputerC  2012-04-28 12:35:00
    ComputerC  2012-04-28 12:45:00
    ComputerC  2012-04-28 12:55:00

What I'm trying to achieve is to group the data by computer, but also to start a new group whenever the gap between two visits from the same computer is more than 1 hour. Here is the result I'm trying to get:

    Computer   VisitDate
    ComputerA  2012-04-28 09:00:00
    ComputerB  2012-04-28 09:30:00
    ComputerA  2012-04-28 12:25:00
    ComputerC  2012-04-28 12:30:00

So ComputerA is displayed twice because it visited at 09:10:00 and then again at 12:25:00, which is a difference of more than 1 hour.

It's easy to "GROUP BY Computer", but for the other part I don't know where to start. Any help on this issue would be greatly appreciated.

+8
sql sql-server group-by
4 answers

You cannot do this with a simple GROUP BY. That clause groups rows by the values of the listed columns - for example, you can group by computer name - but you cannot add extra logic to the grouping, such as "the time difference must be more than one hour".

What you can do - if you are using SQL Server 2005 or later (you did not mention the version in your question) - is use CTEs (Common Table Expressions). They give you a way to slice up your data.

Here I do a few things - first I "partition" the data by ComputerName, order it by VisitDate, and use ROW_NUMBER() to get a sequential number within each partition. Then the second CTE determines the "first" entry for each computer - the one with row number = 1 - and the third finally computes the difference in VisitDate between each record and the record with row number = 1. From this third CTE, I finally select those records that either have row number = 1 (the first for each partition) or differ from that first record by 60 minutes or more.

Here is the code:

    ;WITH Computers AS
    (
        SELECT
            ComputerName, VisitDate,
            RN = ROW_NUMBER() OVER(PARTITION BY ComputerName ORDER BY VisitDate)
        FROM dbo.YourComputerTable
    ),
    FirstComputers AS
    (
        SELECT ComputerName, VisitDate
        FROM Computers
        WHERE RN = 1
    ),
    SelectedComputers AS
    (
        SELECT
            c.ComputerName, c.VisitDate, c.RN,
            DiffToFirst = ABS(DATEDIFF(MINUTE, c.VisitDate, fc.VisitDate))
        FROM Computers c
        INNER JOIN FirstComputers fc ON c.ComputerName = fc.ComputerName
    )
    SELECT *
    FROM SelectedComputers
    WHERE RN = 1 OR DiffToFirst >= 60
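To check by hand which rows this filter keeps, here is a rough Python sketch of the same rule (first row per computer, plus any row at least 60 minutes after that first row). It is not part of the answer; the helper `t` and the function name `first_or_far_from_first` are made up for illustration, and the 12:30 visit in the second call is a hypothetical row that is not in the question's data.

```python
from datetime import datetime, timedelta
from itertools import groupby


def t(hhmm):
    """Hypothetical helper: a time on 2012-04-28, the date used in the question."""
    return datetime.strptime("2012-04-28 " + hhmm, "%Y-%m-%d %H:%M")


def first_or_far_from_first(visits, minutes=60):
    """Keep each computer's first visit plus every visit >= `minutes` after it."""
    kept = []
    ordered = sorted(visits)  # by computer, then visit time (PARTITION BY / ORDER BY)
    for computer, rows in groupby(ordered, key=lambda v: v[0]):
        rows = list(rows)
        first = rows[0][1]  # the RN = 1 row for this computer
        for rn, (_, when) in enumerate(rows, start=1):
            # timedelta measures elapsed time, DATEDIFF counts minute boundaries;
            # the two agree for whole-minute timestamps like these.
            if rn == 1 or when - first >= timedelta(minutes=minutes):
                kept.append((computer, when))
    return kept


question_rows = [("ComputerA", t(x)) for x in ("09:00", "09:05", "09:10", "12:25")]
print(first_or_far_from_first(question_rows))

# Hypothetical extra visit at 12:30, shortly after the 12:25 one.
extra = question_rows + [("ComputerA", t("12:30"))]
print(first_or_far_from_first(extra))
```

With only the question's ComputerA rows this keeps 09:00 and 12:25. With the extra 12:30 row, that row is kept as well, because it is also more than 60 minutes after 09:00. If your data can contain several visits after a long gap, comparing each visit with the previous one may be closer to what the question asks for.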
+3

If you have upgraded to SQL Server 2012, you can use LAG for this.

    with Lagged as (
        select
            Computer,
            VisitDate,
            LAG(VisitDate, 1) over (
                partition by Computer
                order by VisitDate
            ) as LastVisit
        from @Visit
    )
    select Computer, VisitDate
    from Lagged
    where LastVisit is null
       or VisitDate > dateadd(hour, 1, LastVisit);

SQL Fiddle here.
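As a sanity check outside the database, the rule behind the LAG query (a row starts a new group when the same computer has no previous visit, or the previous visit is more than an hour earlier) can be sketched in Python roughly like this. The names `RAW`, `visits` and `session_starts` are made up for the sketch; the data is the sample from the question.

```python
from datetime import datetime, timedelta

# Sample data from the question, all on 2012-04-28.
RAW = {
    "ComputerA": ["09:00", "09:05", "09:10", "12:25"],
    "ComputerB": ["09:30", "09:32", "09:44", "09:56", "10:25"],
    "ComputerC": ["12:30", "12:35", "12:45", "12:55"],
}
visits = [
    (computer, datetime.strptime("2012-04-28 " + hm, "%Y-%m-%d %H:%M"))
    for computer, times in RAW.items()
    for hm in times
]


def session_starts(rows, gap=timedelta(hours=1)):
    """Return rows whose previous visit (same computer) is missing or > gap ago."""
    starts = []
    last_seen = {}  # computer -> previous visit time, i.e. the LAG value
    for computer, when in sorted(rows):  # partition by computer, order by time
        last = last_seen.get(computer)
        if last is None or when > last + gap:
            starts.append((computer, when))
        last_seen[computer] = when
    return sorted(starts, key=lambda r: r[1])  # display in time order


for computer, when in session_starts(visits):
    print(computer, when)
```

On the question's data this yields ComputerA 09:00, ComputerB 09:30, ComputerA 12:25 and ComputerC 12:30, which matches the result the question asks for.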

+2

This solution is based on a recursive CTE. You can find an online demo here.

    WITH CteBase
    AS
    (
        SELECT  v.Computer,
                v.VisitDate,
                ROW_NUMBER() OVER(PARTITION BY v.Computer ORDER BY v.VisitDate) AS RowNum
        FROM    @Visit v
    ),
    CteRecursive
    AS
    (
        SELECT  crt.Computer,
                crt.VisitDate,
                crt.VisitDate AS GroupStartVisitDate,
                crt.RowNum,
                1 AS ComputerVisitRowNum
        FROM    CteBase crt
        WHERE   crt.RowNum = 1
        UNION ALL
        SELECT  crt.Computer,
                crt.VisitDate,
                CASE
                    WHEN DATEDIFF(MINUTE, prv.GroupStartVisitDate, crt.VisitDate) <= 60
                    THEN prv.GroupStartVisitDate
                    ELSE crt.VisitDate
                END,
                crt.RowNum,
                CASE
                    WHEN DATEDIFF(MINUTE, prv.GroupStartVisitDate, crt.VisitDate) <= 60
                    THEN prv.ComputerVisitRowNum + 1
                    ELSE 1
                END
        FROM    CteBase crt
        INNER JOIN CteRecursive prv
                ON  crt.Computer = prv.Computer
                AND crt.RowNum = prv.RowNum + 1
    )
    SELECT  r.Computer,
            r.GroupStartVisitDate
    FROM    CteRecursive r
    WHERE   r.ComputerVisitRowNum = 1;

Results:

    Computer             GroupStartVisitDate
    -------------------- -----------------------
    ComputerA            2012-04-28 09:00:00.000
    ComputerB            2012-04-28 09:30:00.000
    ComputerC            2012-04-28 12:30:00.000
    ComputerA            2012-04-28 12:25:00.000
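Note that the recursive query measures each visit against the start of its current group, not against the previous visit. A minimal Python sketch of that rule, with made-up names (`t`, `anchored_group_starts`) and a hypothetical set of visits 40 minutes apart, shows what that means in practice:

```python
from datetime import datetime, timedelta


def t(hhmm):
    """Hypothetical helper: a time on 2012-04-28."""
    return datetime.strptime("2012-04-28 " + hhmm, "%Y-%m-%d %H:%M")


def anchored_group_starts(visits, window_minutes=60):
    """Start a new group when a visit is > window_minutes after the group's start."""
    starts = []
    group_start = {}  # computer -> start time of its current group
    for computer, when in sorted(visits):  # per computer, in time order
        start = group_start.get(computer)
        if start is None or when - start > timedelta(minutes=window_minutes):
            group_start[computer] = when  # this visit opens a new group
            starts.append((computer, when))
    return starts


# Hypothetical data: three visits, each 40 minutes after the previous one.
steady = [("ComputerA", t("09:00")), ("ComputerA", t("09:40")), ("ComputerA", t("10:20"))]
print(anchored_group_starts(steady))
```

Here 10:20 opens a second group because it is 80 minutes after the group start at 09:00, even though no two consecutive visits are more than an hour apart. A rule based on the gap to the previous visit would report only 09:00 for the same rows. On the question's sample data both rules give the same four rows.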

If you have any questions, feel free to ask.

+1

A CTE to show all computers with at least one visit, plus the visits before and after gaps > 60 minutes.

    create table compVisits (Computer varchar(20), VisitDate datetime)
    go

    insert into compVisits values
      ('ComputerA', '2012-04-28 09:00:00')
    , ('ComputerA', '2012-04-28 09:05:00')
    , ('ComputerA', '2012-04-28 09:10:00')
    , ('ComputerB', '2012-04-28 09:30:00')
    , ('ComputerB', '2012-04-28 09:32:00')
    , ('ComputerB', '2012-04-28 09:44:00')
    , ('ComputerB', '2012-04-28 09:56:00')
    , ('ComputerB', '2012-04-28 10:25:00')
    , ('ComputerA', '2012-04-28 12:25:00')
    , ('ComputerC', '2012-04-28 12:30:00')
    , ('ComputerC', '2012-04-28 12:35:00')
    , ('ComputerC', '2012-04-28 12:45:00')
    , ('ComputerC', '2012-04-28 12:55:00')

    ;WITH a as (
        --Initial row count
        select *, r = ROW_NUMBER() OVER(PARTITION BY Computer ORDER BY VisitDate)
        FROM compVisits
    )
    , b as (
        -- gaps >60 minutes
        SELECT a1.Computer, a1.VisitDate
        FROM a a1
        INNER JOIN a a2
            ON a1.Computer = a2.Computer
            AND (a1.r + 1) = a2.r
            AND DATEDIFF(MINUTE, a1.VisitDate, a2.VisitDate) > 60
        UNION
        SELECT a2.Computer, a2.VisitDate
        FROM a a1
        INNER JOIN a a2
            ON a1.Computer = a2.Computer
            AND (a1.r + 1) = a2.r
            AND DATEDIFF(MINUTE, a1.VisitDate, a2.VisitDate) > 60
    )
    -- at least one visit
    SELECT a1.Computer, a1.VisitDate
    FROM a a1
    WHERE r = 1
        AND NOT EXISTS(SELECT 1 FROM b WHERE Computer = a1.Computer)
    UNION
    -- gaps >60 minutes
    SELECT * FROM b
    ORDER BY VisitDate

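To see which rows this query keeps for the sample data, here is a rough Python sketch of the same filtering: the rows on either side of every gap longer than 60 minutes, plus the first visit of any computer that has no such gap. The names `RAW`, `visits` and `rows_around_gaps` are made up for the sketch and are not part of the answer.

```python
from datetime import datetime, timedelta

# Sample data from the question, all on 2012-04-28.
RAW = {
    "ComputerA": ["09:00", "09:05", "09:10", "12:25"],
    "ComputerB": ["09:30", "09:32", "09:44", "09:56", "10:25"],
    "ComputerC": ["12:30", "12:35", "12:45", "12:55"],
}
visits = [
    (computer, datetime.strptime("2012-04-28 " + hm, "%Y-%m-%d %H:%M"))
    for computer, times in RAW.items()
    for hm in times
]


def rows_around_gaps(rows, minutes=60):
    """Rows before/after each gap > minutes, else the computer's first row."""
    by_computer = {}
    for computer, when in sorted(rows):
        by_computer.setdefault(computer, []).append(when)

    out = set()  # a set, like UNION, removes duplicate rows
    for computer, times in by_computer.items():
        gap_rows = set()
        for prev, cur in zip(times, times[1:]):  # consecutive visits (r and r + 1)
            if cur - prev > timedelta(minutes=minutes):
                gap_rows.update({prev, cur})
        if gap_rows:
            out.update((computer, w) for w in gap_rows)
        else:
            out.add((computer, times[0]))  # r = 1 and no row in b
    return sorted(out, key=lambda r: r[1])  # ORDER BY VisitDate


for computer, when in rows_around_gaps(visits):
    print(computer, when)
```

For the sample data this gives ComputerA 09:10, ComputerB 09:30, ComputerA 12:25 and ComputerC 12:30. That differs slightly from the output requested in the question, which lists ComputerA at 09:00 (the start of its first group) rather than 09:10 (the last visit before the gap).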

0