Can I do this in a SQL function without a cursor?

I am working on a timesheet database. In simple terms, the TimesheetEntries table has four columns:

    ID          int (identity, 1, 1)
    StaffID     int
    ClockedIn   datetime
    ClockedOut  datetime

I have been asked to write a report showing staff attendance over a range of dates. The user enters the dates, and the report shows the clock-in and clock-out times of every staff member who attended, along with their duration on site.

However, and this is where it gets tricky, staff sometimes clock out to leave the site for short periods, and the report should ignore those absences (when they leave the site for less than 2 hours).

So suppose the following entries

    ID  StaffID  ClockedIn  ClockedOut
    1   4        0900       1200
    2   4        1330       1730
    3   5        0900       1200
    4   5        1409       1730
    5   4        1830       1930

Report Output SHOULD be

    StaffID  ClockedIn  ClockedOut
    4        0900       1930
    5        0900       1200
    5        1409       1730

Is there a way to do this without a cursor, or even a cursor nested inside a cursor (which is where I am right now!)? We are not talking about huge datasets here, and performance is not really an issue (this is a report, not a production system), but I really dislike cursors and would rather avoid them.



5 answers

I used the sample data from Jeremy's answer, but went about the problem in a completely different way. This uses a recursive CTE, which I believe requires SQL Server 2005 or later. It reports the results accurately (I believe), and also reports the number of clock-ins recorded during the period and the total number of minutes off (which may be more than 120, because the restriction is only that each off-site period is less than two hours).

    declare @TimeSheetEntries table
    (
        ID int identity not null primary key,
        StaffID int not null,
        ClockedIn datetime not null,
        ClockedOut datetime not null
    );

    insert into @TimeSheetEntries ( StaffID, ClockedIn, ClockedOut )
    select 4, '2012-01-01 09:00:00', '2012-01-01 12:00:00' union all
    select 4, '2012-01-01 13:30:00', '2012-01-01 17:30:00' union all
    select 5, '2012-01-01 09:00:00', '2012-01-01 12:00:00' union all
    select 5, '2012-01-01 14:09:00', '2012-01-01 17:30:00' union all
    select 4, '2012-01-01 18:30:00', '2012-01-01 19:30:00';

    WITH ClockData AS
    (
        -- Anchor: entries that do not follow a clock-out within the previous two hours.
        SELECT ID, StaffID, ClockedIn, ClockedOut AS EffectiveClockout,
               1 AS NumClockIns, 0 AS MinutesOff
        FROM @TimeSheetEntries ts
        WHERE NOT EXISTS (SELECT ID
                          FROM @TimeSheetEntries tsWhere
                          WHERE tsWhere.StaffID = ts.StaffID -- only the same staff member counts
                            AND tsWhere.ClockedOut BETWEEN DATEADD(hour, -2, ts.ClockedIn) AND ts.ClockedIn)

        UNION ALL

        -- Recursive step: absorb the next entry if it starts within two hours of the current clock-out.
        SELECT cd.ID, cd.StaffID, cd.ClockedIn, ts.ClockedOut AS EffectiveClockout,
               cd.NumClockIns + 1 AS NumClockIns,
               cd.MinutesOff + DateDiff(minute, cd.EffectiveClockout, ts.ClockedIn) AS MinutesOff
        FROM @TimeSheetEntries ts
        INNER JOIN ClockData cd
            ON ts.StaffID = cd.StaffID
           AND ts.ClockedIn BETWEEN cd.EffectiveClockout AND dateadd(hour, 2, cd.EffectiveClockout)
    )
    -- Keep only the longest chain for each starting entry.
    SELECT *
    FROM ClockData cd
    WHERE NumClockIns = (SELECT MAX(NumClockIns) FROM ClockData WHERE ID = cd.ID)

This returns:

    ID  StaffID  ClockedIn                EffectiveClockout        NumClockIns  MinutesOff
    3   5        2012-01-01 09:00:00.000  2012-01-01 12:00:00.000  1            0
    4   5        2012-01-01 14:09:00.000  2012-01-01 17:30:00.000  1            0
    1   4        2012-01-01 09:00:00.000  2012-01-01 19:30:00.000  3            150


In case this is unclear, MinutesOff is just the "allowance" time, that is, the amount of time "eaten" between the ClockedIn and EffectiveClockout shown on the same row. StaffID 5 took 129 minutes off between clocked periods, but none of it counts as allowance time, so MinutesOff is 0 for both rows.
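To make the MinutesOff figures concrete, the arithmetic can be checked directly with DATEDIFF on the sample timestamps. This is only an illustrative check; the column aliases are made up for the example.

```sql
-- Breaks for StaffID 4 in the sample data: 12:00 -> 13:30 and 17:30 -> 18:30.
-- Both are under two hours, so they are absorbed and add up to MinutesOff.
select datediff(minute, '2012-01-01 12:00:00', '2012-01-01 13:30:00')
     + datediff(minute, '2012-01-01 17:30:00', '2012-01-01 18:30:00')
           as MinutesOffStaff4,   -- 90 + 60 = 150
       -- Single break for StaffID 5: 12:00 -> 14:09.
       -- It is longer than 120 minutes, so the two rows are not merged.
       datediff(minute, '2012-01-01 12:00:00', '2012-01-01 14:09:00')
           as BreakStaff5;        -- 129
```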


I'm sure there are less complicated ways to do this, but I was able to pull it off with a few CTEs:

    declare @TimeSheetEntries table
    (
        ID int identity not null primary key,
        StaffID int not null,
        ClockedIn datetime not null,
        ClockedOut datetime not null
    );

    insert into @TimeSheetEntries ( StaffID, ClockedIn, ClockedOut )
    select 4, '2012-01-01 09:00:00', '2012-01-01 12:00:00' union all
    select 4, '2012-01-01 13:30:00', '2012-01-01 17:30:00' union all
    select 5, '2012-01-01 09:00:00', '2012-01-01 12:00:00' union all
    select 5, '2012-01-01 14:09:00', '2012-01-01 17:30:00' union all
    select 4, '2012-01-01 18:30:00', '2012-01-01 19:30:00'
    ;

    with MultiCheckins as (
        select distinct
            StaffID,
            cast(cast(cast(ClockedIn as float) as int) as datetime) as TimeSheetDate,
            rank() over (
                partition by StaffID, cast(cast(cast(ClockedIn as float) as int) as datetime)
                order by ClockedIn
            ) as ordinal,
            ClockedIn,
            ClockedOut
        from @TimeSheetEntries
    ),
    Organized as (
        select
            row_number() over (
                order by mc.StaffID, mc.TimeSheetDate, mc.ClockedIn, mc.ClockedOut
            ) as RowID,
            mc.StaffID,
            mc.TimeSheetDate,
            case
                when datediff(hour, coalesce(mc3.ClockedOut, mc.ClockedIn), mc.ClockedIn) >= 2
                    then mc.ClockedIn
                else coalesce(mc3.ClockedIn, mc.ClockedIn)
            end as ClockedIn,
            case
                when datediff(hour, mc.ClockedOut, coalesce(mc2.ClockedIn, mc.ClockedOut)) < 2
                    then coalesce(mc2.ClockedOut, mc.ClockedOut)
                else mc.ClockedOut
            end as ClockedOut
        from MultiCheckins as mc
        left outer join MultiCheckins as mc3
            on mc3.StaffID = mc.StaffID
            and mc3.TimeSheetDate = mc.TimeSheetDate
            and mc3.ordinal = mc.ordinal - 1
        left outer join MultiCheckins as mc2
            on mc2.StaffID = mc.StaffID
            and mc2.TimeSheetDate = mc.TimeSheetDate
            and mc2.ordinal = mc.ordinal + 1
    )
    select distinct o.StaffID, o.ClockedIn, o.ClockedOut
    from Organized as o
    where not exists (
        select null
        from Organized as o2
        where o2.RowID <> o.RowID
            and o2.StaffID = o.StaffID
            and (
                o.ClockedIn between o2.ClockedIn and o2.ClockedOut
                and o.ClockedOut between o2.ClockedIn and o2.ClockedOut
            )
    )

Option 1: Insert the data into a temporary table, then use a left join to build the result table. (This works if staff can only clock in and out twice in the time frame; it will not work if someone has 3 entries.)

 select * from timesheet ts left join timesheet tss on = 

After that, you can simply get min and max or even get a more reliable report.

Option 2:

    create table #TimeTable (UserID int, InTime datetime, OutTime datetime)

    -- One row per staff member.
    insert into #TimeTable (UserID)
    select distinct StaffID from TimesheetEntries

    -- Earliest clock-in per staff member.
    update s
    set InTime = (select min(ClockedIn) from TimesheetEntries where StaffID = s.UserID)
    from #TimeTable s

    -- Latest clock-out per staff member.
    update s
    set OutTime = (select max(ClockedOut) from TimesheetEntries where StaffID = s.UserID)
    from #TimeTable s

Given more time, I would combine these into two quick queries, but three will work fine since performance is not a concern.


A set-based iterative approach:

    -- Sample data.
    declare @TimesheetEntries as Table ( Id Int Identity, StaffId Int, ClockIn DateTime, ClockOut DateTime )
    insert into @TimesheetEntries ( StaffId, ClockIn, ClockOut ) values
      ( 4, '2012-05-03 09:00', '2012-05-03 12:00' ),
      ( 4, '2012-05-03 13:30', '2012-05-03 17:30' ), -- This falls within 2 hours of the next two rows.
      ( 4, '2012-05-03 17:35', '2012-05-03 18:00' ),
      ( 4, '2012-05-03 19:00', '2012-05-03 19:30' ),
      ( 4, '2012-05-03 19:45', '2012-05-03 20:00' ),
      ( 5, '2012-05-03 09:00', '2012-05-03 12:00' ),
      ( 5, '2012-05-03 14:09', '2012-05-03 17:30' ),
      ( 6, '2012-05-03 09:00', '2012-05-03 12:00' ),
      ( 6, '2012-05-03 13:00', '2012-05-03 17:00' )
    select Id, StaffId, ClockIn, ClockOut from @TimesheetEntries

    -- Find all of the periods that need to be coalesced and start the process.
    declare @Bar as Table ( Id Int Identity, StaffId Int, ClockIn DateTime, ClockOut DateTime )
    insert into @Bar
      select TSl.StaffId, TSl.ClockIn, TSr.ClockOut
        from @TimesheetEntries as TSl inner join
          -- The same staff member and the end of the left period is within two hours of the start of the right period.
          @TimesheetEntries as TSr on TSr.StaffId = TSl.StaffId and
            DateDiff( ss, TSl.ClockOut, TSr.ClockIn ) between 0 and 7200

    -- Continue coalescing periods until we run out of work.
    declare @Changed as Bit = 1
    while @Changed = 1
      begin
      set @Changed = 0

      -- Coalesce periods.
      update Bl
        -- Take the later ClockOut time from the two rows.
        set ClockOut = case when Br.ClockOut >= Bl.ClockOut then Br.ClockOut else Bl.ClockOut end
        from @Bar as Bl inner join
          @Bar as Br on Br.StaffId = Bl.StaffId and
            -- The left row started before the right and either the right period is completely
            -- contained in the left or the right period starts within two hours of the end of the left.
            Bl.ClockIn < Br.ClockIn and
            ( Br.ClockOut <= Bl.ClockOut or DateDiff( ss, Bl.ClockOut, Br.ClockIn ) < 7200 )
      if @@RowCount > 0
        set @Changed = 1

      -- Delete rows where one period is completely contained in another.
      delete Br
        from @Bar as Bl inner join
          @Bar as Br on Br.StaffId = Bl.StaffId and
            ( ( Bl.ClockIn < Br.ClockIn and Br.ClockOut <= Bl.ClockOut ) or
              ( Bl.ClockIn <= Br.ClockIn and Br.ClockOut < Bl.ClockOut ) )
      if @@RowCount > 0
        set @Changed = 1
      end

    -- Return all of the coalesced periods ...
    select StaffId, ClockIn, ClockOut, 'Coalesced Periods' as [Type]
      from @Bar
    union all
    -- ... and all of the independent periods.
    select StaffId, ClockIn, ClockOut, 'Independent Period'
      from @TimesheetEntries as TS
      where not exists ( select 42 from @Bar where StaffId = TS.StaffId and ClockIn <= TS.ClockIn and TS.ClockOut <= ClockOut )
      order by ClockIn, StaffId

I am sure there are some optimizations that could be made.


I think you can do this quite easily using only a left join of the table back onto itself and a simple comparison. The following is not a complete implementation, but rather a proof of concept:

    create table #TimeSheetEntries
    (
        ID int identity not null primary key,
        StaffID int not null,
        ClockedIn datetime not null,
        ClockedOut datetime not null
    );

    insert into #TimeSheetEntries ( StaffID, ClockedIn, ClockedOut )
    select 4, '2012-01-01 09:00:00', '2012-01-01 12:00:00' union all
    select 4, '2012-01-01 13:30:00', '2012-01-01 17:30:00' union all
    select 5, '2012-01-01 09:00:00', '2012-01-01 12:00:00' union all
    select 5, '2012-01-01 14:09:00', '2012-01-01 17:30:00' union all
    select 4, '2012-01-01 18:30:00', '2012-01-01 19:30:00' union all
    select 4, '2012-01-01 18:30:00', '2012-01-01 19:30:00';

    select *
    from #timesheetentries tse1
    left outer join #timesheetentries tse2
        on tse1.staffid = tse2.staffid
        -- Join condition restored from the description below: tse2.ID is the
        -- highest ID below tse1.ID for the same staff member.
        and tse2.ID = (
            select MAX(ID)
            from #timesheetentries ts_max
            where ts_max.ID < tse1.ID
                and tse1.staffid = ts_max.staffid
        )
    outer apply (
        select DATEDIFF(minute, tse2.clockedout, tse1.clockedin) as BreakTime
    ) as breakCheck
    where BreakTime > 120
        or BreakTime < 0
        -- Assumed: the column tested here was lost in the source. tse2.ID is
        -- null exactly when the entry has no previous entry.
        or tse2.ID is null
    order by tse1.StaffID, tse1.ClockedIn
    GO

    drop table #timesheetentries
    GO

The idea here is that you have the original timesheet table, tse1, and then you left join it to the same timesheet table, aliased as tse2, matching rows where the StaffID is the same and tse2.ID is the highest ID value that is still less than tse1.ID. This is obviously poor form: you would probably want to use ROW_NUMBER(), partitioned by StaffID and ordered by ClockedIn / ClockedOut, instead of comparing IDs, since times could be entered out of chronological order.

At this point, each row from the joined tables contains the time data from the current timesheet entry as well as the one before it. This means we can compare ClockedIn / ClockedOut across consecutive entries, and using DATEDIFF() we can work out how much time the user spent away between the previous ClockedOut and the later ClockedIn. I used OUTER APPLY for this simply because it makes the code cleaner, but you could probably pack it into a subquery.

Once DATEDIFF() has been evaluated, it is trivial to find the cases where an individual BreakTime does not exceed the 120-minute barrier and filter those timesheet entries out, leaving only the meaningful rows that start a new period for each staff member, which you can then use in your later reporting.
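The ROW_NUMBER() suggestion above can be carried through to the final report roughly as follows. This is a minimal sketch rather than a tested implementation: it assumes SQL Server 2005 or later, reuses the sample rows from the question, and the names Ordered, Flagged, Grouped, rn, IsStart and PeriodNo are made up for the example. It also assumes, like the query above, that a break of more than 120 minutes starts a new period; adjust the comparison if exactly two hours should count differently.

```sql
-- Sample rows copied from the question.
declare @TimeSheetEntries table
(
    ID int identity not null primary key,
    StaffID int not null,
    ClockedIn datetime not null,
    ClockedOut datetime not null
);

insert into @TimeSheetEntries ( StaffID, ClockedIn, ClockedOut )
select 4, '2012-01-01 09:00:00', '2012-01-01 12:00:00' union all
select 4, '2012-01-01 13:30:00', '2012-01-01 17:30:00' union all
select 5, '2012-01-01 09:00:00', '2012-01-01 12:00:00' union all
select 5, '2012-01-01 14:09:00', '2012-01-01 17:30:00' union all
select 4, '2012-01-01 18:30:00', '2012-01-01 19:30:00';

with Ordered as (
    -- Number each staff member's entries chronologically instead of relying on ID.
    select StaffID, ClockedIn, ClockedOut,
           row_number() over (partition by StaffID
                              order by ClockedIn, ClockedOut) as rn
    from @TimeSheetEntries
),
Flagged as (
    -- IsStart = 1 when there is no previous entry or the break exceeds 120 minutes.
    select cur.StaffID, cur.ClockedIn, cur.ClockedOut, cur.rn,
           case when prev.rn is null
                  or datediff(minute, prev.ClockedOut, cur.ClockedIn) > 120
                then 1 else 0 end as IsStart
    from Ordered as cur
    left outer join Ordered as prev
        on prev.StaffID = cur.StaffID
        and prev.rn = cur.rn - 1
),
Grouped as (
    -- A running count of period starts gives every entry a period number.
    select f.StaffID, f.ClockedIn, f.ClockedOut,
           (select sum(f2.IsStart)
            from Flagged as f2
            where f2.StaffID = f.StaffID
              and f2.rn <= f.rn) as PeriodNo
    from Flagged as f
)
-- Collapse each period to its first clock-in and last clock-out.
select StaffID,
       min(ClockedIn)  as ClockedIn,
       max(ClockedOut) as ClockedOut
from Grouped
group by StaffID, PeriodNo
order by StaffID, min(ClockedIn);
```

On the sample data this should give one row for StaffID 4 (09:00 to 19:30) and two rows for StaffID 5 (09:00 to 12:00 and 14:09 to 17:30), which is the report the question asks for. The correlated subquery in Grouped is fine for small report-sized data; on SQL Server 2012 or later a windowed SUM() would be the more usual choice.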


