I have done something similar on occasion. Essentially, it is grouping based on separations within a complex ordering. The basics of the approach I use for this problem are as follows:
- Create a table of all time periods of interest.
- Find the starting time for each group of time periods of interest.
- Find the final time for each group of time periods of interest.
- Join the start and end times to the list of time ranges, and group.
Or, in more detail (each of these steps could be part of one big CTE, but I have broken them out into temp tables for readability):
Step 1: find the list of all time ranges of interest (I used a method similar to the one @Brad referenced). NOTE: as @Manfred Sorg pointed out, this assumes there are no "missing seconds" in a bus's data. If timestamps are missing, this code will interpret a single range as two (or more) distinct ranges.
    ;with stopSeconds as (
        select BusID, BusStopID, TimeStamp,
            [date] = cast(datediff(dd,0,TimeStamp) as datetime),
            [grp] = dateadd(ss, -row_number() over(partition by BusID order by TimeStamp), TimeStamp)
        from
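To see why the [grp] expression in Step 1 isolates runs of consecutive seconds, here is a small sketch you can run on its own. The table variable @seconds and its rows are made up for illustration, and the multi-row VALUES syntax assumes SQL Server 2008 or later.

```sql
-- Hypothetical sample data: one bus, two runs of consecutive seconds.
declare @seconds table (BusID int, TimeStamp datetime);

insert into @seconds (BusID, TimeStamp) values
    (1, '2011-05-01T08:00:00'),
    (1, '2011-05-01T08:00:01'),
    (1, '2011-05-01T08:00:02'),
    (1, '2011-05-01T08:05:00'),
    (1, '2011-05-01T08:05:01');

select BusID, TimeStamp,
    -- Row n has n seconds subtracted from it, so consecutive seconds
    -- collapse to the same value and any gap shifts the value.
    [grp] = dateadd(ss, -row_number() over(partition by BusID order by TimeStamp), TimeStamp)
from @seconds
order by TimeStamp;

-- The first three rows get grp = 2011-05-01 07:59:59,
-- the last two rows get grp = 2011-05-01 08:04:56.
```

Grouping by BusID and grp then gives one row per uninterrupted range. It also shows the caveat above: a single skipped second changes grp, so one stop would be split into two ranges.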
Step 2: find the earliest time for each stop
    select this.BusID, this.BusStopID, this.sTime minSTime,
        [stopOrder] = row_number() over(partition by this.BusID, this.BusStopID order by this.sTime)
    into #starts
    from #ranges this
    left join #ranges prev
        on this.BusID = prev.BusID
        and this.BusStopID = prev.BusStopID
        and this.sOrd = prev.sOrd+1
        and this.sTime between dateadd(mi,-10,prev.sTime) and dateadd(mi,10,prev.sTime)
    where prev.BusID is null
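The "left join ... where prev.BusID is null" pattern in Step 2 keeps only the ranges that have no close predecessor, i.e. the first range of each group. A minimal sketch of that idea in isolation follows. The table variable @ranges, its rows, and the meaning I give sOrd here (position of the range when ordered by start time) are assumptions for illustration, not the real #ranges definition.

```sql
-- Hypothetical stand-in for #ranges, with only the columns this step needs.
declare @ranges table (BusID int, BusStopID int, sTime datetime, sOrd int);

insert into @ranges (BusID, BusStopID, sTime, sOrd) values
    (1, 10, '2011-05-01T08:00:00', 1),
    (1, 10, '2011-05-01T08:03:00', 2),  -- within 10 minutes of the previous start
    (1, 10, '2011-05-01T17:00:00', 3),  -- hours later, so it opens a new group
    (1, 10, '2011-05-01T17:04:00', 4);  -- within 10 minutes of the previous start

select this.BusID, this.BusStopID, this.sTime
from @ranges this
left join @ranges prev
    on this.BusID = prev.BusID
    and this.BusStopID = prev.BusStopID
    and this.sOrd = prev.sOrd + 1
    and this.sTime between dateadd(mi,-10,prev.sTime) and dateadd(mi,10,prev.sTime)
-- No matching predecessor means this row starts a group.
where prev.BusID is null;

-- Returns the 08:00:00 and 17:00:00 rows.
```

Step 3 is the mirror image: it joins to the following range instead and keeps the rows that have no close successor.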
Step 3: find the last time for each stop
    select this.BusID, this.BusStopID, this.eTime maxETime,
        [stopOrder] = row_number() over(partition by this.BusID, this.BusStopID order by this.eTime)
    into #ends
    from #ranges this
    left join #ranges next
        on this.BusID = next.BusID
        and this.BusStopID = next.BusStopID
        and this.eOrd = next.eOrd-1
        and this.eTime between dateadd(mi,-10,next.eTime) and dateadd(mi,10,next.eTime)
    where next.BusID is null
Step 4: put it all together
    select r.BusID, r.BusStopID,
        [avgLengthOfStop] = avg(datediff(ss,r.sTime,r.eTime)),
        [earliestStop] = min(r.sTime),
        [latestDepart] = max(r.eTime)
    from #starts s
    join #ends e
        on s.BusID=e.BusID
        and s.BusStopID=e.BusStopID
        and s.stopOrder=e.stopOrder
    join #ranges r
        on r.BusID=s.BusID
        and r.BusStopID=s.BusStopID
        and r.sTime between s.minSTime and e.maxETime
        and r.eTime between s.minSTime and e.maxETime
    group by r.BusID, r.BusStopID, s.stopOrder
    having count(distinct r.date) > 1 --filters out the "noise"
Finally, for completeness, clean up:
    drop table #ends
    drop table #starts
    drop table #ranges