How can I combine overlapping times in a window?

I have a table like this:

CREATE TABLE #TEMP (Name VARCHAR(255), START_TIME datetime, END_TIME datetime); INSERT INTO #TEMP VALUES('John', '2012-01-01 09:00:01', '2012-01-01 12:00:02') INSERT INTO #TEMP VALUES('John', '2012-01-01 09:40:01', '2012-01-01 11:00:02') INSERT INTO #TEMP VALUES('John', '2012-01-02 05:00:01', '2012-01-02 05:15:02') INSERT INTO #TEMP VALUES('David', '2012-01-04 05:00:01', '2012-01-04 05:15:02') INSERT INTO #TEMP VALUES('David', '2012-01-05 07:01:01', '2012-01-05 15:15:02') SELECT * FROM #TEMP DROP TABLE #TEMP 

And the data:

  Name START_TIME END_TIME 1 John 2012-01-01 09:00:01.000 2012-01-01 12:00:02.000 2 John 2012-01-01 09:40:01.000 2012-01-01 11:00:02.000 3 John 2012-01-02 05:00:01.000 2012-01-02 05:15:02.000 4 David 2012-01-04 05:00:01.000 2012-01-04 05:15:02.000 5 David 2012-01-05 07:01:01.000 2012-01-05 08:15:02.000 

Given a number, say, 6, I'm trying to do GROUP BY in this table and a merge time that overlaps for 6 hours before and after. Therefore, in the above table, lines 1 and 2 will be combined into one line, because they contain an overlapping time range:

 John 2012-01-01 06:00:01.000 2012-01-01 18:00:02.000 

Lines 4 and 5 will be merged, because the subtraction of 6 hours from 07:01:01.000 falls into the window of line 4 .

Is there a good way to do this on a large table containing about a million rows?

+4
source share
1 answer

I think the best way to do this is to create a windows table and join the #temp table with this new window table:

1) Step 1, preparing a window table with all possible window breaks (contains overlapping windows):

  SELECT Name, dateadd(hour, -6, start_time) as start_w, dateadd(hour, +6, start_time) as end_w into #possible_windows FROM #TEMP 

2) Create an index in temp table to improve performance

  create index pw_idx on #possible_windows ( Name, start_w) 

3) Exclude overlapping windows in the choice of self-join. This is the reason for creating the index:

  select p2.* into #myWindows from #possible_windows p1 right outer join #possible_windows p2 on p1.name = p2.name and p2.start_w > p1.start_W and p2.start_w <= p1.end_w where p1.name is null 

4) Join your C # myWindows table or use it directly.

WORKERS:

 SELECT Name, dateadd(hour, -6, start_time) as start_w, dateadd(hour, +6, start_time) as end_w, ROW_NUMBER() over(partition by Name order by Name, dateadd(hour, -6, start_time) ) as rn into #possible_windows FROM #TEMP create index pw_idx on #possible_windows ( Name, start_w) select p2.* from #possible_windows p1 right outer join #possible_windows p2 on p1.name = p2.name and p2.start_w > p1.start_W and p2.start_w <= p1.end_w where p1.name is null 

Results:

 Name start_w end_w rn ----- ------------- ------------- -- David 2012-01-03 23:00:012012-01-04 11:00:011 David 2012-01-05 01:01:012012-01-05 13:01:012 John 2012-01-01 03:00:012012-01-01 15:00:011 John 2012-01-01 23:00:012012-01-02 11:00:013 

PE: Please come back with the performance test results!

+2
source

Source: https://habr.com/ru/post/1412125/


All Articles