How to consolidate the time_dt and expiration_dt sectors for consecutive records where some columns remain constant?

Does anyone have any suggestions on this issue? I am trying to use oracle SQL to consolidate the gaps from effective_dt to expiration_dt , where the values ​​in col_a , col_b and col_c remain constant, but only for sequential records where there are no changes in any of the 3 columns between them.

if this helps, we can safely assume that the effective date of the next record (by an employee) is equal to the previous record plus 1 day.

I tried with min() , max() and group by , but the problem in the following script will return 12 / 1-12 / 31. then I tried with the lead() function, but the problem is that I don’t know in advance how many records I will need to consolidate.

Suppose I can get the data in the form below:

 +----------+--------------+---------------+---------+---------+---------+ | employee | effective_dt | expiration_dt | col_a | col_b | col_c | +----------+--------------+---------------+---------+---------+---------+ | 0001 | 12/1/2012 | 12/4/2012 | value_a | value_a | value_a | | 0001 | 12/5/2012 | 12/6/2012 | value_a | value_a | value_a | | 0001 | 12/7/2012 | 12/10/2012 | value_a | value_a | value_a | | 0001 | 12/11/2012 | 12/17/2012 | value_a | value_b | value_a | | 0001 | 12/18/2012 | 12/31/2012 | value_a | value_a | value_a | +----------+--------------+---------------+---------+---------+---------+ 

Expected Result:

 +----------+--------------+---------------+---------+---------+---------+ | employee | effective_dt | expiration_dt | col_a | col_b | col_c | +----------+--------------+---------------+---------+---------+---------+ | 0001 | 12/1/2012 | 12/10/2012 | value_a | value_a | value_a | | 0001 | 12/11/2012 | 12/17/2012 | value_a | value_b | value_a | | 0001 | 12/18/2012 | 12/31/2012 | value_a | value_a | value_a | +----------+--------------+---------------+---------+---------+---------+ 

attempt 1:

 SELECT employee, MIN(effective_dt), MAX(expiration_dt), col_a, col_b, col_c FROM (SELECT employee, effective_dt, ... FROM table_x, table_y, ... where... ) table_a GROUP BY employee, col_a, col_b, col_c; 

attempt 2:

 SELECT employee, effective_dt, lead(expiration_dt, 1) over (partition BY employee, col_a, col_b, col_c order by effective_dt) expiration_dt, col_a, col_b, col_c FROM (SELECT employee, effective_dt, ... FROM table_x, table_y, ... where... ) table_a; 

Thank you!

+4
source share
1 answer

If we can safely assume that the next record is equal to the previous record + 1 day, we can link these records up:

SQL Fiddle

Oracle 11g R2 schema setup :

 CREATE TABLE t (employee int, effective_dt timestamp, expiration_dt timestamp, col_a varchar2(7), col_b varchar2(7), col_c varchar2(7)) ; INSERT ALL INTO t (employee, effective_dt, expiration_dt, col_a, col_b, col_c) VALUES (0001, '01-Dec-2012 12:00:00 AM', '04-Dec-2012 12:00:00 AM', 'value_a', 'value_a', 'value_a') INTO t (employee, effective_dt, expiration_dt, col_a, col_b, col_c) VALUES (0001, '05-Dec-2012 12:00:00 AM', '06-Dec-2012 12:00:00 AM', 'value_a', 'value_a', 'value_a') INTO t (employee, effective_dt, expiration_dt, col_a, col_b, col_c) VALUES (0001, '07-Dec-2012 12:00:00 AM', '10-Dec-2012 12:00:00 AM', 'value_a', 'value_a', 'value_a') INTO t (employee, effective_dt, expiration_dt, col_a, col_b, col_c) VALUES (0001, '11-Dec-2012 12:00:00 AM', '17-Dec-2012 12:00:00 AM', 'value_a', 'value_b', 'value_a') INTO t (employee, effective_dt, expiration_dt, col_a, col_b, col_c) VALUES (0001, '18-Dec-2012 12:00:00 AM', '19-Dec-2012 12:00:00 AM', 'value_a', 'value_b', 'value_a') INTO t (employee, effective_dt, expiration_dt, col_a, col_b, col_c) VALUES (0001, '20-Dec-2012 12:00:00 AM', '31-Dec-2012 12:00:00 AM', 'value_a', 'value_a', 'value_a') SELECT * FROM dual ; 

Request 1 :

 select employee, min(effective_dt), max(expiration_dt), col_a, col_b, col_c from ( select t.*, case when col_a = lag(col_a) over (partition by employee order by expiration_dt asc) and col_b = lag(col_b) over (partition by employee order by expiration_dt asc) and col_c = lag(col_c) over (partition by employee order by expiration_dt asc) then 0 else 1 end start_of_chain from t ) connect by effective_dt = prior expiration_dt + 1 and start_of_chain = 0 start with start_of_chain = 1 group by connect_by_root(effective_dt), employee, col_a, col_b, col_c order by 2 

Results :

 | EMPLOYEE | MIN(EFFECTIVE_DT) | MAX(EXPIRATION_DT) | COL_A | COL_B | COL_C | -------------------------------------------------------------------------------------------------------------- | 1 | December, 01 2012 00:00:00+0000 | December, 10 2012 00:00:00+0000 | value_a | value_a | value_a | | 1 | December, 11 2012 00:00:00+0000 | December, 19 2012 00:00:00+0000 | value_a | value_b | value_a | | 1 | December, 20 2012 00:00:00+0000 | December, 31 2012 00:00:00+0000 | value_a | value_a | value_a | 
+1
source

All Articles