How to combine / split multiple date periods into one timeline (Oracle 11g)?

I struggled with this problem for several days, and now I am turning to the masses for help.

My problem is similar, but not quite the same as the previous solution on this site: PL / SQL Split, add the date to the new dates according to the maturity dates! This solution is pretty logical (include / exclude), while my problem includes some of them, as well as merging.

Although I like to think that I have an intermediate / advanced understanding of SQL + PL / SQL ... Oracle Analytic functions seem to scare my mind. I tried to read / study, but my time is running out.

Since I'm not sure about the legality of exchanging table names (COTS), business directions, etc., I am going to imitate my problem with a vague script / context. I hope this affects the spirit of the lawyer.

To the problem: I have a table that stores the history of the client. A client can come and go, and therefore we can have several rows in this table (for each client).

CREATE TABLE activity AS SELECT 1 AS cust_id, TO_DATE('01-JAN-2010') AS start_dt, TO_DATE('31-JUL-2010') AS end_dt, 'EAST' AS region FROM DUAL UNION SELECT 1 AS cust_id, TO_DATE('01-FEB-2011') AS start_dt, TO_DATE('31-DEC-2011') AS end_dt, 'EAST' AS region FROM DUAL; 

I also have a table that contains information about attributes by range. A client can have several types of attributes at the same time and each type several times for different time intervals.

 CREATE TABLE attrib AS SELECT 1 AS cust_id, 'POWER' AS atb_cd, TO_DATE('01-JAN-2009') AS atb_start_dt, TO_DATE('31-JAN-2010') AS atb_end_dt, 'LocalNuke' AS provider, 1.80 AS per_kwh, 0 AS per_gal FROM DUAL UNION SELECT 1 AS cust_id, 'POWER' AS atb_cd, TO_DATE('01-MAR-2010') AS atb_start_dt, TO_DATE('31-MAR-2010') AS atb_end_dt, 'CoalGuys' AS provider, 1.60 AS per_kwh, 0 AS per_gal FROM DUAL UNION SELECT 1 AS cust_id, 'POWER' AS atb_cd, TO_DATE('01-JUN-2010') AS atb_start_dt, TO_DATE('30-SEP-2010') AS atb_end_dt, 'LocalNuke' AS provider, 1.70 AS per_kwh, 0 AS per_gal FROM DUAL UNION SELECT 1 AS cust_id, 'POWER' AS atb_cd, TO_DATE('01-MAR-2011') AS atb_start_dt, TO_DATE('31-DEC-9999') AS atb_end_dt, 'GeoHeat' AS provider, 1.10 AS per_kwh, 0 AS per_gal FROM DUAL UNION SELECT 1 AS cust_id, 'WATER' AS atb_cd, TO_DATE('01-MAR-2010') AS atb_start_dt, TO_DATE('31-DEC-9999') AS atb_end_dt, 'GlacialGold' AS provider, 0 AS per_kwh, 0.60 AS per_gal FROM DUAL; 

Intentional data habits, I tried to make this scenario as real as I could without contacting the "real world."

The result should restrain customer activity with the help of this fictitious company and split all overlapping dates to form a timeline. Data elements must be combined for reporting.

Visually:

 Cust: |----------------------| |------------------------| Power: |-------------| |--| |-------| |----------------------> Water: |------------------------------------------------------> Expected Result: |----|----|--|----|----| |----|-------------------| 

The solution must be scalable to include other attributes. In the end, I would get this denormalized information in a table so that I can report customer data at any given time. For example, if they had activity, power and water on a specific day; I should be able to export per_kwh, per_gal and activity data for this day.

Output example (tabular):

 CUST_ID FROM_DT THRU_DT REGION POWER_PROVIDER WATER_PROVIDER PER_KWH PER_GAL ------- ----------- ----------- ------ -------------- -------------- ------- ------- 1 01-JAN-2010 31-JAN-2010 EAST LocalNuke 1.80 0 1 01-FEB-2010 28-FEB-2010 EAST 0 0 1 01-MAR-2010 31-MAR-2010 EAST CoalGuys GlacialGold 1.60 0.60 1 01-APR-2010 31-MAY-2010 EAST GlacialGold 0 0.60 1 01-JUN-2010 31-JUL-2010 EAST LocalNuke GlacialGold 1.70 0.60 1 01-FEB-2011 28-FEB-2011 EAST GlacialGold 0 0.60 1 01-MAR-2011 31-DEC-2011 EAST GeoHeat GlacialGold 1.10 0.60 

I wrote something about 2 years ago (when this requirement looked like just Activity / Power) using 2 asynchronous cursors that handle slow deceleration (line by line).

While performance is important, the biggest reason I'm trying to find a direct / bulk sql solution is maintenance. The nested if / else cursor of my original solution is already difficult to execute and will be exponentially worse with at least two more β€œattribute” intervals for separation.

I would be grateful for any help that any of you can provide.

+4
source share
2 answers

It might work. He does not bring adjacent areas together, but he still has to do his job.

 WITH milestone AS ( SELECT cust_id, start_dt AS point_in_time FROM ACTIVITY UNION SELECT cust_id, atb_start_dt AS point_in_time FROM ATTRIB UNION SELECT cust_id, LEAST(end_dt, TO_DATE('30-DEC-9999')) + 1 AS point_in_time FROM ACTIVITY UNION SELECT cust_id, LEAST(atb_end_dt, TO_DATE('30-DEC-9999')) + 1 AS point_in_time FROM ATTRIB ) SELECT milestone.cust_id AS cust_id, milestone.point_in_time AS from_dt, LEAD(point_in_time) OVER (PARTITION BY milestone.cust_id ORDER BY milestone.point_in_time) - 1 AS thru_dt, activity.region AS region, power_attrib.provider AS power_provider, water_attrib.provider AS water_provider, COALESCE(power_attrib.per_kwh, 0) AS per_kwh, COALESCE(water_attrib.per_gal, 0) AS per_gal FROM MILESTONE LEFT OUTER JOIN ACTIVITY ON milestone.cust_id = activity.cust_id AND milestone.point_in_time BETWEEN activity.start_dt AND activity.end_dt LEFT OUTER JOIN ATTRIB power_attrib ON milestone.cust_id = power_attrib.cust_id AND power_attrib.atb_cd = 'POWER' AND milestone.point_in_time BETWEEN power_attrib.atb_start_dt AND power_attrib.atb_end_dt LEFT OUTER JOIN ATTRIB water_attrib ON milestone.cust_id = water_attrib.cust_id AND water_attrib.atb_cd = 'WATER' AND milestone.point_in_time BETWEEN water_attrib.atb_start_dt AND water_attrib.atb_end_dt 
+1
source

This is a really very risky problem, and I expect you to have a big messy request. The main problem you are facing is creating psudeo strings for spaces in the attribute table. This is problematic.

I took an abridged version of your problem, just trying to create spaces for the POWER attribute. I realized that each attribute line could be preceded by a space. With this

 SELECT PS.cust_id , G.is_gap , DECODE( G.is_gap, 'Y', PS.prev_start, PS.atb_start_dt ) AS start_date , DECODE( G.is_gap, 'Y', PS.prev_end, PS.atb_end_dt ) AS end_date , DECODE( G.is_gap, 'Y', NULL, PS.provider ) AS provider , DECODE( G.is_gap, 'Y', NULL, PS.per_kwh ) AS per_kwh , DECODE( G.is_gap, 'Y', NULL, PS.per_gal ) AS per_gal FROM ( SELECT P.cust_id , P.atb_start_dt , P.atb_end_dt , P.provider , P.per_kwh , P.per_gal , P.atb_start_dt - 1 AS prev_end , NVL( MAX( P.atb_end_dt ) OVER ( ORDER BY P.atb_end_dt ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING ) + 1 , '01-JAN-1900' ) AS prev_start FROM attrib P WHERE P.atb_cd = 'POWER' ) PS , ( SELECT DECODE(LEVEL,1,'Y','N') AS is_gap FROM DUAL CONNECT BY LEVEL <= 2 ) G WHERE ( PS.prev_end > PS.prev_start OR G.is_gap = 'N' ) ORDER BY 3 / 

Give me these results

 CUST_ID I START_DATE END_DATE PROVIDER PER_KWH PER_GAL ------- - ---------- ---------- ----------- ------- ------- 1 Y 01-JAN-00 31-DEC-08 1 N 01-JAN-09 31-JAN-10 LocalNuke 1.8 0 1 N 01-FEB-10 31-MAR-10 CoalGuys 1.6 0 1 Y 01-APR-10 31-MAY-10 1 N 01-JUN-10 30-SEP-10 LocalNuke 1.7 0 1 Y 01-OCT-10 28-FEB-11 1 N 01-MAR-11 31-DEC-99 GeoHeat 1.1 0 

Some notes:

  • I think line 5 of your sample results has an invalid end date. Should it be 31-JUL-2010 , because this is when activity ends?
  • I updated CoalGuys start CoalGuys to 01-FEB-2010 to check when there was no space.
  • It will be screwed up if there is no activity that runs into the distant future, because it does not create a backlog, but only the previous one. Perhaps always UNION alone, I think
  • It’s better not to use 9999 as the year you get errors if you try to add something to it. In the end, it didn't matter, but it was a hustle if you went to lagging spaces.

Now this is far from a complete solution, as soon as you throw the client and give water, it becomes more and more dirty. But you will probably need the guts of the above as an inline representation for inclusion in the main query. Then you will need to do the same for WATER. You will then have to join them along with date range checks, and then use LEAST and GREATEST for the final results.

Sorry, after 40 or so minutes that I spent on this, he got away from an interesting problem, feeling like work, so leave my answer incomplete. Hope this helps.

+1
source

All Articles