Normalizing data from a denormalized table

I have data in my table like this:

RepID|Role|Status|StartDate |EndDate   |
-----|----|------|----------|----------|
10001|R1  |Active|01/01/2015|01/31/2015|
10001|R1  |Leave |02/01/2015|02/12/2015|
10001|R1  |Active|02/13/2015|02/28/2015|
10001|R2  |Active|03/01/2015|03/18/2015|
10001|R2  |Leave |03/19/2015|04/10/2015|
10001|R2  |Active|04/11/2015|05/10/2015|
10001|R1  |Active|05/11/2015|06/13/2015|
10001|R1  |Leave |06/14/2015|12/31/9998|

I'm looking for a result like this:

RepID|Role|StartDate |EndDate   |
-----|----|----------|----------|
10001|R1  |01/01/2015|02/28/2015|
10001|R2  |03/01/2015|05/10/2015|
10001|R1  |05/11/2015|12/31/9998|

I need a new StartDate and EndDate only when the role changes; status changes within the same role should be merged into one row. I tried different approaches, but could not get this result.

Any help is appreciated.

Below is the SQL code I tried, but it doesn't help:

    SELECT T1.RepID, T1.Role, Min(T1.StartDate) AS StartDate, Max(T1.EndDate) AS EndDate
    FROM (SELECT rD1.RepID, rD1.Role, rD1.StartDate, rD1.EndDate
          FROM repDetails rD1
          INNER JOIN repDetails rD2
              ON rD2.RepID = rD1.RepID
             AND rD2.StartDate = DateAdd(Day, 1, rD1.EndDate)
             AND (rD2.Role = rD1.Role
                  OR (rD2.Role IS NULL AND rD1.Role IS NULL)
                  OR (rD2.Role = '' AND rD1.Role = ''))
          UNION
          SELECT rD2.RepID, rD2.Role, rD2.StartDate, rD2.EndDate
          FROM repDetails rD1
          INNER JOIN repDetails rD2
              ON rD2.RepID = rD1.RepID
             AND rD2.StartDate = DateAdd(Day, 1, rD1.EndDate)
             AND (rD2.Role = rD1.Role
                  OR (rD2.Role IS NULL AND rD1.Role IS NULL)
                  OR (rD2.Role = '' AND rD1.Role = ''))
         ) T1
    GROUP BY T1.RepID, T1.Role
    UNION
    SELECT EP.RepID, EP.Role AS DataValue, EP.StartDate, EP.EndDate
    FROM repDetails EP
    LEFT OUTER JOIN
         (SELECT rD1.RepID, rD1.Role, rD1.StartDate, rD1.EndDate
          FROM repDetails rD1
          INNER JOIN repDetails rD2
              ON rD2.RepID = rD1.RepID
             AND rD2.StartDate = DateAdd(Day, 1, rD1.EndDate)
             AND (rD2.Role = rD1.Role
                  OR (rD2.Role IS NULL AND rD1.Role IS NULL)
                  OR (rD2.Role = '' AND rD1.Role = ''))
          UNION
          SELECT rD2.RepID, rD2.Role, rD2.StartDate, rD2.EndDate
          FROM repDetails rD1
          INNER JOIN repDetails rD2
              ON rD2.RepID = rD1.RepID
             AND rD2.StartDate = DateAdd(Day, 1, rD1.EndDate)
             AND (rD2.Role = rD1.Role
                  OR (rD2.Role IS NULL AND rD1.Role IS NULL)
                  OR (rD2.Role = '' AND rD1.Role = ''))
         ) T1
        ON EP.RepID = T1.RepID AND EP.StartDate = T1.StartDate
    WHERE T1.RepID IS NULL
sql sql-server
2 answers

The key here is to identify runs of consecutive rows that share the same role, up to the point where the role changes. You can do this by comparing each row's role with the next row's role using the lead function, and then using a running sum of those change flags to put all rows of the same run into one group.

After classifying them into groups, you just need to use min and max to get the start and end dates.

    with groups as (
        select x.*
              ,case when grp = 1 then 0 else 1 end
               + sum(grp) over(partition by repid order by startdate) grps
        from (select t.*
                    ,case when lead(role) over(partition by repid order by startdate) = role
                          then 0 else 1 end grp
              from t) x
    )
    select distinct repid, role
          ,min(startdate) over(partition by repid, grps) startdt
          ,max(enddate) over(partition by repid, grps) enddt
    from groups
    order by 1, 3
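
To see how the two flags in the query above behave, here is a minimal sketch that runs the inner part against the sample rows from the question. It uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption, and it needs an SQLite build with window functions, 3.25 or later). The table name t matches the answer; the ISO-formatted date strings and the helper name group_flags are illustrative choices, not part of the original.

```python
import sqlite3

# Sample rows from the question; dates rewritten as ISO strings (assumption)
# so that plain text ordering matches date ordering in SQLite.
ROWS = [
    (10001, "R1", "Active", "2015-01-01", "2015-01-31"),
    (10001, "R1", "Leave",  "2015-02-01", "2015-02-12"),
    (10001, "R1", "Active", "2015-02-13", "2015-02-28"),
    (10001, "R2", "Active", "2015-03-01", "2015-03-18"),
    (10001, "R2", "Leave",  "2015-03-19", "2015-04-10"),
    (10001, "R2", "Active", "2015-04-11", "2015-05-10"),
    (10001, "R1", "Active", "2015-05-11", "2015-06-13"),
    (10001, "R1", "Leave",  "2015-06-14", "9998-12-31"),
]

# grp  = 1 on the last row of each run of equal roles, else 0.
# grps = running sum of grp, shifted so the run-ending row stays in its own run.
FLAGS_SQL = """
select startdate, role, grp,
       case when grp = 1 then 0 else 1 end
         + sum(grp) over (partition by repid order by startdate) as grps
from (select t.*,
             case when lead(role) over (partition by repid order by startdate) = role
                  then 0 else 1 end as grp
      from t) x
order by startdate
"""


def group_flags():
    """Return (startdate, role, grp, grps) for every sample row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table t (repid integer, role text, status text,"
        " startdate text, enddate text)"
    )
    conn.executemany("insert into t values (?, ?, ?, ?, ?)", ROWS)
    rows = conn.execute(FLAGS_SQL).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in group_flags():
        print(row)
```

For the sample data, grp is 1 only on the third, sixth and eighth rows, and grps comes out as 1, 1, 1, 2, 2, 2, 3, 3, which is what lets min and max be taken per run rather than per role.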


Do you just want the minimum (start) / maximum (end) dates for each repID and role? If yes, try:

    select repID, role, min(startDate), max(endDate)
    from tbl
    group by repID, role
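
As a quick check of what this simple grouping returns for the question's sample data, here is a small sketch using Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption). The table name tbl follows the query above; the ISO date strings and the helper name plain_group_by are illustrative only.

```python
import sqlite3

# Sample rows from the question; dates rewritten as ISO strings (assumption)
# so that min/max on text behave like min/max on dates.
ROWS = [
    (10001, "R1", "Active", "2015-01-01", "2015-01-31"),
    (10001, "R1", "Leave",  "2015-02-01", "2015-02-12"),
    (10001, "R1", "Active", "2015-02-13", "2015-02-28"),
    (10001, "R2", "Active", "2015-03-01", "2015-03-18"),
    (10001, "R2", "Leave",  "2015-03-19", "2015-04-10"),
    (10001, "R2", "Active", "2015-04-11", "2015-05-10"),
    (10001, "R1", "Active", "2015-05-11", "2015-06-13"),
    (10001, "R1", "Leave",  "2015-06-14", "9998-12-31"),
]


def plain_group_by():
    """Group by repID and role only, ignoring the order in which roles occur."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table tbl (repID integer, role text, status text,"
        " startDate text, endDate text)"
    )
    conn.executemany("insert into tbl values (?, ?, ?, ?, ?)", ROWS)
    rows = conn.execute(
        "select repID, role, min(startDate), max(endDate)"
        " from tbl group by repID, role order by role"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in plain_group_by():
        print(row)
```

On this data the query returns two rows: the two separate R1 periods collapse into a single row spanning 2015-01-01 to 9998-12-31, which overlaps the R2 period. So it fits only if one row per role is wanted, not the three rows shown in the question.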

A more detailed solution, equivalent to VKP's:

    SELECT repid, ROLE, grpID,
           MIN(startdate) AS min_startDateOverRole,
           MAX(endDate) AS max_endDateOverRole
    FROM (SELECT *,
                 -- when on a group-end row, don't increment grpID;
                 -- wait until the start of the next group
                 CASE WHEN isGrpEnd = 1 THEN 0 ELSE 1 END
                 -- sum(all group-end rows up to and including this one)
                 + SUM(isGrpEnd) OVER(PARTITION BY repid ORDER BY startdate) AS grpID
          FROM (SELECT *,
                       -- PARTITION BY repid keeps each rep's rows separate
                       CASE WHEN LEAD(ROLE) OVER(PARTITION BY repid ORDER BY startdate) = ROLE
                            THEN 0 ELSE 1 END AS isGrpEnd
                FROM t) x
         ) y  -- SQL Server requires an alias on a derived table
    GROUP BY repid, ROLE, grpID
    ORDER BY 1, 3
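
For comparison, here is a sketch of a variant that is not taken from either answer: it flags the first row of each run with lag instead of the last row with lead. The running sum of the start flags is then the group number directly, so the extra case adjustment is not needed. It runs through Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption; SQLite 3.25 or later is needed for window functions). The names is_start, grp_id and role_periods, and the ISO date strings, are illustrative.

```python
import sqlite3

# Sample rows from the question; dates rewritten as ISO strings (assumption)
# so that text ordering matches date ordering in SQLite.
ROWS = [
    (10001, "R1", "Active", "2015-01-01", "2015-01-31"),
    (10001, "R1", "Leave",  "2015-02-01", "2015-02-12"),
    (10001, "R1", "Active", "2015-02-13", "2015-02-28"),
    (10001, "R2", "Active", "2015-03-01", "2015-03-18"),
    (10001, "R2", "Leave",  "2015-03-19", "2015-04-10"),
    (10001, "R2", "Active", "2015-04-11", "2015-05-10"),
    (10001, "R1", "Active", "2015-05-11", "2015-06-13"),
    (10001, "R1", "Leave",  "2015-06-14", "9998-12-31"),
]

# is_start = 1 when the previous row (per rep, by start date) has a different
# role or does not exist; the running sum of is_start numbers the runs.
PERIODS_SQL = """
select repid, role, min(startdate) as startdate, max(enddate) as enddate
from (select x.*,
             sum(is_start) over (partition by repid order by startdate) as grp_id
      from (select t.*,
                   case when lag(role) over (partition by repid order by startdate) = role
                        then 0 else 1 end as is_start
            from t) x) y
group by repid, role, grp_id
order by repid, min(startdate)
"""


def role_periods():
    """Return one (repid, role, startdate, enddate) row per run of equal roles."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table t (repid integer, role text, status text,"
        " startdate text, enddate text)"
    )
    conn.executemany("insert into t values (?, ?, ?, ?, ?)", ROWS)
    rows = conn.execute(PERIODS_SQL).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in role_periods():
        print(row)
```

On the sample data this yields the three periods asked for in the question (R1, then R2, then R1 again). As with the lead version, a NULL role never compares equal, so each NULL-role row would start its own group.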