Multiple T-SQL Grouping

I have the following data:

Product  Price  StartDate                EndDate
Apples   4.9    2010-03-01 00:00:00.000  2010-03-01 00:00:00.000
Apples   4.9    2010-03-02 00:00:00.000  2010-03-02 00:00:00.000
Apples   2.5    2010-03-03 00:00:00.000  2010-03-03 00:00:00.000
Apples   4.9    2010-03-05 00:00:00.000  2010-03-05 00:00:00.000
Apples   4.9    2010-03-06 00:00:00.000  2010-03-06 00:00:00.000
Apples   4.9    2010-03-09 00:00:00.000  2010-03-09 00:00:00.000
Apples   2.5    2010-03-10 00:00:00.000  2010-03-10 00:00:00.000
Apples   4.9    2010-03-11 00:00:00.000  2010-03-11 00:00:00.000
Apples   4.9    2010-03-12 00:00:00.000  2010-03-12 00:00:00.000
Apples   4.9    2010-03-13 00:00:00.000  2010-03-13 00:00:00.000
Apples   4.9    2010-03-15 00:00:00.000  2010-03-15 00:00:00.000
Apples   4.9    2010-03-16 00:00:00.000  2010-03-16 00:00:00.000

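I want to group by product and price and return the first start date and the last end date of each group. A plain GROUP BY product, price is not enough, because every run of consecutive rows with the same price has to form its own group, so the same price can appear in several result rows. Something like this: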

Desired Result

 Apples   4.9    2010-03-01 00:00:00.000  2010-03-02 00:00:00.000
 Apples   2.5    2010-03-03 00:00:00.000  2010-03-03 00:00:00.000
 Apples   4.9    2010-03-05 00:00:00.000  2010-03-09 00:00:00.000
 Apples   2.5    2010-03-10 00:00:00.000  2010-03-10 00:00:00.000
 Apples   4.9    2010-03-11 00:00:00.000  2010-03-16 00:00:00.000
5 answers

My approach.

Data:

 create table t (
     producte   varchar(50),
     price      money,
     start_date date,
     end_date   date
 );

 insert into t values
     ( 'apple', 4.9, '2012-01-01', '2012-01-01' ),
     ( 'apple', 4.9, '2012-01-02', '2012-01-02' ),
     ( 'apple', 8,   '2012-01-04', '2012-01-04' ),
     ( 'cat',   5,   '2012-01-01', '2012-01-01' ),
     ( 'cat',   6,   '2012-01-02', '2012-01-02' ),
     ( 'cat',   6,   '2012-01-03', '2012-01-03' );

Query:

 with start_dates as (
     -- anchor: rows that have no same-product, same-price row ending the day before
     select t.producte, t.price, t.start_date, t.end_date,
            t.start_date as gr_date
     from t
     left outer join t t1
         on t.price = t1.price and  --new
            t.producte = t1.producte and
            t.start_date = dateadd(day, 1, t1.end_date)
     where t1.producte is null
     union all
     -- recursive member: attach the row that starts the day after the previous one ends
     select t.producte, t.price, t.start_date, t.end_date,
            gr_date
     from t
     inner join start_dates t1
         on t.price = t1.price and  --new
            t.producte = t1.producte and
            t.start_date = dateadd(day, 1, t1.end_date)
 )
 select t.producte, t.price, min( t.start_date ), max( t.end_date )
 from start_dates t
 group by t.producte, gr_date, t.price

Results :

 | PRODUCTE | PRICE | COLUMN_2   | COLUMN_3   |
 ----------------------------------------------
 | apple    | 4.9   | 2012-01-01 | 2012-01-02 |
 | apple    | 8     | 2012-01-04 | 2012-01-04 |
 | cat      | 5     | 2012-01-01 | 2012-01-01 |
 | cat      | 6     | 2012-01-02 | 2012-01-03 |

Explanation

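This is a recursive CTE. The anchor query finds the first row of each group: a row for which no row with the same product and price ends on the previous day. The recursive member then extends each group one row at a time by attaching the row that starts the day after the previous row's end date. gr_date carries the start date of the group through the recursion, so the final GROUP BY collapses each chain into a single row.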

 SELECT product, price, MIN(start_date), MAX(end_date)
 FROM (
     SELECT product, price, start_date, end_date,
            -- position of the row among all rows of the product
            ROW_NUMBER() OVER (PARTITION BY product ORDER BY start_date) rn1,
            -- position of the row among rows of the product with the same price
            ROW_NUMBER() OVER (PARTITION BY product, price ORDER BY start_date) rn2
     FROM mytable
 ) q
 GROUP BY product, price, rn2 - rn1
 ORDER BY product, MIN(start_date), price
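
To see why the difference of the two row numbers identifies each run, the numbering can be traced by hand. The following is a minimal sketch in Python rather than T-SQL, purely to follow the logic; the function name `row_number_groups` and the tuple layout are assumptions made for illustration, and the rows are the Apples data from the question.

```python
from collections import defaultdict


def row_number_groups(rows):
    """Mimic the two ROW_NUMBER() calls for a single product.

    rows: list of (price, start_date) tuples; ISO date strings sort correctly.
    Returns {(price, rn2 - rn1): [start_date, ...]}.
    """
    seen_per_price = defaultdict(int)  # running rn2, one counter per price
    groups = defaultdict(list)
    ordered = sorted(rows, key=lambda r: r[1])
    for rn1, (price, start) in enumerate(ordered, start=1):
        seen_per_price[price] += 1
        rn2 = seen_per_price[price]
        groups[(price, rn2 - rn1)].append(start)
    return dict(groups)


apples = [
    (4.9, "2010-03-01"), (4.9, "2010-03-02"), (2.5, "2010-03-03"),
    (4.9, "2010-03-05"), (4.9, "2010-03-06"), (4.9, "2010-03-09"),
    (2.5, "2010-03-10"), (4.9, "2010-03-11"), (4.9, "2010-03-12"),
    (4.9, "2010-03-13"), (4.9, "2010-03-15"), (4.9, "2010-03-16"),
]

if __name__ == "__main__":
    for (price, diff), dates in row_number_groups(apples).items():
        print(price, diff, dates[0], dates[-1])
```

In this data the row at price 2.5 on 2010-03-03 and the rows at price 4.9 from 2010-03-11 onward end up with the same difference (-2), which is why price has to stay in the GROUP BY next to rn2 - rn1.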

Here is the SQLFiddle demo

 with t2 as (
     select t1.*,
            -- number of earlier rows of the same product with a different price
            (select count(Price)
             from t
             where startdate < t1.startdate
               and Price <> t1.price
               and Product = t1.Product) rng
     from t as t1
 )
 select Product, Price, min(startDate), max(EndDate)
 from t2
 group by Product, Price, RNG
 order by 3

My suggestion: for each row, find the latest earlier date on which the price was different, and group on that date. For example, every row between 2010-03-11 and 2010-03-16 gets the date 2010-03-10, because that is the latest earlier date with a different price (2.5 versus 4.9). The first row(s) will get a NULL date, but that should not be a problem.

On a very large table, however, such a query can become very slow. If you run into performance problems, consider adding a column and filling it with a cursor: walk the rows in date order and change the column's value every time the price changes. The final grouping is then trivial.

Here is a query for the first suggestion:

 Select Product, Price, Min(StartDate) as StartDate, PreviousDate
 from (
     Select product, price, StartDate,
            -- latest earlier date on which this product had a different price
            (Select max(StartDate)
             from table_2 t3
             where t3.price <> t2.price
               and t3.StartDate < t2.StartDate
               and t3.Product = t2.Product) as previousDate
     from table_2 t2
 ) SQ
 Group by Product, Price, PreviousDate
 Order by PreviousDate
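
The cursor-style alternative mentioned above (walk the rows in date order and bump a marker whenever the price changes) can be sketched as follows. This is Python rather than T-SQL, only to show the single pass; the names `assign_run_ids` and `collapse_runs` and the `run_id` marker are hypothetical, and the sample rows are the first six rows of the question's data.

```python
def assign_run_ids(rows):
    """Label each row with a run_id that changes whenever product or price changes.

    rows: iterable of (product, price, start_date, end_date) tuples,
    with ISO date strings so that plain sorting gives date order.
    """
    run_id = 0
    previous = None
    labelled = []
    for product, price, start, end in sorted(rows, key=lambda r: (r[0], r[2])):
        if previous != (product, price):
            run_id += 1  # a new price (or a new product) starts a new run
            previous = (product, price)
        labelled.append((product, price, start, end, run_id))
    return labelled


def collapse_runs(labelled):
    """The 'trivial' final grouping: first start and last end per run_id."""
    runs = {}
    for product, price, start, end, run_id in labelled:
        if run_id not in runs:
            runs[run_id] = [product, price, start, end]
        else:
            runs[run_id][2] = min(runs[run_id][2], start)
            runs[run_id][3] = max(runs[run_id][3], end)
    return [tuple(run) for run in runs.values()]


sample = [
    ("Apples", 4.9, "2010-03-01", "2010-03-01"),
    ("Apples", 4.9, "2010-03-02", "2010-03-02"),
    ("Apples", 2.5, "2010-03-03", "2010-03-03"),
    ("Apples", 4.9, "2010-03-05", "2010-03-05"),
    ("Apples", 4.9, "2010-03-06", "2010-03-06"),
    ("Apples", 4.9, "2010-03-09", "2010-03-09"),
]

if __name__ == "__main__":
    for run in collapse_runs(assign_run_ids(sample)):
        print(run)
```

Each row is visited once after sorting, which is the point of the suggestion: the correlated subquery is replaced by a stored marker that can be grouped on directly.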

I believe this is the most effective solution:

 WITH Calc AS (
     SELECT *,
            -- consecutive dates at the same price yield the same Grp value
            Grp = DateAdd(day,
                          -Row_Number() OVER (PARTITION BY Product, Price ORDER BY StartDate),
                          StartDate)
     FROM dbo.PriceHistory
 )
 SELECT Product, Price, FromDate = Min(StartDate), ToDate = Max(StartDate)
 FROM Calc
 GROUP BY Product, Price, Grp
 ORDER BY FromDate;
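
The date-minus-row-number trick groups rows whose dates are consecutive days at the same price. A small trace makes the behaviour visible. This is a sketch in Python rather than T-SQL; the function name `date_minus_row_number` is an assumption, and the rows are the 4.9 rows from 2010-03-05 to 2010-03-13 in the question's data.

```python
from collections import defaultdict
from datetime import date, timedelta


def date_minus_row_number(rows):
    """Compute Grp = StartDate - ROW_NUMBER() per (product, price) and bucket the dates.

    rows: iterable of (product, price, start_date) with datetime.date values.
    Returns {(product, price, grp): [start_date, ...]}.
    """
    counters = defaultdict(int)  # ROW_NUMBER() per (product, price) partition
    islands = defaultdict(list)
    for product, price, start in sorted(rows):
        counters[(product, price)] += 1
        grp = start - timedelta(days=counters[(product, price)])
        islands[(product, price, grp)].append(start)
    return dict(islands)


sample = [
    ("Apples", 4.9, date(2010, 3, 5)),
    ("Apples", 4.9, date(2010, 3, 6)),
    ("Apples", 4.9, date(2010, 3, 9)),
    ("Apples", 4.9, date(2010, 3, 11)),
    ("Apples", 4.9, date(2010, 3, 12)),
    ("Apples", 4.9, date(2010, 3, 13)),
]

if __name__ == "__main__":
    for (product, price, grp), dates in date_minus_row_number(sample).items():
        print(product, price, dates[0], dates[-1])
```

Note that a gap in the dates starts a new group here: 2010-03-05 to 2010-03-06 and 2010-03-09 come out as separate groups, whereas the desired result in the question shows 2010-03-05 to 2010-03-09 as a single row. Whether that difference matters depends on how gaps in the dates should be treated.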
