Why rotation with extra columns doesn't combine results

I know that many of you have observed this behavior, but I wonder if anyone can explain why. When I create a small table to create an example of using the rotation function, I get the results that I would expect:

CREATE TABLE dbo.AverageFishLength ( Fishtype VARCHAR(50) , AvgLength DECIMAL(8, 2) , FishAge_Years INT ) INSERT INTO dbo.AverageFishLength ( Fishtype, AvgLength, FishAge_Years ) VALUES ( 'Muskie', 32.75, 3 ), ( 'Muskie', 37.5, 4 ), ( 'Muskie', 39.75, 5 ), ( 'Walleye', 16.5, 3 ), ( 'Walleye', 18.25, 4 ), ( 'Walleye', 20.0, 5 ), ( 'Northern Pike', 20.75, 3 ), ( 'Northern Pike', 23.25, 4 ), ( 'Northern Pike', 26.0, 5 ); 

Here is a summary request:

 SELECT Fishtype , [3] AS [3 Years Old] , [4] AS [4 Years Old] , [5] AS [5 Years Old] FROM dbo.AverageFishLength PIVOT( SUM(AvgLength) FOR FishAge_Years IN ( [3], [4], [5] ) ) AS PivotTbl 

Here are the results:

enter image description here

However, if I create a table with an identifier column, the results will be split into separate rows:

 DROP TABLE dbo.AverageFishLength CREATE TABLE dbo.AverageFishLength ( ID INT IDENTITY(1,1) , Fishtype VARCHAR(50) , AvgLength DECIMAL(8, 2) , FishAge_Years INT ) INSERT INTO dbo.AverageFishLength ( Fishtype, AvgLength, FishAge_Years ) VALUES ( 'Muskie', 32.75, 3 ), ( 'Muskie', 37.5, 4 ), ( 'Muskie', 39.75, 5 ), ( 'Walleye', 16.5, 3 ), ( 'Walleye', 18.25, 4 ), ( 'Walleye', 20.0, 5 ), ( 'Northern Pike', 20.75, 3 ), ( 'Northern Pike', 23.25, 4 ), ( 'Northern Pike', 26.0, 5 ); 

Exact exact request:

 SELECT Fishtype , [3] AS [3 Years Old] , [4] AS [4 Years Old] , [5] AS [5 Years Old] FROM dbo.AverageFishLength PIVOT( SUM(AvgLength) FOR FishAge_Years IN ( [3], [4], [5] ) ) AS PivotTbl 

Various results:

enter image description here

It seems to me that the identifier column is used in the request, although it does not appear at all in the request. This is almost like implicit inclusion in a query, but not shown in the result set.

Can someone explain why this is happening?

+8
sql sql-server tsql pivot
source share
1 answer

This is because the ID column is unique for each row, and since you directly query the table (without a subquery), this column is included as part of GROUP BY , an aggregate function is required.

Documents The MSDN documents about FROM indicate the following:

table_source PIVOT <pivot_clause>

Indicates that table_source is rotated based on pivot_column. table_source is an expression of a table or table. Output is a table containing all the columns of table_source, with the exception of pivot_column and value_column. The columns of the source_table, with the exception of pivot_column and value_column , are called rotation operator grouping columns .

PIVOT performs a grouping operation in the input table on grouping columns and returns one row for each group . In addition, the output contains one column for each value specified in the list_list column, which is displayed in the pivot_column column of the input_table.

Your version basically tells SELECT * FROM yourtable and PIVOT this data. Although the ID column is not in your final SELECT list, it is a grouping element in the query. If you compare PIVOT with the pre-PIVOT example to show you will see which version. This example uses a CASE expression and an aggregate function:

 SELECT Fishtype, sum(case when FishAge_Years = 3 then AvgLength else 0 end) as [3], sum(case when FishAge_Years = 4 then AvgLength else 0 end) as [4], sum(case when FishAge_Years = 5 then AvgLength else 0 end) as [5] FROM dbo.AverageFishLength GROUP BY Fishtype, ID; 

The result will be distorted because, despite the fact that the last list does not have an ID , it is still used for grouping and, since they are unique, you get a few lines.

The easiest way to solve this problem when using PIVOT is to use a subquery:

 SELECT Fishtype , [3] AS [3 Years Old] , [4] AS [4 Years Old] , [5] AS [5 Years Old] FROM ( SELECT Fishtype, AvgLength, FishAge_Years FROM dbo.AverageFishLength ) d PIVOT ( SUM(AvgLength) FOR FishAge_Years IN ( [3], [4], [5] ) ) AS PivotTbl; 

In this version, you return only those columns that you really need and what you want from your table - this excludes the ID , so it will not be used to group your data.

+11
source share

All Articles