SQL - subquery in aggregate function

I use the Northwind database to update my SQL skills by creating several more or less complex queries. Unfortunately, I could not find a solution for my last use case: "Get the sum of the five highest orders for each category in 1997."

Used tables:

Orders(OrderId, OrderDate)
Order Details(OrderId, ProductId, Quantity, UnitPrice)
Products(ProductId, CategoryId)
Categories(CategoryId, CategoryName)

I tried the following query

 SELECT c.CategoryName,
        SUM( (SELECT TOP 5 od2.UnitPrice*od2.Quantity
              FROM [Order Details] od2, Products p2
              WHERE od2.ProductID = p2.ProductID
                AND c.CategoryID = p2.CategoryID
              ORDER BY 1 DESC))
 FROM [Order Details] od, Products p, Categories c, Orders o
 WHERE od.ProductID = p.ProductID
   AND p.CategoryID = c.CategoryID
   AND od.OrderID = o.OrderID
   AND YEAR(o.OrderDate) = 1997
 GROUP BY c.CategoryName

Well ... it turned out that subqueries are not allowed in aggregate functions. I read other posts about this problem, but could not find a solution for my specific use case. I hope you can help me.

4 answers

Subqueries are usually not allowed in aggregate functions. Instead, move the aggregate inside the subquery. In this case, you need an extra level of subquery because of the TOP 5:

 SELECT c.CategoryName,
        (SELECT SUM(val)
         FROM (SELECT TOP 5 od2.UnitPrice*od2.Quantity AS val
               FROM [Order Details] od2, Products p2, Orders o2
               WHERE od2.ProductID = p2.ProductID
                 AND od2.OrderID = o2.OrderID
                 AND c.CategoryID = p2.CategoryID
                 AND YEAR(o2.OrderDate) = 1997  -- the inner rows must be limited to 1997 as well
               ORDER BY 1 DESC
              ) t
        )
 FROM [Order Details] od, Products p, Categories c, Orders o
 WHERE od.ProductID = p.ProductID
   AND p.CategoryID = c.CategoryID
   AND od.OrderID = o.OrderID
   AND YEAR(o.OrderDate) = 1997
 GROUP BY c.CategoryName, c.CategoryId
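
For anyone who wants to try the "aggregate inside the subquery" pattern without a Northwind install, here is a minimal sketch using Python's built-in sqlite3 module. It is a stand-in, not the SQL Server query itself: SQLite has no TOP, so LIMIT plays that role, the year filter is written as a date range, and the tables hold a handful of invented rows. The function names `build_sample_db` and `top_n_sum_per_category` are hypothetical.

```python
import sqlite3


def build_sample_db():
    """Build an in-memory database with a tiny, made-up Northwind-like data set."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE Categories (CategoryID INTEGER PRIMARY KEY, CategoryName TEXT);
        CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, CategoryID INTEGER);
        CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, OrderDate TEXT);
        CREATE TABLE [Order Details] (OrderID INTEGER, ProductID INTEGER,
                                      UnitPrice REAL, Quantity INTEGER);
        INSERT INTO Categories VALUES (1, 'Beverages'), (2, 'Condiments');
        INSERT INTO Products VALUES (1, 1), (2, 2);
    """)
    # (order date, product id, unit price, quantity); all values are invented.
    lines = [("1996-12-30", 1, 1000.0, 1)]                       # not in 1997
    lines += [("1997-03-01", 1, 10.0, q) for q in range(1, 8)]   # 10, 20, ..., 70
    lines += [("1997-05-01", 2, 5.0, 1), ("1997-06-01", 2, 5.0, 3)]  # 5 and 15
    for order_id, (date, product_id, price, qty) in enumerate(lines, start=1):
        con.execute("INSERT INTO Orders VALUES (?, ?)", (order_id, date))
        con.execute("INSERT INTO [Order Details] VALUES (?, ?, ?, ?)",
                    (order_id, product_id, price, qty))
    return con


def top_n_sum_per_category(con, n=5):
    """Sum the n largest 1997 line values per category.

    The SUM sits in the middle subquery; the innermost one picks the top n rows.
    """
    sql = """
        SELECT c.CategoryName,
               (SELECT SUM(val)
                FROM (SELECT od.UnitPrice * od.Quantity AS val
                      FROM [Order Details] od
                      JOIN Products p ON p.ProductID = od.ProductID
                      JOIN Orders o ON o.OrderID = od.OrderID
                      WHERE p.CategoryID = c.CategoryID
                        AND o.OrderDate >= '1997-01-01'
                        AND o.OrderDate < '1998-01-01'
                      ORDER BY val DESC
                      LIMIT ?) t) AS top_sum
        FROM Categories c
        ORDER BY c.CategoryName
    """
    return con.execute(sql, (n,)).fetchall()


result = top_n_sum_per_category(build_sample_db())
print(result)
```

With this sample data the 1996 row is ignored, and a category with fewer than five rows simply sums what it has.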

It is definitely a subquery problem. There is an excellent article about it (originally written for Access, but the syntax is identical). Also, comparing OrderDate = 1997 directly does not select the orders from the year 1997; you need DATEPART(year, OrderDate) = 1997. After that, once you have the (up to five) rows returned for each category, you can wrap the returned rows in an outer query and aggregate them.
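
As a small illustration of the year filter, the sketch below compares two ways of selecting one calendar year: extracting the year part, and testing a half-open date range. It uses Python's sqlite3 module as a stand-in, so `strftime('%Y', ...)` takes the place of DATEPART(year, ...); the four rows are invented and the helper `order_ids` is hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, OrderDate TEXT)")
# Invented boundary dates around the year 1997.
con.executemany("INSERT INTO Orders VALUES (?, ?)", [
    (1, "1996-12-31"),
    (2, "1997-01-01"),
    (3, "1997-12-31 23:59:59"),
    (4, "1998-01-01"),
])


def order_ids(where_clause):
    """Return the OrderIDs matching the given WHERE clause, in ascending order."""
    sql = "SELECT OrderID FROM Orders WHERE " + where_clause + " ORDER BY OrderID"
    return [row[0] for row in con.execute(sql)]


# Variant 1: extract the year part and compare it.
by_year_part = order_ids("strftime('%Y', OrderDate) = '1997'")
# Variant 2: half-open range, which leaves the column untouched.
by_range = order_ids("OrderDate >= '1997-01-01' AND OrderDate < '1998-01-01'")

print(by_year_part, by_range)
```

Both variants select the same orders here; the range form does not wrap the column in a function.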


Use a CTE with the ROW_NUMBER function instead of an extra subquery.

  ;WITH cte AS
  (
    SELECT c.CategoryName, od.UnitPrice, od.Quantity,
           ROW_NUMBER() OVER(PARTITION BY c.CategoryName
                             ORDER BY od.UnitPrice * od.Quantity DESC) AS rn
    FROM [Order Details] od
    JOIN Products p ON od.ProductID = p.ProductID
    JOIN Categories c ON p.CategoryID = c.CategoryID
    JOIN Orders o ON od.OrderID = o.OrderID
    -- half-open range covering the year 1997
    WHERE o.OrderDate >= DATEADD(YEAR, DATEDIFF(YEAR, 0, '19970101'), 0)
      AND o.OrderDate < DATEADD(YEAR, DATEDIFF(YEAR, 0, '19970101')+1, 0)
  )
  SELECT CategoryName, SUM(UnitPrice * Quantity) AS val
  FROM cte
  WHERE rn < 6
  GROUP BY CategoryName
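
The same ROW_NUMBER idea can be tried locally with Python's sqlite3 module. This is only a sketch: it assumes the bundled SQLite is version 3.25 or newer (window functions), writes the 1997 range with plain date strings instead of DATEADD/DATEDIFF, and runs on a few invented rows. The function names `make_db` and `top_n_sum_with_row_number` are hypothetical.

```python
import sqlite3


def make_db():
    """Create a tiny in-memory, Northwind-like database with invented rows."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE Categories (CategoryID INTEGER PRIMARY KEY, CategoryName TEXT);
        CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, CategoryID INTEGER);
        CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, OrderDate TEXT);
        CREATE TABLE [Order Details] (OrderID INTEGER, ProductID INTEGER,
                                      UnitPrice REAL, Quantity INTEGER);
        INSERT INTO Categories VALUES (1, 'Beverages'), (2, 'Condiments');
        INSERT INTO Products VALUES (1, 1), (2, 2);
    """)
    # (order date, product id, unit price, quantity)
    lines = [("1996-12-30", 1, 1000.0, 1)]                       # not in 1997
    lines += [("1997-03-01", 1, 10.0, q) for q in range(1, 8)]   # 10, 20, ..., 70
    lines += [("1997-05-01", 2, 5.0, 1), ("1997-06-01", 2, 5.0, 3)]  # 5 and 15
    for order_id, (date, product_id, price, qty) in enumerate(lines, start=1):
        con.execute("INSERT INTO Orders VALUES (?, ?)", (order_id, date))
        con.execute("INSERT INTO [Order Details] VALUES (?, ?, ?, ?)",
                    (order_id, product_id, price, qty))
    return con


def top_n_sum_with_row_number(con, n=5):
    """Rank 1997 line values per category, then sum the rows ranked 1..n."""
    sql = """
        WITH cte AS (
            SELECT c.CategoryName,
                   od.UnitPrice * od.Quantity AS val,
                   ROW_NUMBER() OVER (PARTITION BY c.CategoryID
                                      ORDER BY od.UnitPrice * od.Quantity DESC) AS rn
            FROM [Order Details] od
            JOIN Products p ON od.ProductID = p.ProductID
            JOIN Categories c ON p.CategoryID = c.CategoryID
            JOIN Orders o ON od.OrderID = o.OrderID
            WHERE o.OrderDate >= '1997-01-01' AND o.OrderDate < '1998-01-01'
        )
        SELECT CategoryName, SUM(val) AS val
        FROM cte
        WHERE rn <= ?
        GROUP BY CategoryName
        ORDER BY CategoryName
    """
    return con.execute(sql, (n,)).fetchall()


ranked = top_n_sum_with_row_number(make_db())
print(ranked)
```

Unlike the correlated subquery, this version reads the joined rows once and filters on the rank afterwards; categories without any 1997 rows do not appear in the output at all.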

I had a very similar problem with an Access subquery whose entries are sorted by date. When I used the "Last" aggregate function, I found that it went through all the subqueries and retrieved the last row of data from the underlying Access table, rather than from the sorted query as intended. Although I could have rewritten the query to use the aggregate function in the first set of brackets (as suggested earlier), it was easier for me to save the query results as a table in the database, sorted in the order I wanted, and then use the "Last" aggregate function to get the values I wanted. I will run an update query in the future to keep the results current. Inefficient, but effective.
