Using DISTINCT with GROUP BY in SQL Server

Is there any purpose for using both DISTINCT and GROUP BY in SQL?

Below is a sample query:

SELECT DISTINCT Actors FROM MovieDetails GROUP BY Actors 

Does anyone know of any situations where you need to use both DISTINCT and GROUP BY to get any specific desired results?

(Usually DISTINCT and GROUP BY are used separately.)

sql-server group-by distinct
2 answers

Use DISTINCT to remove duplicate GROUPING SETS from a GROUP BY

In a completely contrived example using GROUPING SETS() (or the special grouping sets ROLLUP() or CUBE()), you can use DISTINCT to remove the duplicate values produced by the grouping sets:

 SELECT DISTINCT actors
 FROM (VALUES('a'), ('a'), ('b'), ('b')) t(actors)
 GROUP BY CUBE(actors, actors)

With DISTINCT :

 actors
 ------
 NULL
 a
 b

Without DISTINCT :

 actors
 ------
 a
 b
 NULL
 a
 b
 a
 b
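The duplicates come from how CUBE expands. Under the standard definition, CUBE(x, y) stands for the grouping sets (x, y), (x), (y) and (); with both arguments being actors, three of those four sets group by the same column. A sketch of the expanded form, assuming the dialect accepts a column repeated inside a grouping set (this has not been verified for every database):

```sql
SELECT actors
FROM (VALUES ('a'), ('a'), ('b'), ('b')) t(actors)
GROUP BY GROUPING SETS (
  (actors, actors),  -- groups by actors: rows a, b
  (actors),          -- same grouping again: rows a, b
  (actors),          -- and a third time: rows a, b
  ()                 -- grand total: one row with actors = NULL
);
```

Three copies of a and b plus the single NULL row account for the seven rows returned without DISTINCT.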

But why, apart from making an academic point, would you do that?

Use DISTINCT to find unique values of an aggregate function

In a less far-fetched example, you might be interested in the distinct values of an aggregate, for example: which different duplicate counts occur among the actors?

 SELECT DISTINCT COUNT(*)
 FROM (VALUES('a'), ('a'), ('b'), ('b')) t(actors)
 GROUP BY actors

Answer:

 count
 -----
 2
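Logically, DISTINCT is applied after GROUP BY and after the SELECT list has been evaluated, so the query behaves as if the grouped result were de-duplicated in a second step. A minimal sketch of that reading, where the aliases cnt and per_actor are made up for illustration:

```sql
SELECT DISTINCT cnt
FROM (
  -- Step 1: one row per actor, carrying that actor's row count
  SELECT COUNT(*) AS cnt
  FROM (VALUES ('a'), ('a'), ('b'), ('b')) t(actors)
  GROUP BY actors
) per_actor;
-- Step 2: the outer DISTINCT collapses the two identical counts into one row
```

Both actors appear twice, so the inner query yields two rows with the value 2, and the outer DISTINCT leaves a single row.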

Use DISTINCT to remove duplicates with more than one GROUP BY column

Another case, of course, is the following:

 SELECT DISTINCT actors, COUNT(*)
 FROM (VALUES('a', 1), ('a', 1), ('b', 1), ('b', 2)) t(actors, id)
 GROUP BY actors, id

With DISTINCT :

 actors  count
 -------------
 a       2
 b       1

Without DISTINCT :

 actors  count
 -------------
 a       2
 b       1
 b       1
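The repeated row appears because the query groups by a column that it does not select. Projecting id as well makes the three groups visible; this is only a sketch to show where the duplicate comes from, and the alias cnt is made up:

```sql
SELECT actors, id, COUNT(*) AS cnt
FROM (VALUES ('a', 1), ('a', 1), ('b', 1), ('b', 2)) t(actors, id)
GROUP BY actors, id;
-- Three groups (row order is not guaranteed):
--   a, 1, 2
--   b, 1, 1
--   b, 2, 1
```

Once id is dropped from the SELECT list, the two b groups both read (b, 1), which is the duplicate that DISTINCT removes.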

For more details, I have written some blog posts, e.g. about GROUPING SETS and how they influence the GROUP BY operation, or about the logical order of SQL operations (as opposed to the lexical order of operations).


Maybe not in the context you have, but you could use:

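 -- PERCENTILE_CONT takes a numeric literal between 0 and 1, not a column;
 -- 0.5 (the median) is an illustrative choice, not part of the original answer.
 SELECT DISTINCT col1,
        PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY col2) OVER (PARTITION BY col1),
        PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY col2) OVER (PARTITION BY col1, col3)
 FROM TableA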

You would use this to return different levels of aggregation in a single row. The use case is when a single grouping is not enough to produce all of the aggregates you need.
