Why is SQL Server 2008 ordering using GROUP BY and no ordering specified?

Question


I have a very strange problem for which I have not yet found an explanation. In SQL Server 2008, a query with GROUP BY returns my rows in sorted order even though I did not specify ORDER BY. Here is an example script that demonstrates the situation.

    CREATE TABLE #Values
    (
        FieldValue varchar(50)
    )

    ;WITH FieldValues AS
    (
        SELECT '4' FieldValue UNION ALL
        SELECT '3' FieldValue UNION ALL
        SELECT '2' FieldValue UNION ALL
        SELECT '1' FieldValue
    )
    INSERT INTO #Values ( FieldValue )
    SELECT FieldValue
    FROM FieldValues

    -- First SELECT demonstrating they are ordered DESCENDING
    SELECT FieldValue
    FROM #Values

    -- Second SELECT demonstrating they are ordered ASCENDING
    SELECT FieldValue
    FROM #Values
    GROUP BY FieldValue

    DROP TABLE #Values

The first SELECT will return

 4 3 2 1

The second SELECT will return

 1 2 3 4

Yet the MSDN documentation states: “The GROUP BY clause does not order the result set.”

+4

sql sql-server-2008 sql-order-by group-by

Nathan palmer Oct 14 '10 at 22:56


4 answers

If you do not specify an ORDER BY clause, SQL Server can return the results in any order.

Your particular query may happen to return ordered results, but that does not mean the results will always be sorted when you use a GROUP BY clause.

Maybe (just maybe) it is using a hash aggregate to compute the groups, and the hash table happens to be ordered 1, 2, 3, 4, so it returns the rows in hash order...

+6

Pablo santa cruz Oct 14 '10 at 22:59


This is a facet of database optimization. To group rows, the SQL engine usually has to work through the data ordered by the grouped column. Rather than spend time reordering afterwards, it leaves the result set in whatever order it was processed.

There are situations where this is not the case (especially if you use aggregate functions). Using ORDER BY explicitly overrides this behavior (and therefore adds a small overhead, which is not worth worrying about in all but the most extreme cases).

+1

Maltronic Oct 14 '10 at 23:07


You can also clean up your insert statements. SQL Server 2008 added a new feature for inserts. Example:

    INSERT INTO tbl (clmn) VALUES (1), (2), (3)

Each record no longer needs its own insert statement; the records are separated by commas.
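Applied to the temp table from the question, that syntax might look like the sketch below. It assumes SQL Server 2008 or later and reuses the question's #Values table and FieldValue column; it is only an illustration of the shorter insert, not a change to the behaviour being asked about.

```sql
-- Sketch: populate the question's temp table with one multi-row INSERT
-- instead of a CTE built from UNION ALL (requires SQL Server 2008+).
CREATE TABLE #Values
(
    FieldValue varchar(50)
);

INSERT INTO #Values (FieldValue)
VALUES ('4'), ('3'), ('2'), ('1');

SELECT FieldValue
FROM #Values;

DROP TABLE #Values;
```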

+1

Salizar marxx Oct 15 '10 at 4:07


Tadmas · Accepted Answer · 2010-10-14T23:04:23+0000

To answer this question, look at the query plans generated for both queries.
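One way to see those plans from a query window is SET SHOWPLAN_TEXT, sketched below under a few assumptions: the script is run from a tool that understands the GO batch separator (such as SSMS or sqlcmd), and the table is populated before the setting is switched on, because statements are compiled but not executed while it is on.

```sql
-- Build the question's table first; nothing is executed once SHOWPLAN_TEXT is ON.
CREATE TABLE #Values
(
    FieldValue varchar(50)
);

INSERT INTO #Values (FieldValue)
VALUES ('4'), ('3'), ('2'), ('1');
GO

-- SET SHOWPLAN_TEXT must be the only statement in its batch.
SET SHOWPLAN_TEXT ON;
GO

-- These return the estimated plan text instead of rows.
SELECT FieldValue
FROM #Values;

SELECT FieldValue
FROM #Values
GROUP BY FieldValue;
GO

SET SHOWPLAN_TEXT OFF;
GO

DROP TABLE #Values;
```

The graphical "Include Actual Execution Plan" option in SSMS shows the same operators if you prefer not to script it.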

The first SELECT is a simple table scan, which means that it produces rows in allocation order. Since this is a new table, that matches the order in which you inserted the records.

The second SELECT adds a GROUP BY, which SQL Server implements via a distinct sort, since the row count is so low. If you had more rows or added an aggregate to your SELECT, this operator may change.

For example, try:

    CREATE TABLE #Values
    (
        FieldValue varchar(50)
    )

    ;WITH FieldValues AS
    (
        SELECT '4' FieldValue UNION ALL
        SELECT '3' FieldValue UNION ALL
        SELECT '2' FieldValue UNION ALL
        SELECT '1' FieldValue
    )
    INSERT INTO #Values ( FieldValue )
    SELECT A.FieldValue
    FROM FieldValues A
    CROSS JOIN FieldValues B
    CROSS JOIN FieldValues C
    CROSS JOIN FieldValues D
    CROSS JOIN FieldValues E
    CROSS JOIN FieldValues F

    SELECT FieldValue
    FROM #Values
    GROUP BY FieldValue

    DROP TABLE #Values

Because of the number of rows, the plan switches to a hash aggregate, and now there is no sort in the query plan.
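If you want to compare the two aggregation strategies without generating thousands of rows, the HASH GROUP and ORDER GROUP query hints ask the optimizer for one or the other. The sketch below reuses the question's four-row table and is meant only for experimenting with plans. Neither hint is a promise about the order of the output, so no expected results are shown.

```sql
CREATE TABLE #Values
(
    FieldValue varchar(50)
);

INSERT INTO #Values (FieldValue)
VALUES ('4'), ('3'), ('2'), ('1');

-- Request a hash-based aggregate for the GROUP BY.
SELECT FieldValue
FROM #Values
GROUP BY FieldValue
OPTION (HASH GROUP);

-- Request a sort-based (ordered) aggregate for the GROUP BY.
SELECT FieldValue
FROM #Values
GROUP BY FieldValue
OPTION (ORDER GROUP);

DROP TABLE #Values;
```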

Without ORDER BY, SQL Server can return the results in any order, and the order they come back in is a side effect of how it thinks it can return the data most quickly.
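So if the order matters, state it. A minimal sketch using the question's table follows. Note that FieldValue is varchar, so the sort is a string comparison; that is an assumption that happens to be harmless for the single-character values used here.

```sql
CREATE TABLE #Values
(
    FieldValue varchar(50)
);

INSERT INTO #Values (FieldValue)
VALUES ('4'), ('3'), ('2'), ('1');

-- Ascending order, requested explicitly rather than inherited from the plan.
SELECT FieldValue
FROM #Values
GROUP BY FieldValue
ORDER BY FieldValue;

-- Descending order, for comparison.
SELECT FieldValue
FROM #Values
GROUP BY FieldValue
ORDER BY FieldValue DESC;

DROP TABLE #Values;
```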
