SQL String Return Code

I rarely used SQL rarely until recently, when I started using it daily. I notice that if the "order by" clause is not used:

  • When you select a part of the table, the returned rows are displayed in the same order as they are, if I select the entire table
  • The order of the rows returned by the selection from the join seems to be determined by the left most members of the join.

Is this behavior the standard thing that you can count on in the most common databases (MySql, Oracle, PostgreSQL, Sqlite, Sql Server)? (I donโ€™t even know if you can really count on it in sqlite). How strictly is it enforced if this is the case (for example, if you use a "group for", will individual groups have this order)?

+7
source share
5 answers

Section 20.2. Direct Selection Operator: Multiple Lines>, General Rules Subsection SQL-92 Specification :

  4) If an <order by clause> is not specified, then the ordering of
    the rows of Q is implementation-dependent.
+11
source

If the ORDER BY clause is not included in the query, the returned row order is undefined.

While some RDBMSes return rows in specific orders in some situations, even if the ORDER BY clause is omitted, this behavior should never be relied upon.

+13
source

If you want to order, turn on ORDER BY . If you did not specify ORDER BY , you tell SQL Server:

I don't care what order you return, just return the lines

Since you donโ€™t care, SQL Server will decide how to return the rows, which it considers the most efficient way right now (or the last time a plan was programmed for this particular query). Therefore, you should not rely on the behavior that you observe. It can change from one query launch to another, with data changes, statistics changes, index changes, service packs, cumulative updates, updates, etc. Etc. Etc.

+9
source

For PostgreSQL, if you omit the ORDER BY , you can execute the same query 100 times until the database is modified and get one pass in the middle in a different order than the others. In fact, each run can be in a different order.

One of the reasons why this can happen is that if the selected plan includes sequential scanning of the table heap and there is already a seqscan of this table heap, your query will start scanning it at any point where another scan is already on, to reduce need disk access.

As the other answers pointed out, if you want the data in a specific order, specify this order. PostgreSQL takes into account the requested order when choosing a plan and can use an index that provides data in that order if it works cheaper than getting rows in another way and then sorting them.

GROUP BY does not guarantee any guarantees; PostgreSQL can sort data for grouping or use a hash table and return rows in the order of the amount generated by the hash algorithm (i.e., rather random). And that can change from one run to another.

+3
source

I never ceased to amaze me when I was a database administrator that this SQL feature was so often considered fancy. Consider a simple program that works with a text file and produces some output. If the program never changes and the data never changes, you expect the output to never change.

Regarding this:

If the ORDER BY clause is not included in the query, the returned row order is undefined.

Not strictly true - on every DBMS I have ever worked with (Oracle, Informix, SQL Server, DB2, to name a few), the DISTINCT clause also has the same effect as ORDER BY, since the search for unique values โ€‹โ€‹includes sorting by definition .

EDIT (6/2/14):

Create a simple table

enter image description here

For DISTINCT and ORDER BY, both plans and costs are the same, because supposedly the same operation is performed

enter image description here

enter image description here

And it is not surprising that the effect is thus the same

enter image description here

-one
source

All Articles