Primary Key Sorting

Is a table intrinsically sorted by its primary key? If I have a table with a primary key on a BigInt identity column, can I trust that queries will always return the data sorted by key, or do I need to add an explicit "ORDER BY"? The difference in performance is significant.

+6
sql sql-server indexing
7 answers

Data is physically stored in the order of the clustered index, which is usually the primary key but does not have to be.

SQL does not guarantee any order without an ORDER BY clause. You should always specify an ORDER BY clause when you need data in a specific order. If the table is already sorted that way, the optimizer will not do any extra work, so there is no harm in having it.
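To make that concrete, here is a minimal sketch in T-SQL. The table dbo.Readings and its columns are made up for illustration and are not from the question; it assumes SQL Server's default of making the primary key the clustered index.

```sql
-- Hypothetical table; names are illustrative only.
-- In SQL Server, PRIMARY KEY creates a clustered index by default
-- when the table has no other clustered index.
CREATE TABLE dbo.Readings (
    Id      BIGINT IDENTITY(1,1) NOT NULL,
    ReadAt  DATETIME2 NOT NULL,
    Reading FLOAT NOT NULL,
    CONSTRAINT PK_Readings PRIMARY KEY (Id)
);

-- No ORDER BY: the engine is free to return rows in any order.
SELECT Id, ReadAt, Reading
FROM dbo.Readings;

-- ORDER BY on the clustered key: the order is guaranteed, and the
-- optimizer can normally read the index in order instead of sorting.
SELECT Id, ReadAt, Reading
FROM dbo.Readings
ORDER BY Id;
```

Comparing the execution plans of the two queries should show whether a Sort operator is actually being added.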

Without an ORDER BY clause, the RDBMS could return cached pages that match your query while it waits for the remaining records to be read from disk. In that case, even if the table has an index, the rows might not come back in index order. (Note that this is just an example. I don't know, or even think, that a real-world RDBMS does this, but it is acceptable behavior for a SQL implementation.)

EDIT

If you see a performance hit when sorting compared with not sorting, you are probably sorting on a column (or set of columns) that has no index, clustered or otherwise. Given that this is a time series, you may be sorting by time while the clustered index is on the bigint primary key. SQL Server does not know that both increase together, so it has to re-sort everything.

If the time column and the primary key column are related by order (one increases if and only if the other increases or stays the same), sort by the primary key instead. If they are not related this way, move the clustered index from the primary key to whatever column(s) you are sorting by.
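Moving the clustered index could look roughly like this. It is a sketch only: dbo.Samples, SampledAt, and the constraint and index names are hypothetical, and on a large table these statements rebuild the data, so they are not cheap. Dropping the primary key will also fail if foreign keys reference it.

```sql
-- Hypothetical starting point: a table clustered on its bigint key.
CREATE TABLE dbo.Samples (
    Id        BIGINT IDENTITY(1,1) NOT NULL,
    SampledAt DATETIME2 NOT NULL,
    Reading   FLOAT NOT NULL,
    CONSTRAINT PK_Samples PRIMARY KEY CLUSTERED (Id)
);

-- 1. Drop the clustered primary key (the table becomes a heap).
ALTER TABLE dbo.Samples DROP CONSTRAINT PK_Samples;

-- 2. Cluster the table on the column used for sorting.
CREATE CLUSTERED INDEX CIX_Samples_SampledAt
    ON dbo.Samples (SampledAt);

-- 3. Re-add the primary key as a nonclustered constraint.
ALTER TABLE dbo.Samples
    ADD CONSTRAINT PK_Samples PRIMARY KEY NONCLUSTERED (Id);

-- The time-ordered query can now follow the clustered index.
SELECT Id, SampledAt, Reading
FROM dbo.Samples
ORDER BY SampledAt;
```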

+11

Without an explicit ORDER BY, there is no default sort order. This is a very common question, so there is a canned answer:

Without ORDER BY, there is no default sort order.

Can you explain why "the difference in performance is significant"?

+2

A table is not "clustered" (that is, organized by PK) by default. You have the option of specifying it as such. So the default is "HEAP" (no particular order), and the option you are looking for is "CLUSTERED" (in SQL Server; in Oracle the equivalent is an IOT, an index-organized table).

  • There can be only one CLUSTERED index per table (which makes sense)
  • Use the PRIMARY KEY CLUSTERED syntax in the DDL
  • An ORDER BY on the PK should still be issued in your SELECTs; clustering makes the query run faster, because the optimizer's plan will know it does not need to sort when the clustered index already provides that order
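As a rough illustration of the points above, here is the DDL for a heap next to a table declared with PRIMARY KEY CLUSTERED. The table and column names are invented for the example.

```sql
-- Hypothetical tables; names are illustrative only.

-- No primary key and no clustered index: stored as a heap.
CREATE TABLE dbo.EventsHeap (
    EventId BIGINT IDENTITY(1,1) NOT NULL,
    Payload NVARCHAR(200) NULL
);

-- Same columns, organized by the primary key via a clustered index.
CREATE TABLE dbo.EventsClustered (
    EventId BIGINT IDENTITY(1,1) NOT NULL,
    Payload NVARCHAR(200) NULL,
    CONSTRAINT PK_EventsClustered PRIMARY KEY CLUSTERED (EventId)
);

-- The ORDER BY is still what guarantees the order; the clustered
-- index lets the optimizer satisfy it without a separate sort step.
SELECT EventId, Payload
FROM dbo.EventsClustered
ORDER BY EventId;
```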

The earlier poster is correct: SQL (and its theoretical foundation) specifically defines the result of a select as an unordered set of tuples.

Typically, SQL tries to stay in the logical domain and not make assumptions about the physical organization, location, etc. of the data. The CLUSTERED option lets us do that for practical, real-life situations.

+1

You must apply ORDER BY to guarantee the order. If you notice a difference in performance, then most likely your data was not coming back sorted without the ORDER BY in place. Otherwise SQL Server would be misbehaving by failing to realize that the data is already sorted. Adding ORDER BY to already sorted data should not incur a performance hit, because the RDBMS should be smart enough to recognize the existing order.

+1

In SQL Server: no. Rows are organized by the clustering key, which defaults to the primary key but does not have to be the same.

The primary key's main function is to uniquely identify each row in the table; it does not imply any (physical) sorting as such.
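If you are unsure which index is the clustering key on a given table, the sys.indexes catalog view will show it. A small sketch, with dbo.Samples standing in as a hypothetical table name:

```sql
-- Lists every index on the (hypothetical) table dbo.Samples.
-- type_desc is HEAP, CLUSTERED or NONCLUSTERED for ordinary tables;
-- is_primary_key marks the index that backs the PK constraint.
SELECT i.name,
       i.type_desc,
       i.is_primary_key
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID(N'dbo.Samples');
```

If the row with is_primary_key = 1 shows NONCLUSTERED, the primary key is not what determines the physical organization of the table.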

Not sure about other database systems.

Mark

0

This may be implementation-specific, but MySQL sorts by primary key by default. However, anytime you need a guarantee that the rows will be sorted in a certain way, you must add ORDER BY.

0

It is almost always sorted by the identity column. Strictly speaking, it is sorted by the clustered index, so it may not always be in identity order, but I have never seen it come back unsorted by the identity column on a SELECT *. What is the reason for not specifying the order? I do not understand why it would cause a difference in performance.

0