Is it good to have a nonclustered index that contains the primary key from the clustered index?

If you have a table with a clustered primary key index (int), is it redundant and bad to have one or more nonclustered indexes that include the primary key column as one of their columns?

+7
sql-server indexing clustered-index
4 answers

In fact, there can be good reasons to create a nonclustered index that is identical to the clustered one. The reason is that a clustered index carries the baggage of the row data, and this can lead to very low row density. That is, you may have only 2-3 rows per page because of large fields that are not part of the cluster key, even though the cluster key itself is, say, 20 bytes. A nonclustered index on exactly the same key(s), in the same order as the clustered index, will give a density of 200-300 keys per page. Many aggregate queries typical of an OLAP / BI workload can be answered more efficiently by the nonclustered index, simply because it cuts I/O by a factor of a hundred or so.
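The density argument can be checked with some back-of-envelope arithmetic. The sketch below is only an approximation: the 8 KB page and 96-byte page header are SQL Server's, but the 3000-byte row width and the 9 bytes of per-row overhead are assumptions picked for illustration, and `rows_per_page` is a hypothetical helper, not anything SQL Server exposes.

```python
# Back-of-envelope page density. Row sizes and per-row overhead are
# illustrative assumptions, not measurements from a real table.
PAGE_BYTES = 8192         # SQL Server data page size (8 KB)
PAGE_HEADER_BYTES = 96    # fixed page header
USABLE_BYTES = PAGE_BYTES - PAGE_HEADER_BYTES


def rows_per_page(row_bytes: int, overhead_bytes: int = 9) -> int:
    """Approximate number of rows that fit on one page.

    overhead_bytes is an assumed per-row cost (row header plus the
    2-byte slot array entry); the real figure varies with the schema.
    """
    return USABLE_BYTES // (row_bytes + overhead_bytes)


wide_row = 3000   # hypothetical full data row stored in the clustered index
key_only = 20     # the 20-byte cluster key from the answer

clustered = rows_per_page(wide_row)
nonclustered = rows_per_page(key_only)

print("rows per clustered index page:   ", clustered)
print("keys per nonclustered index page:", nonclustered)
print("approximate I/O reduction factor:", nonclustered // clustered)
```

With these assumed sizes, a scan over the key-only index touches roughly two orders of magnitude fewer pages than a scan over the clustered index, which is the whole point of the duplicate index.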

As for nonclustered indexes that contain only part of the cluster key, or the same keys in a different order, all bets are off, because they can obviously be useful for many queries.

So the answer to your question is: "It depends."

For a more precise answer, you would need to share the exact schema of your tables and the exact queries.

+14

Yes, it is usually unnecessary, because the clustered index columns are already added to every index entry in a nonclustered index.

Why? The clustered key value is what really allows SQL Server to "find" a row of data. It is the "pointer" to the actual data, so obviously it has to be stored in the nonclustered index. If you have looked up "Smith, John" and need to know more about this person, you have to go to the actual data, and that is done by including the clustering key value in the index node of the nonclustered index.

The clustering key value is already there, so it is typically redundant and unnecessary to add it again, explicitly, to your nonclustered index. It is bad in that it just wastes space without giving you any benefit.
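The lookup described above can be pictured with a toy model. This is a conceptual sketch only: plain dictionaries stand in for B-trees, and the table, column names and data (`PersonID`, `LastName`, `name_index` and so on) are all made up for illustration.

```python
# Toy model: dictionaries stand in for B-trees. All names and data here
# are hypothetical and exist only to illustrate the idea.

# Clustered index: clustering key (PersonID) -> the full data row.
clustered_index = {
    1: {"PersonID": 1, "LastName": "Smith", "FirstName": "John", "City": "Leeds"},
    2: {"PersonID": 2, "LastName": "Jones", "FirstName": "Mary", "City": "York"},
}

# Nonclustered index on (LastName, FirstName). Each entry carries the
# clustering key as its row locator, even though PersonID was never
# declared as one of the index columns.
name_index = {
    (row["LastName"], row["FirstName"]): key
    for key, row in clustered_index.items()
}


def lookup_by_name(last: str, first: str) -> dict:
    """Seek the nonclustered index, then follow the stored clustering key
    into the clustered index to fetch the rest of the row."""
    person_id = name_index[(last, first)]
    return clustered_index[person_id]


def person_id_for(last: str, first: str) -> int:
    """A query that needs only PersonID is answered from the nonclustered
    index alone, because the key is already stored in it."""
    return name_index[(last, first)]


print(lookup_by_name("Smith", "John")["City"])
print(person_id_for("Jones", "Mary"))
```

Since `PersonID` is available in every `name_index` entry without being listed as a key column, adding it explicitly would not give the lookup anything new.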

+4

I'm with Remus on this - the clustered index is not really an index; it describes how the data is organized on the pages. (In your case it is also the primary key, but the two do not have to be the same.) Nonclustered indexes include this row locator information, so yes, it is redundant.

But if the nonclustered index is covering and does not need to follow the bookmark to the data row, it can be used more efficiently than the clustered index, and the efficiency goes up as the ratio of the data row size to the nonclustered index entry size goes up.

I have found that if you have a good handle on the access paths in your query workload, you can sometimes use several selective, covering nonclustered indexes to eliminate the clustered index altogether - a heap table, a PK and a few good nonclustered indexes, and you're done.

+2

There is no 100% certain answer, but it is almost certainly redundant.

Other indexes are (usually) there to help with joins and sorting. Given that the primary key is already indexed, if the optimizer can join on it, it will use it.

If a different index is needed for a join or a sort, what additional help does the PK provide as part of that index? If the optimizer could not join on the PK before, it will not be able to now. And it will not help with sorting either.

0