T-SQL for finding redundant indexes

Does anyone know of a T-SQL script that can detect redundant indexes across the entire database? An example of a redundant index in a table would be:

Index 1: 'ColumnA', 'ColumnB', 'ColumnC'
Index 2: 'ColumnA', 'ColumnB'

Ignoring other considerations, such as column widths and covering indexes, index 2 would be redundant.

Thanks.

+6
sql-server tsql indexing
5 answers

There are situations where such indexes are not actually redundant. For example, say ColumnC is a huge field, but you sometimes need to retrieve it quickly. Your index 1 does not require a key lookup for:

 select ColumnC from YourTable where ColumnA = 12 

Index 2, on the other hand, is much smaller, so it can be read into memory for queries that require an index scan:

 select * from YourTable where ColumnA like '%hello%' 

Therefore, they are not redundant.
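To check how much smaller the narrow index really is, one option is to compare page counts per index. The following is only a sketch: `dbo.YourTable` reuses the table name from the example above, and the `dbo` schema is an assumption.

```sql
-- Sketch: compare the on-disk size of every index on one table.
-- dbo.YourTable is a placeholder; substitute your own schema and table.
SELECT
    i.name                  AS IndexName,
    i.type_desc             AS IndexType,
    SUM(ps.used_page_count) AS UsedPages,   -- summed over all partitions
    CONVERT(decimal(18, 2),
            SUM(ps.used_page_count) * 8 / 1024.0) AS SizeMB  -- 8 KB per page
FROM sys.indexes AS i
JOIN sys.dm_db_partition_stats AS ps
    ON ps.object_id = i.object_id
   AND ps.index_id  = i.index_id
WHERE i.object_id = OBJECT_ID(N'dbo.YourTable')
GROUP BY i.name, i.type_desc
ORDER BY UsedPages DESC;
```

If the two indexes turn out to be close in size, the scan argument for keeping the narrower one carries less weight.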

If you're not convinced by the argument above, you can find "redundant" indexes like this:

    ;with ind as (
        -- one row per index, with its key column ids as a comma-terminated list
        select  a.object_id
        ,       a.index_id
        ,       cast(col_list.list as varchar(max)) as list
        from    (
                select  distinct object_id, index_id
                from    sys.index_columns
                ) a
        cross apply
                (
                select  cast(column_id as varchar(16)) + ',' as [text()]
                from    sys.index_columns b
                where   a.object_id = b.object_id
                        and a.index_id = b.index_id
                        and b.key_ordinal > 0   -- key columns only, no includes
                order by b.key_ordinal          -- list them in key order
                for xml path(''), type
                ) col_list (list)
    )
    select  object_name(a.object_id) as TableName
    ,       asi.name as FatherIndex
    ,       bsi.name as RedundantIndex
    from    ind a
    join    sys.indexes asi
    on      asi.object_id = a.object_id
            and asi.index_id = a.index_id
    join    ind b
    on      a.object_id = b.object_id
            -- b's key list is a proper leading prefix of a's key list
            and len(a.list) > len(b.list)
            and left(a.list, len(b.list)) = b.list
    join    sys.indexes bsi
    on      bsi.object_id = b.object_id
            and bsi.index_id = b.index_id

Bring cake for your users if performance "unexpectedly" drops :-)

+11

Inspired by Paul Nielsen, I wrote this query to find and distinguish:

  • Duplicates (ignoring include order)
  • Redundant (different include columns)
  • Overlapping (different index columns)

It also reports their usage. (You may also want to take is_descending_key into account, but I didn't need it.)

    WITH IndexColumns AS (
        SELECT I.object_id AS TableObjectId,
               OBJECT_SCHEMA_NAME(I.object_id) + '.' + OBJECT_NAME(I.object_id) AS TableName,
               I.index_id AS IndexId,
               I.name AS IndexName
             , (IndexUsage.user_seeks + IndexUsage.user_scans + IndexUsage.user_lookups) AS IndexUsage
             , IndexUsage.user_updates AS IndexUpdates
             , (SELECT CASE is_included_column WHEN 1 THEN NULL ELSE column_id END AS [data()]
                FROM sys.index_columns AS IndexColumns
                WHERE IndexColumns.object_id = I.object_id
                  AND IndexColumns.index_id = I.index_id
                ORDER BY index_column_id, column_id
                FOR XML PATH('')
               ) AS ConcIndexColumnNrs
             , (SELECT CASE is_included_column WHEN 1 THEN NULL ELSE COL_NAME(I.object_id, column_id) END AS [data()]
                FROM sys.index_columns AS IndexColumns
                WHERE IndexColumns.object_id = I.object_id
                  AND IndexColumns.index_id = I.index_id
                ORDER BY index_column_id, column_id
                FOR XML PATH('')
               ) AS ConcIndexColumnNames
             , (SELECT CASE is_included_column WHEN 1 THEN column_id ELSE NULL END AS [data()]
                FROM sys.index_columns AS IndexColumns
                WHERE IndexColumns.object_id = I.object_id
                  AND IndexColumns.index_id = I.index_id
                ORDER BY column_id
                FOR XML PATH('')
               ) AS ConcIncludeColumnNrs
             , (SELECT CASE is_included_column WHEN 1 THEN COL_NAME(I.object_id, column_id) ELSE NULL END AS [data()]
                FROM sys.index_columns AS IndexColumns
                WHERE IndexColumns.object_id = I.object_id
                  AND IndexColumns.index_id = I.index_id
                ORDER BY column_id
                FOR XML PATH('')
               ) AS ConcIncludeColumnNames
        FROM sys.indexes AS I
        LEFT OUTER JOIN sys.dm_db_index_usage_stats AS IndexUsage
            ON IndexUsage.object_id = I.object_id
           AND IndexUsage.index_id = I.index_id
           AND IndexUsage.Database_id = db_id()
    )
    SELECT C1.TableName
         , C1.IndexName AS 'Index1'
         , C2.IndexName AS 'Index2'
         , CASE WHEN (C1.ConcIndexColumnNrs = C2.ConcIndexColumnNrs)
                 AND (C1.ConcIncludeColumnNrs = C2.ConcIncludeColumnNrs) THEN 'Exact duplicate'
                WHEN (C1.ConcIndexColumnNrs = C2.ConcIndexColumnNrs) THEN 'Different includes'
                ELSE 'Overlapping columns'
           END
    --   , C1.ConcIndexColumnNrs
    --   , C2.ConcIndexColumnNrs
         , C1.ConcIndexColumnNames
         , C2.ConcIndexColumnNames
    --   , C1.ConcIncludeColumnNrs
    --   , C2.ConcIncludeColumnNrs
         , C1.ConcIncludeColumnNames
         , C2.ConcIncludeColumnNames
         , C1.IndexUsage
         , C2.IndexUsage
         , C1.IndexUpdates
         , C2.IndexUpdates
         , 'DROP INDEX ' + C2.IndexName + ' ON ' + C2.TableName AS Drop2
         , 'DROP INDEX ' + C1.IndexName + ' ON ' + C1.TableName AS Drop1
    FROM IndexColumns AS C1
    INNER JOIN IndexColumns AS C2
        ON (C1.TableObjectId = C2.TableObjectId)
       AND (
            -- exact: show lower IndexId as 1
               (C1.IndexId < C2.IndexId
                AND C1.ConcIndexColumnNrs = C2.ConcIndexColumnNrs
                AND C1.ConcIncludeColumnNrs = C2.ConcIncludeColumnNrs)
            -- different includes: show longer include as 1
            OR (C1.ConcIndexColumnNrs = C2.ConcIndexColumnNrs
                AND LEN(C1.ConcIncludeColumnNrs) > LEN(C2.ConcIncludeColumnNrs))
            -- overlapping: show longer index as 1
            OR (C1.IndexId <> C2.IndexId
                AND C1.ConcIndexColumnNrs <> C2.ConcIndexColumnNrs
                AND C1.ConcIndexColumnNrs like C2.ConcIndexColumnNrs + ' %')
           )
    ORDER BY C1.TableName, C1.ConcIndexColumnNrs

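For cases where sort direction does matter, the is_descending_key flag mentioned above could be folded into a per-index key signature. This is a hedged sketch, not part of the query above: it assumes SQL Server 2017 or later for STRING_AGG, and the column alias KeySignature is made up for illustration.

```sql
-- Sketch: one row per index, key columns in key order, descending keys marked.
-- Assumes SQL Server 2017+ (STRING_AGG). KeySignature is an illustrative name.
SELECT
    OBJECT_SCHEMA_NAME(ic.object_id) + '.' + OBJECT_NAME(ic.object_id) AS TableName,
    i.name AS IndexName,
    STRING_AGG(
        CAST(COL_NAME(ic.object_id, ic.column_id) AS nvarchar(max))
            + CASE WHEN ic.is_descending_key = 1 THEN ' DESC' ELSE '' END,
        ', '
    ) WITHIN GROUP (ORDER BY ic.key_ordinal) AS KeySignature
FROM sys.index_columns AS ic
JOIN sys.indexes AS i
    ON i.object_id = ic.object_id
   AND i.index_id  = ic.index_id
WHERE ic.key_ordinal > 0                                 -- key columns only
  AND OBJECTPROPERTY(ic.object_id, 'IsUserTable') = 1    -- skip system tables
GROUP BY ic.object_id, i.name;
```

Two indexes on the same table with identical KeySignature values have the same keys in the same order and direction; comparing signatures this way is stricter than comparing column ids alone.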
+1

I created the following query, which gives me a lot of useful information for identifying duplicate and near-duplicate indexes. It also reports how many pages each index occupies, which lets me give higher priority to larger indexes. It shows which columns are indexed and which are included, so I can see whether two indexes are almost identical apart from small differences in their included columns.

    WITH IndexSummary AS (
        SELECT DISTINCT
            sys.objects.name AS [Table Name],
            sys.indexes.name AS [Index Name],
            -- key columns, listed alphabetically by name (not in key order)
            SUBSTRING((SELECT ', ' + sys.columns.name AS [text()]
                       FROM sys.columns
                       INNER JOIN sys.index_columns
                           ON sys.index_columns.column_id = sys.columns.column_id
                          AND sys.index_columns.object_id = sys.columns.object_id
                       WHERE sys.index_columns.index_id = sys.indexes.index_id
                         AND sys.index_columns.object_id = sys.indexes.object_id
                         AND sys.index_columns.is_included_column = 0
                       ORDER BY sys.columns.name
                       FOR XML PATH('')), 2, 10000) AS [Indexed Column Names],
            -- included columns, listed alphabetically by name
            ISNULL(SUBSTRING((SELECT ', ' + sys.columns.name AS [text()]
                       FROM sys.columns
                       INNER JOIN sys.index_columns
                           ON sys.index_columns.column_id = sys.columns.column_id
                          AND sys.index_columns.object_id = sys.columns.object_id
                       WHERE sys.index_columns.index_id = sys.indexes.index_id
                         AND sys.index_columns.object_id = sys.indexes.object_id
                         AND sys.index_columns.is_included_column = 1
                       ORDER BY sys.columns.name
                       FOR XML PATH('')), 2, 10000), '') AS [Included Column Names],
            sys.indexes.index_id,
            sys.indexes.object_id
        FROM sys.indexes
        INNER JOIN sys.index_columns
            ON sys.indexes.index_id = sys.index_columns.index_id
           AND sys.indexes.object_id = sys.index_columns.object_id
        INNER JOIN sys.objects
            ON sys.objects.object_id = sys.indexes.object_id
        WHERE sys.objects.type = 'U'
    )
    SELECT IndexSummary.[Table Name],
           IndexSummary.[Index Name],
           IndexSummary.[Indexed Column Names],
           IndexSummary.[Included Column Names],
           PhysicalStats.page_count AS [Page Count],
           CONVERT(decimal(18,2), PhysicalStats.page_count * 8 / 1024.0) AS [Size (MB)],
           CONVERT(decimal(18,2), PhysicalStats.avg_fragmentation_in_percent) AS [Fragment %]
    FROM IndexSummary
    INNER JOIN sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL, NULL, NULL) AS PhysicalStats
        ON PhysicalStats.index_id = IndexSummary.index_id
       AND PhysicalStats.object_id = IndexSummary.object_id
    -- keep only indexes that share their key column list with another index on the same table
    WHERE (SELECT COUNT(*) AS Computed
           FROM IndexSummary Summary2
           WHERE Summary2.[Table Name] = IndexSummary.[Table Name]
             AND Summary2.[Indexed Column Names] = IndexSummary.[Indexed Column Names]) > 1
    ORDER BY [Table Name], [Index Name], [Indexed Column Names], [Included Column Names]

The query results are as follows:

 Table Name  Index   Indexed Cols  Included Cols  Pages  Size (MB)  Frag %
 My_Table    Indx_1  Col1          Col2, Col3     123    0.96       8.94
 My_Table    Indx_2  Col1          Col2, Col3     123    0.96       8.94

Full description

For a complete explanation, see Identifying Duplicate or Redundant Indexes in SQL Server.

+1

Try the script below to show unused indexes. Hope this helps.

    /****************************************************************
    Description: Script to show Unused Indexes using DMVs
    ****************************************************************/
    SELECT TOP 100
          o.name AS ObjectName
        , i.name AS IndexName
        , i.index_id AS IndexID
        , dm_ius.user_seeks AS UserSeek
        , dm_ius.user_scans AS UserScans
        , dm_ius.user_lookups AS UserLookups
        , dm_ius.user_updates AS UserUpdates
        , p.TableRows
        , 'DROP INDEX ' + QUOTENAME(i.name) + ' ON ' + QUOTENAME(s.name) + '.'
              + QUOTENAME(OBJECT_NAME(dm_ius.object_id)) AS 'drop statement'
    FROM sys.dm_db_index_usage_stats dm_ius
    INNER JOIN sys.indexes i
        ON i.index_id = dm_ius.index_id
       AND dm_ius.object_id = i.object_id
    INNER JOIN sys.objects o
        ON dm_ius.object_id = o.object_id
    INNER JOIN sys.schemas s
        ON o.schema_id = s.schema_id
    -- row count per index, summed over partitions
    INNER JOIN (SELECT SUM(p.rows) TableRows, p.index_id, p.object_id
                FROM sys.partitions p
                GROUP BY p.index_id, p.object_id) p
        ON p.index_id = dm_ius.index_id
       AND dm_ius.object_id = p.object_id
    WHERE OBJECTPROPERTY(dm_ius.object_id, 'IsUserTable') = 1
      AND dm_ius.database_id = DB_ID()
      AND i.type_desc = 'NONCLUSTERED'  -- upper case, so it also matches under case-sensitive collations
      AND i.is_primary_key = 0
      AND i.is_unique_constraint = 0
    -- least-read indexes first
    ORDER BY (dm_ius.user_seeks + dm_ius.user_scans + dm_ius.user_lookups) ASC
    GO

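Two caveats apply to any script built on sys.dm_db_index_usage_stats: its counters are reset when the SQL Server service restarts, and an index gets a row in that view only once it has been used. A script that starts from the DMV with inner joins therefore never lists indexes that have not been touched at all. The variant below is a sketch that starts from sys.indexes instead; the column alias CountersSince is made up for illustration, and reading sys.dm_os_sys_info assumes you have server-level view permission.

```sql
-- Sketch: nonclustered indexes with no recorded reads since the last restart,
-- including indexes that have no row in the usage DMV at all.
SELECT
    QUOTENAME(s.name) + '.' + QUOTENAME(o.name) AS TableName,
    i.name                                      AS IndexName,
    ISNULL(ius.user_updates, 0)                 AS UserUpdates,
    -- when the counters started accumulating (illustrative alias)
    (SELECT sqlserver_start_time FROM sys.dm_os_sys_info) AS CountersSince
FROM sys.indexes AS i
JOIN sys.objects AS o ON o.object_id = i.object_id
JOIN sys.schemas AS s ON s.schema_id = o.schema_id
LEFT JOIN sys.dm_db_index_usage_stats AS ius
    ON ius.object_id   = i.object_id
   AND ius.index_id    = i.index_id
   AND ius.database_id = DB_ID()
WHERE o.type = 'U'
  AND i.type_desc = 'NONCLUSTERED'
  AND i.is_primary_key = 0
  AND i.is_unique_constraint = 0
  -- NULL (no DMV row) and 0 reads are both treated as unused
  AND ISNULL(ius.user_seeks + ius.user_scans + ius.user_lookups, 0) = 0
ORDER BY TableName, IndexName;
```

If CountersSince is recent, an index that looks unused may simply not have been needed yet, for example one that only serves a monthly report.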
+1

I was just reading the MSDN blogs, noticed a script that does this, and remembered this question.

I haven't bothered to test it side by side with Andomar's to see whether either has a particular advantage over the other.

One change I would probably make to both is to take the sizes of the two indexes into account when assessing redundancy.

Edit:

Also see Kimberly Tripp's post on Removing Duplicate Indexes.

0
