SQL Server Duplicate and Overlapping Index Detection Script

Duplicate and overlapping indexes waste storage, slow down every write, and split optimizer statistics across redundant copies. This T-SQL script compares each non-heap index in the current database on its first sixteen key columns and surfaces any pair of indexes that overlap, giving the DBA a short list of cleanup candidates.

Purpose and Overview

sp_BlitzIndex does deep usage and statistics analysis across every index. sp_SQLskills_finddupes factors in INCLUDE columns and sort direction. The script below fills a narrower role between them: a portable, dependency-free first pass for the single most common form of index redundancy — two non-heap indexes on the same table that share a leading key column. The output is a candidate list ready for human triage, not a drop list to apply unattended.

The query enumerates every non-heap index in the current database via sys.indexes joined to sys.objects and sys.schemas. For each index it calls the INDEX_COL system function sixteen times to capture the names of the first sixteen key columns (the historical SQL Server limit on non-clustered key columns prior to SQL Server 2016). A self-join on the resulting common table expression compares each index against every other index on the same table, using a NULL-tolerant predicate on columns 2–16 so that prefix overlaps — narrower indexes whose keys are a leading subset of a wider index — surface alongside exact duplicates.

Code Breakdown

The complete script is shown below, followed by an explanation of each section. Run it in the database whose indexes you want to audit.

 1;WITH MyDuplicate AS (
 2SELECT
 3  Sch.[name] AS SchemaName,
 4  Obj.[name] AS TableName,
 5  Idx.[name] AS IndexName,
 6  INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 1)  AS Col1,
 7  INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 2)  AS Col2,
 8  INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 3)  AS Col3,
 9  INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 4)  AS Col4,
10  INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 5)  AS Col5,
11  INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 6)  AS Col6,
12  INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 7)  AS Col7,
13  INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 8)  AS Col8,
14  INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 9)  AS Col9,
15  INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 10) AS Col10,
16  INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 11) AS Col11,
17  INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 12) AS Col12,
18  INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 13) AS Col13,
19  INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 14) AS Col14,
20  INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 15) AS Col15,
21  INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 16) AS Col16
22FROM sys.indexes Idx
23INNER JOIN sys.objects Obj ON Idx.[object_id]  = Obj.[object_id]
24INNER JOIN sys.schemas Sch ON Sch.[schema_id] = Obj.[schema_id]
25WHERE index_id > 0
26)
27SELECT
28  MD1.SchemaName, MD1.TableName, MD1.IndexName,
29  MD2.IndexName AS OverLappingIndex,
30  MD1.Col1, MD1.Col2, MD1.Col3, MD1.Col4,
31  MD1.Col5, MD1.Col6, MD1.Col7, MD1.Col8,
32  MD1.Col9, MD1.Col10, MD1.Col11, MD1.Col12,
33  MD1.Col13, MD1.Col14, MD1.Col15, MD1.Col16
34FROM MyDuplicate MD1
35INNER JOIN MyDuplicate MD2 ON MD1.TableName = MD2.TableName
36AND MD1.IndexName <> MD2.IndexName
37AND MD1.Col1 = MD2.Col1
38AND (MD1.Col2  IS NULL OR MD2.Col2  IS NULL OR MD1.Col2  = MD2.Col2)
39AND (MD1.Col3  IS NULL OR MD2.Col3  IS NULL OR MD1.Col3  = MD2.Col3)
40AND (MD1.Col4  IS NULL OR MD2.Col4  IS NULL OR MD1.Col4  = MD2.Col4)
41AND (MD1.Col5  IS NULL OR MD2.Col5  IS NULL OR MD1.Col5  = MD2.Col5)
42AND (MD1.Col6  IS NULL OR MD2.Col6  IS NULL OR MD1.Col6  = MD2.Col6)
43AND (MD1.Col7  IS NULL OR MD2.Col7  IS NULL OR MD1.Col7  = MD2.Col7)
44AND (MD1.Col8  IS NULL OR MD2.Col8  IS NULL OR MD1.Col8  = MD2.Col8)
45AND (MD1.Col9  IS NULL OR MD2.Col9  IS NULL OR MD1.Col9  = MD2.Col9)
46AND (MD1.Col10 IS NULL OR MD2.Col10 IS NULL OR MD1.Col10 = MD2.Col10)
47AND (MD1.Col11 IS NULL OR MD2.Col11 IS NULL OR MD1.Col11 = MD2.Col11)
48AND (MD1.Col12 IS NULL OR MD2.Col12 IS NULL OR MD1.Col12 = MD2.Col12)
49AND (MD1.Col13 IS NULL OR MD2.Col13 IS NULL OR MD1.Col13 = MD2.Col13)
50AND (MD1.Col14 IS NULL OR MD2.Col14 IS NULL OR MD1.Col14 = MD2.Col14)
51AND (MD1.Col15 IS NULL OR MD2.Col15 IS NULL OR MD1.Col15 = MD2.Col15)
52AND (MD1.Col16 IS NULL OR MD2.Col16 IS NULL OR MD1.Col16 = MD2.Col16)
53ORDER BY MD1.SchemaName, MD1.TableName, MD1.IndexName;

CTE: MyDuplicate — One Row per Index, Key Columns Spread Out

The MyDuplicate CTE produces one row per non-heap index. The join from sys.indexes to sys.objects resolves the parent table; the join to sys.schemas resolves the schema name. The filter WHERE index_id > 0 is what excludes heaps: sys.indexes returns one row per index, and the row with index_id = 0 is the heap (the table itself when no clustered index exists). Clustered indexes have index_id = 1 and non-clustered indexes have index_id values from 2 upward, so index_id > 0 keeps both.

The sixteen INDEX_COL calls each return the name of the Nth key column of the index, or NULL if the index has fewer than N key columns. Passing the two-part table name 'schema.table' is the documented call shape for INDEX_COL.

Self-Join on TableName — Two Indexes on the Same Table

The main SELECT joins the CTE to itself with the alias pair MD1 and MD2. The first predicate MD1.TableName = MD2.TableName restricts each comparison to indexes on the same table — comparing indexes across tables would be meaningless. The predicate MD1.IndexName <> MD2.IndexName prevents the script from matching every index against itself. Because the self-join is symmetric, every duplicate pair appears twice in the output (once as A/B and once as B/A) — a feature, since the asymmetric matching with NULL on either side means each direction can detect different overlaps.

The NULL-Tolerant Predicate — Why Three OR Clauses per Column

The key insight is the predicate structure on Col2 through Col16:

1(MD1.Col2 IS NULL OR MD2.Col2 IS NULL OR MD1.Col2 = MD2.Col2)

This is true when either index has fewer than N key columns, or both have the same Nth column. That means an index IX_A (Col1, Col2) overlaps with IX_B (Col1) because IX_B.Col2 is NULL — IX_B is a prefix of IX_A and therefore redundant for any query that could use IX_B. The leading column Col1 is required to match exactly (no NULL tolerance) because two indexes that don't share a leading column are functionally unrelated for seek purposes.

Key Benefits and Use Cases

  • Surfaces redundant single-column indexes that duplicate the leading column of a wider composite index — the most common form of true duplicate
  • Flags exact duplicates where two indexes have the same key columns in the same order, typically created by separate scripts that didn't check for an existing index
  • Identifies prefix overlaps where one index is the leading prefix of another, making the narrower index redundant for query-seek purposes
  • One result set, no external tools — pure T-SQL against catalog views, runs on every supported SQL Server edition
  • Schema-aware output — the result set includes the schema name so multi-schema databases are handled correctly
  • Quick first pass before a deeper review with sp_BlitzIndex or sp_SQLskills_finddupes for include-column and filter-definition analysis

Performance Considerations

  • Lightweight on catalog views: the script reads only sys.indexes, sys.objects, sys.schemas, and calls INDEX_COL — all metadata operations, no user data scanned. Runs in seconds even on databases with thousands of indexes.
  • Key-column-only comparison: this script compares only the key columns of each index. It does not look at included columns, filter definitions, ASC/DESC sort directions, or whether one index is unique. Two indexes with identical key columns but different INCLUDE lists are flagged as overlapping by this script — review the INCLUDE list manually before dropping.
  • Sixteen-column cap: the INDEX_COL calls go only to column 16. Indexes with seventeen or more key columns (rare, but legal in modern SQL Server) won't have their tail compared. In practice, sixteen is enough for almost every real-world index.
  • Self-join cost grows with index count: the self-join is O(n²) over the number of indexes per table, but n per table is usually small (<10) so this is not a practical concern.
  • Symmetric output: every duplicate pair appears twice in the result set. Add AND MD1.IndexName < MD2.IndexName to suppress the mirror row if you want a one-line-per-pair view.

Practical Tips

  • Don't blindly drop anything this script flags. Look up index usage in sys.dm_db_index_usage_stats to confirm both candidates are actually being used (or unused). A redundant-looking index might be the one the optimizer prefers due to better statistics or smaller leaf-level pages.
  • Review INCLUDE columns before dropping: an index that looks like a key-column duplicate may carry distinct INCLUDE columns serving a separate covering query. Pull sys.index_columns filtered on is_included_column = 1 for both indexes before deciding.
  • Save the output as a starting point for an index consolidation review. Pair each row with the index's last-used timestamp, write activity, and any documented owner before dropping.
  • Run it monthly on busy OLTP databases. Index sprawl is a slow accumulation; catching duplicates as they appear is cheaper than a once-a-year cleanup.
  • For include-column and ASC/DESC aware analysis, follow up with sp_SQLskills_finddupes or sp_BlitzIndex — both cover dimensions this script intentionally omits.

Conclusion

This duplicate-index detection query is a fast, dependency-free first pass that identifies the highest-frequency redundancy pattern: two non-heap indexes on the same table that share a leading key column set. Running it as a regular monthly check keeps index sprawl visible and shrinks the gap between when a redundant index is created and when it is removed. Use the output as a candidate list, validate each pair with sys.dm_db_index_usage_stats and a quick INCLUDE-column comparison, then drop the loser.

References

Posts in this series