Apply Row and Page Compression to Tables and Indexes

Storage budgets shrink faster than databases grow. Uncompressed tables and indexes often carry two to four times the physical page count that row or page compression would require — a gap that widens with every batch load or continuous INSERT. sys.dm_db_index_physical_stats joined to sys.partitions surfaces every uncompressed index and heap in a database in one pass, and the script below converts that inventory into ready-to-execute ALTER INDEX REBUILD and ALTER TABLE REBUILD statements sorted by object size so the highest-impact targets come first.

Purpose and Overview

Row and page compression were introduced in SQL Server 2008, initially restricted to Enterprise and Developer editions. SQL Server 2016 SP1 removed that edition gate — both compression types became available in Standard, Web, and Express editions from that release forward. That licensing change made compression a practical maintenance tool for any production workload, not just high-end deployments, which is why identifying and applying compression now belongs on the same routine maintenance checklist as index rebuilds and statistics updates.

The central challenge is knowing which objects to compress. A production database may carry hundreds of indexes across dozens of tables, a mix of heap tables and clustered indexes, and varying partition counts — none of which is visible in a single system view without joining several catalog objects. sys.dm_db_index_physical_stats bridges that gap: it reports physical storage statistics for every index and heap in scope, including the page count that drives the compression ROI calculation. Joining it to sys.partitions adds the data_compression_desc column that shows the current compression state — NONE, ROW, PAGE, or COLUMNSTORE — making it straightforward to filter out objects already compressed.

The script produces a diagnostic result set showing the current state of every uncompressed object (page count, fragmentation percentage, object name, and current compression), alongside a generated T-SQL column containing the exact ALTER statement needed to apply ROW compression. A DBA reviewing the output can switch the target compression type from ROW to PAGE on any row before executing, or pipe the generated statements into a cursor-driven loop for batch processing.

sys.dm_db_index_physical_stats accepts five parameters: database ID, object ID, index ID, partition number, and scan mode. Passing NULL for parameters two through four returns every object in the current database context across all indexes and all partitions. The scan mode — LIMITED, SAMPLED, or DETAILED — controls how much physical I/O the function performs. LIMITED reads only allocation metadata and runs in seconds on even large databases, making it the right choice for generating a compression candidate list without scheduling a maintenance window.

Code Breakdown

The script runs in the context of the target database, queries sys.dm_db_index_physical_stats in LIMITED mode, and joins to sys.indexes, sys.objects, sys.schemas, and sys.partitions to produce both a diagnostic view and ready-to-run ALTER statements.

 1USE [YourDatabase];
 2GO
 3
 4SELECT
 5    s.name                                               AS schema_name,
 6    o.name                                               AS table_name,
 7    CASE i.index_id
 8        WHEN 0 THEN 'HEAP'
 9        ELSE        i.name
10    END                                                  AS index_name,
11    i.index_id,
12    i.type_desc                                          AS index_type,
13    p.partition_number,
14    ps.page_count,
15    CAST(ps.avg_fragmentation_in_percent AS DECIMAL(5,1))
16                                                         AS frag_pct,
17    p.data_compression_desc                              AS current_compression,
18    CASE
19        WHEN i.index_id = 0
20            -- Heap: uses ALTER TABLE syntax
21            THEN 'ALTER TABLE ['  + s.name + '].['  + o.name + ']'
22                 + ' REBUILD WITH (DATA_COMPRESSION = ROW);'
23        ELSE
24            -- Clustered and non-clustered indexes
25            'ALTER INDEX ['  + i.name + '] ON ['  + s.name + '].['  + o.name + ']'
26            + ' REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = ROW);'
27    END                                                  AS compression_command
28FROM sys.dm_db_index_physical_stats(
29         DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ps
30INNER JOIN sys.indexes AS i
31    ON  ps.object_id = i.object_id
32    AND ps.index_id  = i.index_id
33INNER JOIN sys.objects AS o
34    ON  i.object_id  = o.object_id
35INNER JOIN sys.schemas AS s
36    ON  o.schema_id  = s.schema_id
37INNER JOIN sys.partitions AS p
38    ON  ps.object_id        = p.object_id
39    AND ps.index_id         = p.index_id
40    AND ps.partition_number = p.partition_number
41WHERE o.type_desc             = 'USER_TABLE'
42  AND ps.page_count           > 100
43  AND p.data_compression_desc = 'NONE'
44ORDER BY ps.page_count DESC;

Calling sys.dm_db_index_physical_stats

sys.dm_db_index_physical_stats is a table-valued function, not a view, so it requires explicit parentheses and arguments. Passing DB_ID() as the first argument scopes output to the current database, which is the intended behavior after the USE statement. Arguments two through four — object ID, index ID, partition number — accept NULL to mean "all," producing one row per index per partition across the entire database. The fifth argument is the scan mode, set to 'LIMITED' here. In LIMITED mode the function reads only the Index Allocation Map (IAM) pages to derive page counts and extent counts; it does not scan data pages. This makes it fast enough for an inventory query on a busy production instance. The tradeoff is that avg_fragmentation_in_percent is not populated in LIMITED mode — change the mode to 'SAMPLED' to get a one-percent data-page sample that populates fragmentation alongside the compression state, at some additional I/O cost.

Joining to sys.indexes, sys.objects, and sys.schemas

sys.indexes resolves the (object_id, index_id) pair from the DMV to an index name and type description. An index_id of 0 represents a heap — a table with no clustered index — and its name column is NULL, so the CASE expression substitutes the literal 'HEAP' to keep output readable. sys.objects maps object_id to the table name and its type_desc, filtered to 'USER_TABLE' to exclude system tables, internal tables, queue objects, and views. sys.schemas resolves schema_id to the schema name, which both the diagnostic columns and the generated ALTER statement require.

Joining to sys.partitions for current compression state

sys.partitions stores one row per partition per index and is the only system object that exposes the current compression state as a column (data_compression_desc). The three-part join key — (object_id, index_id, partition_number) — uniquely identifies each partition instance-wide. Filtering on data_compression_desc = 'NONE' restricts the result to objects not yet compressed, preventing the script from generating redundant rebuild statements on indexes already carrying ROW or PAGE compression. On non-partitioned tables partition_number is always 1, so the join degenerates cleanly to a one-to-one match.

Generating ALTER statements for heaps and indexes

The CASE expression on i.index_id handles the syntactic difference between compressing a heap and compressing a named index. Heaps require ALTER TABLE [schema].[table] REBUILD WITH (DATA_COMPRESSION = ROW) syntax — there is no index name to reference, and the SQL Server engine internally rebuilds the IAM chain and data pages for the heap. Named indexes use ALTER INDEX [name] ON [schema].[table] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = ROW). Including PARTITION = ALL applies compression to every partition of a partitioned index in a single statement; for non-partitioned indexes it is a no-op and causes no error. The generated compression type defaults to ROW throughout; changing it to PAGE before execution is a single text substitution in the output column.

Filtering by page count

The ps.page_count > 100 predicate excludes very small objects. Compressing a table smaller than roughly 800 KB (100 eight-kilobyte pages) yields negligible space savings while still triggering a full rebuild and its associated transaction log activity. The threshold is configurable — raise it to 1000 or 10000 to restrict a first compression pass to the largest tables only, which is appropriate when the maintenance window is short.

Key Benefits and Use Cases

  • Single-pass database-wide inventory — one query identifies every uncompressed user-table index and heap without iterating object-by-object or relying on sp_spaceused loops.
  • Ready-to-execute output — the generated ALTER statements require no manual assembly; a DBA reviews the result set, optionally adjusts the compression type per row, and executes directly.
  • Size-ordered prioritizationORDER BY ps.page_count DESC surfaces the highest-storage-impact candidates first, allowing a partial pass within a limited maintenance window to deliver the most savings.
  • Heap and index coverage in one result — the CASE expression produces the correct syntax for both heap rebuilds and index rebuilds, covering every compressible object type without a separate query.
  • Current-state transparencydata_compression_desc confirms what compression is already applied, preventing double-compression on objects that were rebuilt in a prior maintenance cycle.
  • Edition-agnostic from SQL Server 2016 SP1 onward — the generated statements are legal on Standard, Web, Express, and Enterprise editions, making the script applicable across mixed-edition environments.

Performance Considerations

  • LIMITED mode does not populate fragmentation percentage: avg_fragmentation_in_percent returns NULL when scan mode is 'LIMITED'. Switch to 'SAMPLED' when the goal is to combine a compression candidate scan with a fragmentation-based rebuild-or-reorganize decision in a single pass — at the cost of a one-percent data-page read across the database.
  • Rebuilding simultaneously compresses and defragments: ALTER INDEX REBUILD WITH (DATA_COMPRESSION = ROW) applies compression and rebuilds the index in one operation. Running a separate fragmentation-driven rebuild before or after on the same index is redundant and doubles the log and CPU cost.
  • Transaction log space during rebuild: each ALTER INDEX REBUILD or ALTER TABLE REBUILD is fully logged by default. On very large tables the transaction log must have capacity for the rebuild operation. Use ONLINE = ON where the edition permits to keep the table accessible during the rebuild; on Standard edition, rebuilds acquire a brief table lock at the start and end.
  • Partitioned tables produce multiple rows: the result set returns one row per partition for partitioned indexes. Review whether applying PARTITION = ALL in one statement is appropriate, or whether rolling individual-partition rebuilds — REBUILD PARTITION = N — are needed to stage the work across multiple maintenance windows.
  • sys.dm_db_index_physical_stats in DETAILED mode carries significant I/O cost: if scan mode is ever changed from LIMITED to DETAILED for a full fragmentation scan, scope the call to a single object by passing a specific object_id argument rather than NULL, particularly on large databases.

Practical Tips

  • Run sp_estimate_data_compression_savings on the top candidates from this script before committing to a rebuild — it samples existing pages and projects space savings for both ROW and PAGE modes, enabling an informed choice of compression type before incurring rebuild cost.
  • Schedule the script as a SQL Server Agent job that writes results to a monitoring table on a regular interval; tables added after the initial compression pass will surface in subsequent runs without requiring manual rediscovery.
  • Prefer PAGE compression over ROW compression for read-heavy tables with low-cardinality, repetitive columns — fact tables, wide audit logs, and denormalized staging tables typically see significantly higher compression ratios under PAGE than under ROW.
  • Apply compression to all non-clustered indexes on a table in the same maintenance window as the clustered index or heap rebuild; compressing them separately doubles the downtime and log activity for the same net result.
  • After completing a compression pass, verify the results by querying sys.partitions directly — SELECT OBJECT_NAME(object_id), index_id, data_compression_desc FROM sys.partitions WHERE data_compression_desc <> 'NONE' — rather than re-running the full sys.dm_db_index_physical_stats join, which avoids the function's allocation-scan overhead for a simple confirmation check.

Conclusion

sys.dm_db_index_physical_stats joined to sys.partitions produces the database-wide compression candidate inventory that index-by-index discovery cannot match in speed or completeness. The script above generates the full set of ALTER statements needed to move every uncompressed user-table index and heap to ROW compression, ordered from largest to smallest so the highest-impact work executes first within any maintenance window. Paired with sp_estimate_data_compression_savings for upfront sizing and a scheduled Agent job for ongoing discovery, it covers the entire compression maintenance lifecycle from initial survey to continuous enforcement.

References

Posts in this series