Detect Suspect Database Pages with msdb.dbo.suspect_pages
Storage errors do not always surface as immediate failures. When SQL Server encounters an 823, 824, or 825 error reading or writing a database page, it records the offending page in msdb.dbo.suspect_pages — a system table that accumulates page-level I/O failures across service restarts. Querying this table is the fastest first-pass for any corruption investigation, answering whether bad pages exist and which databases are affected before committing to a full DBCC CHECKDB run.
Purpose and Overview
msdb.dbo.suspect_pages is SQL Server's persistent registry of page-level I/O failures. Unlike an error log entry that requires text parsing, this table exposes structured data: one row per bad page, with the database ID, file ID, page number, error type, cumulative error count, and the timestamp of the last failure. Entries survive SQL Server restarts and accumulate over time, so a page flagged months ago but never repaired remains visible until a DBA explicitly removes the entry or the page is restored.
The table captures three classes of physical-integrity errors. Event type 1 covers 823 and 824 errors — OS-level I/O failures and checksum mismatches that do not indicate a torn page specifically. Event types 2 and 3 represent torn pages: rows where the page checksum computed at read time does not match what was written, or where a page header indicates it was only partially flushed to disk. These three types are the active corruption signals; event types 4, 5, and 7 mark pages that have been resolved by a backup restore or a DBCC CHECKDB repair operation.
The error_count column is particularly diagnostic: a value greater than 1 on a page means SQL Server has failed to read or verify that page multiple times since the last update. Repeated failures against the same page indicate a hardware fault that is worsening — a drive sector in decline — rather than a transient I/O blip that self-corrected. For a DBA triaging a production alert, that single column separates the "monitor it" situations from the "restore from backup now" situations.
One important operational boundary: msdb.dbo.suspect_pages is limited to 1,000 rows. When the table reaches capacity, SQL Server begins logging additional errors to the SQL Server error log rather than writing new rows. A full suspect_pages table is itself a serious signal — it means corruption events are outpacing DBA response. Monitoring row count alongside the error entries is part of a complete integrity-monitoring strategy.
Code Breakdown
The detection script queries the table for all active (unresolved) suspect pages, joins to sys.master_files for the physical file path, and decodes the event_type integer into a readable description.
1USE msdb;
2GO
3
4SELECT
5 DB_NAME(sp.database_id) AS database_name,
6 mf.physical_name AS file_path,
7 sp.file_id,
8 sp.page_id,
9 CASE sp.event_type
10 WHEN 1 THEN '823/824 - Hard I/O error or bad checksum'
11 WHEN 2 THEN 'Torn page (read error)'
12 WHEN 3 THEN 'Torn page (bad checksum)'
13 WHEN 4 THEN 'Restored from backup'
14 WHEN 5 THEN 'Repaired by DBCC CHECKDB'
15 WHEN 7 THEN 'Deallocated by DBCC CHECKDB'
16 ELSE 'Unknown event type'
17 END AS event_type_desc,
18 sp.error_count,
19 sp.last_update_date
20FROM msdb.dbo.suspect_pages AS sp
21LEFT JOIN sys.master_files AS mf
22 ON sp.database_id = mf.database_id
23 AND sp.file_id = mf.file_id
24WHERE sp.event_type IN (1, 2, 3)
25ORDER BY sp.last_update_date DESC;
To view all entries including resolved pages — useful for a historical audit of prior repair activity:
1SELECT
2 DB_NAME(sp.database_id) AS database_name,
3 sp.file_id,
4 sp.page_id,
5 sp.event_type,
6 sp.error_count,
7 sp.last_update_date
8FROM msdb.dbo.suspect_pages AS sp
9ORDER BY sp.last_update_date DESC;
Running in the msdb context
The USE msdb statement is required because suspect_pages is a system table in the msdb database, not in each user database. All page failures across every database on the instance funnel into a single msdb table — there is no per-database scoped equivalent. Any user with SELECT permission on msdb.dbo.suspect_pages can read it; the sysadmin fixed server role and the db_datareader role on msdb are the usual access paths for monitoring accounts.
Filtering on event_type IN (1, 2, 3)
The WHERE event_type IN (1, 2, 3) predicate isolates the three active error categories and suppresses rows representing already-resolved pages. Without this filter, the result set mixes current problems with historical entries for pages that DBCC CHECKDB has repaired or that a backup restore has corrected. In a post-incident review, dropping the filter and examining the full history is useful; in a monitoring or alerting context, the filter keeps the result focused on pages still requiring attention.
Joining to sys.master_files for physical paths
msdb.dbo.suspect_pages stores only integer IDs — database_id, file_id, and page_id. The LEFT JOIN to sys.master_files resolves those integers to a physical_name, adding the full path of the affected data or log file. This is practical for storage-team handoff: the output names not just the SQL Server database but the actual file on the storage controller, identifying whether the bad pages originate from a specific LUN, drive, or SAN volume that can be isolated or replaced. The join uses LEFT JOIN rather than INNER JOIN to preserve rows for offline or detached databases whose files no longer appear in sys.master_files.
Decoding event_type
The CASE expression translates the integer event_type into a human-readable description. The six defined values cover the full lifecycle of a suspect page: type 1 (823 or 824 error — the broadest category, covering hard I/O faults and bad checksums when page verification is set to CHECKSUM), types 2 and 3 (torn page detected during a read and flagged by a checksum mismatch respectively), and types 4, 5, and 7 (page restored from a backup, repaired by DBCC CHECKDB WITH REPAIR_ALLOW_DATA_LOSS, and deallocated by DBCC CHECKDB respectively). Types 4, 5, and 7 indicate prior resolution and are suppressed in the active-errors query by the event_type IN (1, 2, 3) filter.
Table capacity monitoring
A companion query that shows how close the table is to its 1,000-row hard limit:
1SELECT
2 COUNT(*) AS total_rows,
3 1000 - COUNT(*) AS rows_remaining,
4 MAX(last_update_date) AS most_recent_event
5FROM msdb.dbo.suspect_pages;
When total_rows approaches 1,000, the underlying corruption problem is active and widening. At that point new events begin going exclusively to the SQL Server error log, and the structured query-based monitoring this script provides starts losing visibility.
Key Benefits and Use Cases
- Instant corruption visibility — a single query surfaces all unresolved bad pages across every database on the instance without running a full
DBCC CHECKDB. - Repeated-failure identification —
error_countdistinguishes transient I/O blips from deteriorating hardware by counting how many times SQL Server has failed on the same page. - Storage-team handoff — joining to
sys.master_filesmaps the SQL Server file ID to a physical path, enabling storage-layer investigation without ambiguity. - Repair triage — the resolved event types (4, 5, 7) confirm that prior
DBCC CHECKDBrepairs wrote back to the table, providing an auditable record of remediation activity. - Capacity alerting — tracking total row count catches the approach to the 1,000-row limit before overflow to the error log removes monitoring visibility.
- Lightweight baseline — the detection query costs nothing on a clean instance and only produces an alert-worthy result when bad pages actually exist.
Performance Considerations
msdbcontext is mandatory: the query must referencemsdb.dbo.suspect_pagesexplicitly; running it without the three-part name or withoutUSE msdbfails with an object-not-found error.- 1,000-row hard limit: the table cannot grow beyond 1,000 rows. When full, overflow events go to the SQL Server error log only — row-count monitoring is essential to avoid missing entries in heavily degraded environments.
- Resolved rows are not purged automatically: SQL Server adds rows for new failures and updates
last_update_dateanderror_counton repeat failures, but it does not delete resolved rows. Over time, rows with event types 4, 5, and 7 accumulate. ExplicitDELETEof resolved rows — typically after confirming repair withDBCC CHECKDB— is required to keep the table manageable and the 1,000-row capacity limit meaningful. - Left join behavior on offline databases:
sys.master_filesdoes not list files for databases in an offline, restoring, or suspect state; theLEFT JOINreturnsNULLforphysical_namein those cases. ANULLfile path signals a database that may itself be inaccessible, not a join error. - No automatic alerting built in: the table accumulates silently. SQL Server does not fire a notification when a new row is inserted. A SQL Server Agent job polling this query and sending an alert on any non-empty result is the standard monitoring implementation.
Practical Tips
- Schedule a SQL Server Agent job that queries
suspect_pages WHERE event_type IN (1, 2, 3)and sends a Database Mail notification when the result set returns any rows — without active polling, new entries accumulate with no visibility. - After confirming a page has been repaired by backup restore or
DBCC CHECKDB, update theevent_typeto the appropriate resolved value or delete the row fromsuspect_pagesto keep the active error list accurate and preserve capacity for new entries. - Pair the detection script with
DBCC PAGE (database_name, file_id, page_id, 3)to inspect the raw contents of a flagged page and determine whether it is a data or index page — this determines whetherREPAIR_ALLOW_DATA_LOSSis acceptable or whether a restore is the only sound option. - When
error_countexceeds 3 on any single page, treat it as a hardware signal rather than a software one: escalate to storage-team review before attemptingDBCCrepair, because additional writes to a failing sector may widen corruption. - Cross-reference flagged
database_idvalues against your backup schedule immediately — a database with active suspect pages and no recent clean backup is at acute risk. Confirm a restore point exists before running any repair operation.
Conclusion
msdb.dbo.suspect_pages is the most direct window into active page-level corruption across a SQL Server instance: a structured, queryable table that records each I/O failure event, survives service restarts, and accumulates repeat-failure counts that distinguish deteriorating hardware from transient errors. Querying it with the detection script above — filtering to active event types and joining to sys.master_files for physical file paths — provides the fast first-pass that every corruption-response workflow should begin with. Combined with a scheduled Agent job that alerts on any new rows, it closes the monitoring gap that allows page corruption to accumulate undetected.
References
- suspect_pages (Transact-SQL) — Microsoft Learn — Full column reference including
event_typevalues, the 1,000-row capacity limit, and the conditions under which each error category is written. - Manage the suspect_pages Table (SQL Server) — Microsoft Learn — Guidance on monitoring, updating, and purging resolved rows from the table, including when
DBCC CHECKDBrepair is appropriate versus backup restore. - sys.master_files (Transact-SQL) — Microsoft Learn — Server-scope catalog view providing
physical_nameandtype_descfor the(database_id, file_id)join used to resolve physical file paths. - sqlserver-kit on GitHub by Konstantin Taranov — Community collection of SQL Server DBA scripts including storage integrity checks and database maintenance queries.
Posts in this series
- Generate DBCC SHOWCONTIG Commands for All SQL Server Tables
- DBCC DBREINDEX: SQL Server Index Rebuild Script
- SQL Server DBCC DBREINDEX with Timing: Index Maintenance
- SQL Server Index Fragmentation: dm_db_index_physical_stats
- SQL Server DBCC CHECKDB Last Run Date Report
- SQL Server VLF Count Report: Virtual Log File Analysis
- SQL Server Duplicate and Overlapping Index Detection Script
- DBCC DBREINDEX vs ALTER INDEX REBUILD in SQL Server
- SQL Server DBCC SHRINKDATABASE: When and When Not to Shrink
- SQL Server DBCC CHECKDB: Complete Guide and Repair Options
- SQL Server DBCC Commands: The Complete DBA Reference Guide
- How to Update Statistics in SQL Server
- Apply Row and Page Compression to Tables and Indexes
- Detect Suspect Database Pages with msdb.dbo.suspect_pages