SQL Server Agent Job Failure History Report
Report SQL Server Agent Job Status and Failure History
This script queries the SQL Server Agent job history stored in msdb to produce a detailed report of each job's most recent run status, failure and success counts over the last 60 days, total lifetime run counts, and job step information including subsystem type and proxy name.
Purpose and Overview
SQL Server Agent jobs automate the critical work of database backups, index maintenance, integrity checks, ETL loads, and more. When a job fails, the impact can range from a missed backup to a halted data pipeline. Knowing which jobs failed, how recently they failed, and whether failures are isolated events or a recurring pattern is essential for any DBA. This script, written by Jason Brimhall and published in the ktaranov/sqlserver-kit community repository, uses a series of common table expressions to build a rich job history report. It shows the most recent outcome for each job status type, counts failures and successes within the last 60 days relative to the latest run, and includes job step details. To report only failures, uncomment one line. Run this script in the context of the msdb database.
Code Breakdown
The complete script is shown below, followed by an explanation of each section. Run this in msdb or prefix the unqualified table references with msdb.dbo..
1USE msdb;
2GO
3WITH jobhist AS (
4 SELECT
5 jh.job_id,
6 jh.run_date,
7 jh.run_status,
8 jh.step_id,
9 ROW_NUMBER() OVER (
10 PARTITION BY jh.job_id, jh.run_status
11 ORDER BY jh.run_date DESC
12 ) AS rownum
13 FROM dbo.sysjobhistory AS jh
14 WHERE jh.step_id = 0
15),
16agglastsixty AS (
17 SELECT
18 sjh.job_id,
19 sjh.run_status,
20 COUNT(*) AS RunCount
21 FROM dbo.sysjobhistory AS sjh
22 INNER JOIN jobhist AS jh
23 ON jh.job_id = sjh.job_id
24 AND jh.run_status = sjh.run_status
25 AND jh.step_id = sjh.step_id
26 WHERE CONVERT(DATE, CONVERT(VARCHAR(30), sjh.run_date))
27 > DATEADD(d, -60, CONVERT(DATE, CONVERT(VARCHAR(30), jh.run_date)))
28 AND jh.rownum = 1
29 GROUP BY sjh.job_id, sjh.run_status
30),
31aggtotal AS (
32 SELECT
33 sjh.job_id,
34 sjh.run_status,
35 COUNT(*) AS RunCount
36 FROM dbo.sysjobhistory AS sjh
37 INNER JOIN jobhist AS jh
38 ON jh.job_id = sjh.job_id
39 AND jh.run_status = sjh.run_status
40 AND jh.step_id = sjh.step_id
41 WHERE jh.rownum = 1
42 GROUP BY sjh.job_id, sjh.run_status
43)
44SELECT
45 j.name AS JobName,
46 sc.name AS CategoryName,
47 j.date_created,
48 j.enabled,
49 CONVERT(DATE, CONVERT(VARCHAR(30), oa.run_date)) AS RunDate,
50 CASE oa.run_status
51 WHEN 0 THEN 'Failed'
52 WHEN 1 THEN 'Succeeded'
53 WHEN 2 THEN 'Retry'
54 WHEN 3 THEN 'Canceled'
55 WHEN 4 THEN 'In Progress'
56 END AS run_status,
57 als.RunCount AS Last60StatusCount,
58 agt.RunCount AS TotalStatusCount,
59 js.subsystem AS JobStepSubsystem,
60 js.command,
61 spr.name AS ProxyName,
62 spr.credential_id
63FROM dbo.sysjobs AS j
64LEFT JOIN jobhist AS oa
65 ON oa.job_id = j.job_id
66LEFT OUTER JOIN agglastsixty AS als
67 ON als.job_id = oa.job_id
68 AND als.run_status = oa.run_status
69LEFT OUTER JOIN aggtotal AS agt
70 ON agt.job_id = oa.job_id
71 AND agt.run_status = oa.run_status
72INNER JOIN sys.server_principals AS sp
73 ON j.owner_sid = sp.sid
74INNER JOIN dbo.syscategories AS sc
75 ON j.category_id = sc.category_id
76INNER JOIN dbo.sysjobsteps AS js
77 ON js.job_id = j.job_id
78LEFT JOIN dbo.sysproxies AS spr
79 ON js.proxy_id = spr.proxy_id
80WHERE oa.rownum = 1
81-- AND oa.run_status = 0 -- uncomment to show only failed jobs
82ORDER BY RunDate DESC;
CTE 1: jobhist — Most Recent Run per Status per Job
The jobhist CTE reads from dbo.sysjobhistory and assigns a row number using ROW_NUMBER() partitioned by job_id and run_status, ordered by run_date DESC. This means row number 1 is the most recent time a job ended with each status code. The filter WHERE jh.step_id = 0 restricts to job-level outcome rows. In sysjobhistory, step_id 0 is always the summary row for the entire job — individual job steps have step_id values of 1, 2, 3, and so on. Filtering to step_id = 0 avoids double-counting.
CTE 2: agglastsixty — Count per Status in Last 60 Days
agglastsixty counts how many times each job ended with each run_status in the 60 days leading up to the most recent occurrence of that status. It joins back to jobhist on job_id, run_status, and step_id, then filters on rownum = 1 to anchor the date window to the most recent run. The DATEADD comparison finds all history rows within 60 days before that anchor date. This gives a rolling-60-day count without requiring a fixed calendar date in the query.
CTE 3: aggtotal — Lifetime Count per Status
aggtotal is the same structure as agglastsixty but without the 60-day date restriction. It counts every historical occurrence of each status for each job, giving a total lifetime run count. Comparing Last60StatusCount to TotalStatusCount shows whether failures are recent or historical.
Main SELECT: Job Details and Step Information
The main query joins dbo.sysjobs (one row per agent job) to the three CTEs, dbo.syscategories (job category like Database Maintenance or Data Collector), dbo.sysjobsteps (one row per step per job), and dbo.sysproxies (proxy accounts used by job steps for running under alternate credentials).
The CASE expression on oa.run_status translates the integer status codes into readable text: 0 = Failed, 1 = Succeeded, 2 = Retry, 3 = Canceled, 4 = In Progress.
The WHERE clause oa.rownum = 1 keeps only the most recent run row for each job and status. To restrict the output to failed jobs only, uncomment AND oa.run_status = 0.
Key Benefits and Use Cases
- Shows the most recent run outcome for every SQL Server Agent job in one result set
- Counts failures and successes over the last 60 days to identify recurring problem jobs
- Total lifetime count shows whether a failure is new or has been happening for a long time
- Job step subsystem column identifies T-SQL, SSIS, PowerShell, CmdExec, and other step types
- Proxy name column reveals which credential a step runs under, useful for permission troubleshooting
- Simple one-line change to filter the output to failed jobs only
Performance Considerations
- msdb context required: The table names in this script (dbo.sysjobs, dbo.sysjobhistory, etc.) are unqualified — they assume the current database is msdb. Run the script after
USE msdb;or prefix each table withmsdb.dbo.if running from a different database context. - sysjobhistory growth: On servers with many jobs running frequently,
msdb.dbo.sysjobhistorycan grow large. SQL Server Agent has a built-in history retention setting under SQL Server Agent Properties > History. Keeping a manageable row limit (such as 10,000 rows per job and 100,000 rows total) keeps msdb healthy. - run_date format: The
run_datecolumn insysjobhistorystores dates as integers inYYYYMMDDformat, not as proper date values. TheCONVERT(DATE, CONVERT(VARCHAR(30), run_date))pattern converts this integer to a proper date for comparison and display. - sysjobsteps join multiplies rows: Because the script joins to
dbo.sysjobsteps, a job with multiple steps produces multiple output rows — one per step. If you want one row per job rather than one row per step, add a filter such asAND js.step_id = 1or aggregate the step information.
Practical Tips
To see only jobs that failed in the last 24 hours, add a date filter to the WHERE clause:
1WHERE oa.rownum = 1
2 AND oa.run_status = 0
3 AND CONVERT(DATE, CONVERT(VARCHAR(30), oa.run_date)) >= CAST(DATEADD(DAY, -1, GETDATE()) AS DATE)
To quickly list all currently running jobs without the CTE overhead, use the simpler msdb.dbo.sp_help_job stored procedure with @execution_status = 1 (executing) or query msdb.dbo.sysjobactivity directly.
For a summary report that shows one row per job (not per step), remove the join to dbo.sysjobsteps and dbo.sysproxies and adjust the SELECT list accordingly. This is useful for a quick pass/fail dashboard across all jobs.
Schedule this script as a SQL Server Agent job that runs each morning and writes results to a reporting table. Send an alert email when any row in that table has run_status = 'Failed' and Last60StatusCount > 1 — meaning the job has failed more than once recently and is not a one-off.
Conclusion
This script from Jason Brimhall provides a thorough view of SQL Server Agent job history by combining the most recent run status with 60-day and lifetime failure counts, plus job step details. The CTE-based structure makes it easy to adapt — uncomment the failure filter for a quick failure report, or adjust the 60-day window to match your monitoring period. Running it in msdb on a regular schedule gives DBAs a reliable way to catch recurring job failures before they escalate into data loss or missed SLA events.
References
- sqlserver-kit on GitHub by Konstantin Taranov — Community collection of SQL Server scripts including the Jobs_Detailed_History.sql script used as the basis for this article
- Detail Job History — Back to Basics by Jason Brimhall — Original article by the script author explaining the design and intent of the query
- Microsoft Docs: msdb.dbo.sysjobhistory — System table reference covering all columns in the agent job history catalog
- Microsoft Docs: msdb.dbo.sysjobs — Reference for the system table that stores SQL Server Agent job definitions