Appendix: Monitors

Applies To: Operations Manager 2007

The following monitors are part of this management pack.

DB Engine Unit Monitors

The following DB Engine unit monitors are included in this management pack.

SQL SPN Configuration Status

This monitor type checks the state of a Microsoft SQL Server 2012 Database Engine SPN configuration.

A Service Principal Name (SPN) for the SQL Server Database Engine may be missing, misplaced, or a duplicate of other SPNs configured in Active Directory for the domain.

Parameter                   Default value
Alert On State              Warning
Alert Priority              Normal
Alert Severity              Warning
Auto-Resolve Alert          True
Generates Alert             True
Enabled                     True
Interval (sec)              900
Unavailable Time (seconds)  300

SQL Server Windows Service

If the parameter “Alert only if service startup type is automatic” is True and the SQL Server Windows service startup mode is not automatic, the monitor always stays in the Healthy (green) state.

Otherwise, the monitor checks the state of the SQL Server Windows service every 60 seconds (Interval). If the service is not in the “Running” state for 900 seconds (Unavailable Time), the monitor changes state to Critical. If the service is in the “Running” state, the monitor displays the state as Healthy.

The following parameters are overridable.

Parameter                                        Default value
Alert On State                                   Critical
Alert Priority                                   Medium
Alert Severity                                   Critical
Auto-Resolve Alert                               True
Generates Alert                                  True
Enabled                                          True
Interval (sec)                                   60
Unavailable Time (seconds)                       900
Alert only if service startup type is automatic  True
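The evaluation described above can be sketched in Python as follows. This is a minimal illustration, not the management pack's implementation: the `ServiceSample` type and the `service_health` function are hypothetical names, and treating an outage of exactly 900 seconds as Critical is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ServiceSample:
    """One poll of the service (hypothetical shape, for illustration only)."""
    timestamp: float  # seconds since an arbitrary starting point
    running: bool     # True if the service reported the "Running" state


def service_health(samples, startup_automatic,
                   alert_only_if_automatic=True, unavailable_time=900):
    """Return "Healthy" or "Critical" for a time-ordered list of samples."""
    # A service that does not start automatically is ignored when
    # "Alert only if service startup type is automatic" is True.
    if alert_only_if_automatic and not startup_automatic:
        return "Healthy"

    down_since = None
    state = "Healthy"
    for sample in samples:
        if sample.running:
            # Any "Running" sample resets the outage and restores Healthy.
            down_since = None
            state = "Healthy"
        else:
            if down_since is None:
                down_since = sample.timestamp
            # Assumption: the boundary value itself counts as unavailable.
            if sample.timestamp - down_since >= unavailable_time:
                state = "Critical"
    return state
```

With the default values, samples taken every 60 seconds that all report a stopped service turn the result Critical once the outage has lasted 900 seconds.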

SQL Server Full Text Search Service

Standard Windows Service monitor for SQL Server Full Text Search windows service.

Service Pack Compliance

This monitor has a parameter named Good Value, which represents the minimum service pack level; a SQL Server installation below this level is considered out of compliance. The parameter can be overridden to specify the service pack level to check for.

The following parameters are overridable.

Parameter                                           Default value
Alert On State                                      Warning
Alert Priority                                      Medium
Alert Severity                                      Critical
Auto-Resolve Alert                                  True
Generates Alert                                     True
Enabled                                             True
Interval (sec)                                      43200
Minimal Service Pack level for SQL Server 2005      1
Minimal Service Pack level for SQL Server 2008      1
Minimal Service Pack level for SQL Server 2008 R2   0
Minimal Service Pack level for SQL Server 2012      0
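As a rough illustration of the compliance check, the following Python sketch compares an installed service pack level with the default minimums listed above. The `MINIMUM_SERVICE_PACK` dictionary and the `is_compliant` function are hypothetical names used only for this example.

```python
# Default minimum service pack levels, keyed by SQL Server version.
MINIMUM_SERVICE_PACK = {
    "2005": 1,
    "2008": 1,
    "2008 R2": 0,
    "2012": 0,
}


def is_compliant(version, service_pack_level, minimums=MINIMUM_SERVICE_PACK):
    """Return True if the installed level meets the minimum for its version."""
    return service_pack_level >= minimums[version]
```

Passing a different `minimums` dictionary models an override of one of the "Minimal Service Pack level" parameters.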

Blocking Sessions

This monitor checks for blocked sessions every 300 seconds (Interval). If a session has been blocked for more than 1 minute (Wait Time) and the number of such sessions is greater than or equal to 1 (Number of blocked sessions), the monitor changes state to Critical.

The following parameters are overridable.

Parameter                   Default value
Alert On State              Critical
Alert Priority              High
Alert Severity              Critical
Auto-Resolve Alert          True
Generates Alert             True
Enabled                     False
Interval (sec)              300
Number of blocked sessions  1
Wait Time (min)             1
Synchronization Time
Timeout (sec)               300
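The condition described above can be sketched in Python as follows. The function name `blocking_sessions_state` and its input (a list of wait times in minutes, one per blocked session) are assumptions made for this illustration; the management pack obtains this information by querying SQL Server.

```python
def blocking_sessions_state(wait_times_min, wait_time_min=1,
                            number_of_blocked_sessions=1):
    """Return "Critical" if enough sessions have been blocked for too long."""
    # Count the sessions blocked for longer than the Wait Time parameter.
    long_blocked = [w for w in wait_times_min if w > wait_time_min]
    if len(long_blocked) >= number_of_blocked_sessions:
        return "Critical"
    return "Healthy"
```

For example, `blocking_sessions_state([0.5, 2.0])` evaluates to Critical with the default values, because one session has waited longer than 1 minute.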

SQL User Connections Performance

This monitor analyzes the number of user connections to the SQL Server Database Engine over time and calculates a baseline over the initial learning period. A warning or error alert is raised if the number of user connections moves outside the captured baseline.

The following parameters are overridable.

Parameter                   Default value
Alert On State              Warning
Alert Priority              Medium
Alert Severity              Critical
Auto-Resolve Alert          True
Generates Alert             True
Enabled                     False
Inner Sensitivity           2.81
Outer Sensitivity           3.31

Average Wait Time

The monitor checks the average wait time (in milliseconds) for each lock request that resulted in a wait. If the wait time exceeds the given threshold, the monitor changes its state and raises an alert.

The following parameters are overridable.

Parameter                   Default value
Alert On State              Critical
Alert Priority              Medium
Alert Severity              Critical
Auto-Resolve Alert          True
Enabled                     True
Generates Alert             True
Interval (seconds)          900
Threshold                   100
Timeout Seconds (seconds)   300

Buffer Cache Hit Ratio

The monitor checks the percentage of pages that were found in the buffer pool without having to incur a read from disk. A value of zero indicates a memory bottleneck; if it is detected, the monitor generates an alert.

The following parameters are overridable.

Parameter                   Default value
Alert On State              Critical
Alert Priority              Medium
Alert Severity              Critical
Auto-Resolve Alert          True
Enabled                     True
Generates Alert             True
Interval (seconds)          900
Threshold                   0

Page Life Expectancy engine level monitoring

Page Life Expectancy is the number of seconds a page will stay in the buffer pool without references. A high page life expectancy means that required data can be found in the cache instead of being read from the hard drive. If the value is extremely low, an alert is raised to notify about the problem.

The following parameters are overridable.

Parameter                   Default value
Alert On State              Critical
Alert Priority              Medium
Alert Severity              Critical
Auto-Resolve Alert          True
Enabled                     True
Generates Alert             True
Interval (seconds)          900
Threshold                   300

CPU Utilization

The monitor provides a measure of how much the processors are actually working on SQL Server’s process threads and raises an alert if all allocated CPUs are busy processing SQL Server tasks.

The following parameters are overridable.

Parameter                   Default value
Alert On State              Critical
Alert Priority              Medium
Alert Severity              Critical
Auto-Resolve Alert          True
Enabled                     True
Generates Alert             True
Interval (seconds)          300
Number of samples           6
Threshold                   95
Timeout Seconds (seconds)   200

SQL Re-Compilations

Certain changes in a database can cause an execution plan to be either inefficient or invalid, based on the new state of the database. SQL Server detects the changes that invalidate an execution plan and marks the plan as not valid. A new plan must then be recompiled for the next connection that executes the query. If the number of recompilations is too high, an alert is raised.

The following parameters are overridable.

Parameter                   Default value
Alert On State              Critical
Alert Priority              Medium
Alert Severity              Critical
Auto-Resolve Alert          True
Enabled                     True
Generates Alert             True
Interval (seconds)          300
Number of samples           6
Threshold                   10
Timeout Seconds (seconds)   200

Stolen Server Memory

Monitors the amount of memory the server is currently using for purposes other than database pages and raises an alert if the amount exceeds the threshold.

The following parameters are overridable.

Parameter                   Default value
Alert On State              Critical
Alert Priority              Medium
Alert Severity              Critical
Auto-Resolve Alert          True
Enabled                     True
Generates Alert             True
Interval (seconds)          300
Number of samples           6
Threshold                   70
Timeout Seconds (seconds)   200

Thread Count

Usually, SQL Server opens a system thread for each query request, but if the number of threads exceeds the specified max worker threads value, SQL Server pools the worker threads. When all worker threads are active with long-running queries, SQL Server may appear unresponsive until a worker thread completes and becomes available. Though not a defect, this can sometimes be undesirable. The monitor analyzes the number of free threads and raises an alert if the number is low.

The following parameters are overridable.

Parameter                   Default value
Alert On State              Critical
Alert Priority              Medium
Alert Severity              Critical
Auto-Resolve Alert          True
Enabled                     True
Generates Alert             True
Interval (seconds)          300
Number of samples           6
Minimum Free Threads Count  70
Timeout Seconds (seconds)   200

Database Unit Monitors

Database Backup Status

This monitor checks the existence and age of a database backup as reported by Microsoft® SQL Server™. This is done by running a query against the master database of the SQL Server instance and returning the age of the database backup.

Parameter                   Default value
Alert On State              Error
Alert Priority              Normal
Alert Severity              Error
Auto-Resolve Alert          True
Generates Alert             True
Enabled                     False
Interval (sec)              86400
Timeout (seconds)           300

Database Status

Periodically the monitor checks the status of the database as reported by SQL Server. This is done by running a query against the master database of the SQL Server instance and returning the state of the database. If you receive an alert from this monitor, an action is required in order to bring the database back to an operational state.

SQL Server database state   Monitor health state
ONLINE                      GREEN
OFFLINE                     RED
RECOVERY PENDING            RED
SUSPECT                     RED
EMERGENCY                   RED
RESTORING                   YELLOW
RECOVERING                  YELLOW
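The mapping above can be expressed as a simple lookup, shown here as a Python sketch. The `database_health` function is a hypothetical name for this illustration. Accepting the underscore spelling (for example `RECOVERY_PENDING`, as SQL Server reports it in the `state_desc` column of `sys.databases`) is a convenience added for the example.

```python
# Database state as reported by SQL Server -> monitor health state.
STATE_TO_HEALTH = {
    "ONLINE": "GREEN",
    "OFFLINE": "RED",
    "RECOVERY PENDING": "RED",
    "SUSPECT": "RED",
    "EMERGENCY": "RED",
    "RESTORING": "YELLOW",
    "RECOVERING": "YELLOW",
}


def database_health(state):
    """Map a database state name to the monitor health state."""
    # Normalize case, surrounding whitespace, and underscores before lookup.
    key = state.strip().upper().replace("_", " ")
    return STATE_TO_HEALTH[key]
```

An unknown state raises `KeyError` in this sketch; how the monitor itself treats states outside the table is not described here.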

The following parameters are overridable.

Parameter                   Default value
Alert On State              Critical
Alert Priority              Medium
Alert Severity              Critical
Auto-Resolve Alert          True
Generates Alert             True
Enabled                     True
Interval (sec)              3600
Synchronization Time
Timeout (seconds)           300

SQL Server Windows Service

This monitor propagates the state of the DB Engine "SQL Server Windows Service" monitor to the database. For more information, see the Knowledge Base article for the "SQL Server Windows Service" monitor.

The following parameters are overridable.

Parameter                   Default value
Enabled                     True
Generates Alert             False
Auto-Resolve Alert          False
Alert Priority              Low
Alert on State              Critical
Alert Severity              Critical
Interval (sec)              60

Configuration Monitors

Configuration monitors check a configuration value for the database and compare it with the expected value. If they do not match, the monitor turns to the warning state; otherwise, the monitor turns to the healthy (green) state.

The following parameters are overridable.

Parameter                      Default value
Alert on State                 Warning
Alert Priority                 Medium
Alert Severity                 Warning
Auto-Resolve Alert             True
Generates Alert                True
Enabled                        False
Expected Value                 Depends on monitor
Disable Check for SQL Express  True
Interval (sec)                 43200
Timeout (sec)                  300

Aggregate Monitor              Monitor                                     Expected value  Databases
Automatic Configuration        Auto Close Configuration                    OFF             User and System
Automatic Configuration        Auto Create Statistics Configuration        ON              User and System
Automatic Configuration        Auto Shrink Configuration                   OFF             User and System
Automatic Configuration        Auto Update Statistics Configuration        ON              User and System
Automatic Configuration        Auto Update Statistics Async Configuration  OFF             User and System
External Access Configuration  DB Chaining Configuration                   OFF             User
External Access Configuration  Trustworthy                                 OFF             User
Recovery Configuration         Page Verify Configuration                   CHECKSUM        User and System
Recovery Configuration         Recovery Model Configuration                FULL            User and System
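The comparison performed by the configuration monitors can be sketched in Python as follows. The sketch assumes that a matching value corresponds to a healthy state and a mismatch to a warning state; `EXPECTED_VALUES` and `configuration_state` are hypothetical names used only for this illustration.

```python
# Default expected value for each configuration monitor listed above.
EXPECTED_VALUES = {
    "Auto Close Configuration": "OFF",
    "Auto Create Statistics Configuration": "ON",
    "Auto Shrink Configuration": "OFF",
    "Auto Update Statistics Configuration": "ON",
    "Auto Update Statistics Async Configuration": "OFF",
    "DB Chaining Configuration": "OFF",
    "Trustworthy": "OFF",
    "Page Verify Configuration": "CHECKSUM",
    "Recovery Model Configuration": "FULL",
}


def configuration_state(monitor, actual_value, expected_values=EXPECTED_VALUES):
    """Compare a database setting with the expected value for its monitor."""
    expected = expected_values[monitor]
    # Assumption: the comparison ignores case and surrounding whitespace.
    if actual_value.strip().upper() == expected:
        return "Healthy"
    return "Warning"
```

Overriding the Expected Value parameter for one monitor corresponds to passing a modified copy of the dictionary as `expected_values`.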

Destination Log Shipping

This monitor detects if a log shipping destination did not have a log restored to it within the threshold defined as a part of the log shipping configuration. When this condition occurs the monitor will change to an error (red) state. Once the log restores resume and are within the defined thresholds then the monitor will return to a success (green) state. By default, this monitor generates alerts when it is in an error state.

The following parameters are overridable.

Parameter                   Default value
Alert on State              Critical/Warning
Alert Priority              Medium
Alert Severity              Critical/Warning
Auto-Resolve Alert          True
Generates Alert             True
Enabled                     True

Source Log Shipping

This monitor detects if a log shipping source has not had its logs backed up within the threshold defined as a part of the log shipping configuration. When this condition occurs the monitor will change to an error state. Once the log backups resume and are within the defined thresholds then the monitor will return to a success (green) state. By default, this monitor generates alerts when it is in an error state.

The following parameters are overridable.

Parameter                   Default value
Alert on State              Critical/Warning
Alert Priority              Medium
Alert Severity              Critical/Warning
Auto-Resolve Alert          True
Generates Alert             True
Enabled                     True

DB Total Space

Monitors the space available in the database and on the media hosting the database. The space available on the media hosting the database is only included as part of the space available if auto grow is enabled for at least one of the files.

Every 900 seconds (Interval), the monitor calculates the database available space as a percentage. If the value is less than the Upper Threshold (%), the monitor changes state to Critical. If the value is less than the Lower Threshold (%) and greater than the Upper Threshold (%), the monitor changes state to Warning. Otherwise, the monitor stays in the Healthy state.

The following parameters are overridable.

Parameter                   Default value
Alert on State              Critical/Warning
Alert Priority              High
Alert Severity              Matches monitor’s health
Auto-Resolve Alert          True
Generates Alert             True
Enabled                     False
Interval (sec)              900
Timeout (sec)               300
Synchronization Time
Lower Threshold             20
Upper Threshold             10
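The threshold comparison can be sketched in Python as follows. Given the default values (Lower Threshold 20, Upper Threshold 10), the sketch assumes that the Critical state applies at or below the Upper Threshold and the Warning state between the two thresholds. The function name `free_space_state` and the handling of values exactly equal to a threshold are assumptions for this illustration.

```python
def free_space_state(free_percent, lower_threshold=20.0, upper_threshold=10.0):
    """Classify available space (in percent) against the two thresholds."""
    # Assumption: Critical applies at or below the Upper Threshold.
    if free_percent <= upper_threshold:
        return "Critical"
    # Warning applies below the Lower Threshold and above the Upper Threshold.
    if free_percent < lower_threshold:
        return "Warning"
    return "Healthy"
```

With the defaults, 15% available space yields Warning and 5% yields Critical.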

DB Total Space Percentage Change

Monitors for a large change in value of database available space over a set number of sample periods.

Every 900 seconds (Interval), the monitor calculates the database available space as a percentage. After 5 samples (Number of Samples), the monitor compares the values from the first sample and the sixth sample. If the difference between the values is greater than 45% (Upper Threshold), the monitor changes state to Critical. If the difference is less than 45% (Upper Threshold) and greater than 25% (Lower Threshold), the monitor changes state to Warning. Otherwise, the monitor stays in the Healthy state.

The following parameters are overridable.

Parameter                   Default value
Alert on State              Critical/Warning
Alert Priority              Medium
Alert Severity              Matches monitor’s health
Auto-Resolve Alert          True
Generates Alert             True
Enabled                     True
Interval (sec)              900
Timeout (sec)               300
Synchronization Time
Lower Threshold (%)         25
Upper Threshold (%)         45
Number of Samples           5
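The comparison of samples can be sketched in Python as follows. The sketch assumes that the "difference between values" is the absolute difference, in percentage points, between the oldest and the newest value in a list of collected samples. The function name `space_change_state` is hypothetical.

```python
def space_change_state(samples_percent, lower_threshold=25.0, upper_threshold=45.0):
    """Classify the change in available space across a window of samples."""
    if len(samples_percent) < 2:
        raise ValueError("at least two samples are needed to compute a change")
    # Assumption: absolute difference between the oldest and newest sample.
    change = abs(samples_percent[-1] - samples_percent[0])
    if change > upper_threshold:
        return "Critical"
    if change > lower_threshold:
        return "Warning"
    return "Healthy"
```

For example, a drop from 80% to 30% available space across the window is a change of 50 percentage points, which exceeds the default Upper Threshold of 45.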

Disk Read Latency

This monitor checks the latency of disk read operations and raises an alert if the latency exceeds the threshold. To avoid noise, the monitor is disabled by default.

The following parameters are overridable.

Parameter                   Default value
Alert On State              Critical
Alert Priority              Medium
Alert Severity              Critical
Auto-Resolve Alert          True
Enabled                     False
Generates Alert             True
Interval (seconds)          300
Number of samples           6
Threshold                   40
Timeout Seconds (seconds)   200

Disk Write Latency

This monitor checks the latency of disk write operations and raises an alert if the latency exceeds the threshold. To avoid noise, the monitor is disabled by default.

The following parameters are overridable.

Parameter                   Default value
Alert On State              Critical
Alert Priority              Medium
Alert Severity              Critical
Auto-Resolve Alert          True
Enabled                     False
Generates Alert             True
Interval (seconds)          300
Number of samples           6
Threshold                   25
Timeout Seconds (seconds)   200

DB File Unit Monitors

DB File Space

Monitors the space available in a file and on the media hosting the file. The space available on the media hosting the file is only included as part of the space available if auto grow is enabled for this file.

Every 900 seconds (Interval), the monitor calculates the space available in the file as a percentage. If the value is less than the Upper Threshold (%), the monitor changes state to Critical. If the value is less than the Lower Threshold (%) and greater than the Upper Threshold (%), the monitor changes state to Warning. Otherwise, the monitor stays in the Healthy state.

The following parameters are overridable.

Parameter                   Default value
Alert on State              Critical/Warning
Alert Priority              Low
Alert Severity              Matches monitor’s health
Auto-Resolve Alert          False
Generates Alert             False
Enabled                     True
Interval (sec)              900
Timeout (sec)               300
Synchronization Time
Lower Threshold (%)         20
Upper Threshold (%)         10

DB Log File Unit Monitors

DB Log File Space

Monitors the space available in the log files and on the media hosting the log files. The space available on the media hosting the log files is only included as part of the space available if auto grow is enabled for at least one of the files.

Every 900 seconds (Interval), the monitor calculates the space available in the log files as a percentage. If the value is less than the Upper Threshold (%), the monitor changes state to Critical. If the value is less than the Lower Threshold (%) and greater than the Upper Threshold (%), the monitor changes state to Warning. Otherwise, the monitor stays in the Healthy state.

The following parameters are overridable.

Parameter                   Default value
Alert on State              Critical
Alert Priority              Low
Alert Severity              Critical
Auto-Resolve Alert          False
Generates Alert             False
Enabled                     True
Interval (sec)              900
Timeout (sec)               300
Synchronization Time
Lower Threshold (%)         20
Upper Threshold (%)         10

Agent Unit Monitors

SQL Server Agent Windows Service

Every minute, this monitor checks the state of the SQL Server Agent windows service. If the service is not in “Running” state, the monitor changes state to Critical. If the service is in “Running” state, the monitor changes state to Healthy.

The following parameters are overridable.

Parameter                                        Default value
Alert on State                                   Critical
Alert Priority                                   Medium
Alert Severity                                   Critical
Auto-Resolve Alert                               True
Generates Alert                                  True
Enabled                                          True
Alert only if service startup type is automatic  True

Long Running Jobs

Every 600 seconds (Interval), the monitor checks all currently executing jobs. If the execution time of at least one job exceeds the lower or upper threshold, the monitor changes state to Warning or Critical, respectively. If all execution times are lower than the thresholds, the monitor changes state to Healthy.

The following parameters are overridable.

Parameter                   Default value
Alert on State              Critical/Warning
Alert Priority              Medium
Alert Severity              Matches monitor’s health
Auto-Resolve Alert          True
Generates Alert             True
Enabled                     False
Interval (sec)              600
Timeout (sec)               300
Synchronization Time
Lower Threshold (minutes)   60
Upper Threshold (minutes)   120
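The evaluation of running jobs can be sketched in Python as follows. The function name `long_running_jobs_state` and its input (the current execution time of each running job, in minutes) are assumptions made for this illustration.

```python
def long_running_jobs_state(execution_minutes, lower_threshold=60,
                            upper_threshold=120):
    """Classify the longest-running job against the two thresholds."""
    # The job that has been running longest determines the state.
    longest = max(execution_minutes, default=0)
    if longest > upper_threshold:
        return "Critical"
    if longest > lower_threshold:
        return "Warning"
    return "Healthy"
```

With the default values, a single job that has been running for 90 minutes yields Warning, and one running for 150 minutes yields Critical.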

Agent Job Unit Monitors

Last Run Status

Every 600 (Interval) seconds, the monitor verifies the last run status of the job. If the last run status equals failed, the monitor changes state to Critical. If the job hasn’t finished yet and does not have a last run status, the monitor does not change state.

The following parameters are overridable.

Parameter                   Default value
Generates Alert             False
Enabled                     True
Interval (sec)              600
Timeout (sec)               300
Synchronization Time

Job Duration

Every 600 seconds (Interval), the monitor checks the execution time of the job. If the execution time exceeds the lower or upper threshold, the monitor changes state to Warning or Critical, respectively. If the execution time is lower than the thresholds, the monitor changes state to Healthy.

The following parameters are overridable.

Parameter                   Default value
Generates Alert             False
Enabled                     True
Interval (sec)              600
Timeout (sec)               300
Synchronization Time
Lower Threshold (minutes)   60
Upper Threshold (minutes)   120

Analysis Services Unit Monitors

SQL Server Analysis Services Windows Service

Standard Windows Service monitor for SQL Server Analysis Services windows service.

Integration Services Unit Monitors

SQL Server Integration Services Windows Service

Standard Windows Service monitor for SQL Server Integration Services windows service.

Reporting Services Unit Monitors

SQL Server Reporting Services Windows Service

Standard Windows Service monitor for SQL Server Reporting Services windows service.