Key Monitoring Scenarios

Article
10/01/2013

Applies To: Operations Manager 2007, Operations Manager 2007 R2, SQL Server

The SQL Server Management Pack for Operations Manager includes a number of key monitoring scenarios that can be configured as follows.

Note

The list is not intended to be a complete manifest of the management pack functionality.

Monitoring of Custom User Policies (CUP - PBM policies defined by user)

The new monitoring feature was added for SQL Server 2012 based on Policy Based Management capabilities. The state of user defined policies can be monitored if the database is used as facet.

Note

If database is in Restoring state the CUP targeted to that database won’t be monitored.

There are two monitors which reflects state of CUP:

Two state monitor with 'Error' critical state used particularly for reflecting state of Custom User Policies which have Database as Facet and one of the predefined error categories as Policy Category.
Two state monitor with 'Warning' critical state used particularly for reflecting state of Custom User Policies which have Database as Facet and any category except predefined error categories as Policy Category.

Separated configuration for SQL Server 2008 and SQL Server 2008 R2

To configure different monitoring or discovery settings for SQL Server 2008 and SQL Server 2008 R2, apply overrides to predefined computer groups:

SQL Server 2008 DB Engine Group – includes instances of SQL Server 2008 and SQL Server 2008 SP1
SQL Server 2008 R2 DB Engine Group – includes instances of SQL Server 2008 R2

Data file and log file space monitoring

Management pack rules collect total data log and log file free space. You can use reports for reviewing this information across multiple databases and over extended periods of time for such functions as problem management and capacity planning. Management pack monitors provide space monitoring on three levels: data files, file groups and databases.

For more information, see the hierarchy of space monitors and overridable parameters in “Unit Monitors” and “Health Rolls Up Diagram” sections. The following performance counters are provided to support space monitoring:

Database Level
1. Collect Database Total Free Space (in MB and in %)
  
  The amount of space left in database for all files in all file groups for this database in megabytes or in percent. Also includes space left on media hosting files with auto grow enabled.
DB File Group Level
1. Collect DB File Group Free Space (in MB and in %)
  
  The amount of space left in all files for this file group in megabytes or in percent. Also includes space left on media hosting files with auto grow enabled.
2. Collect DB File Group Allocated Space Left (in MB and in %)
  
  The amount of space left in all files for this file group in megabytes or in percent. Does not include space left on media hosting files with auto grow enabled.
SQL DB File Level
1. Collect DB File Free Space (in MB and in %)
  
  The amount of space left in a file in megabytes or in percent. Also includes space left on media hosting files with auto grow enabled.
2. Collect DB File Allocated Space Left (in MB and in %)
  
  The amount of space left in a file in megabytes or in percent. Does not include space left on media hosting files with auto grow enabled.
SQL DB Log File Level
1. Collect DB Log File Free Space (in MB and in %)
  
  The amount of space left in all log files for this database in megabytes or in percent. Also includes space left on media hosting files with auto grow enabled.
2. Collect DB Log File Allocated Space Left (in MB and in %)
  
  The amount of space left in all log files for this database in megabytes or in percent. Does not include space left on media hosting files with auto grow enabled.

By default, space monitoring is enabled for the following levels:

Database
File Group
File

If your environment is sensitive for any extra load, you may consider disabling monitoring on File Group and File level. To disable File Group level monitoring, you should disable the following rules:

Collect DB File Group Allocated Space Left (%)
Collect DB File Group Allocated Space Left (MB)
Collect DB File Group Free Space (%)
Collect DB File Group Free Space (MB)

To disable File level monitoring, you should disable the following rules and monitors:

Rules
- Collect DB File Allocated Space Left (%)
- Collect DB File Allocated Space Left (MB)
- Collect DB File Free Space (%)
- Collect DB File Free Space (MB)
- Collect DB Log File Allocated Space Left (%)
- Collect DB Log File Allocated Space Left (MB)
- Collect DB Log File Free Space (%)
- Collect DB Log File Free Space (MB)
Monitors
- DB File Space Monitor
- DB Log File Space Monitor

Many databases on the same drive

Default space monitoring settings are noisy in environments where data or log files for many databases are located on the same drive and have autogrow “On”. In such environments, an alert for each database is generated when the amount of free space on the hard drive reaches the threshold. To avoid noise, turn off the space monitors for data and log files, and use Base OS Management Pack to monitor space on the hard drive.

Long-running SQL Server Agent jobs

By default, this scenario is fully enabled in the management packs on a per-SQL Server Agent basis. This means that for each monitoring SQL Server Agent, maximum job duration is what is compared against the thresholds, and alerts are raised if any single job runs too long.

In addition, more detailed monitoring, which is on a per-job basis is provided in the management packs but the discoveries for SQL Server Agent jobs are disabled by default. Enable the following object discoveries:

SQL Server 2012: Discover SQL Server 2012 Agent Jobs
SQL Server 2008: Discover SQL Server 2008 Agent Jobs
SQL Server 2005: Discover SQL Server 2005 Agent Jobs

Job failure

To get alerts for failed jobs, enable the rule “A SQL job failed to complete successfully” and make sure that the option "Write to the Windows Application Event Log” “when the job fails” is selected for all jobs you want to monitor.

For more information, see Job Properties / New Job (Notifications Page) in the MSDN Library

Blocking sessions

A monitor periodically queries each database engine instance for a list of active sessions (SPIDs) and checks to see if any long-running blocking is occurring. If blocking is detected and it exceeds the given threshold, then the state is changed and an alert is raised.

You can apply an override to change the time-duration value that is used to determine whether blocking is long-running. The default time-duration value is one minute.

Discovery of SQL Server Database Engine instances

The discovery of stand-alone and clustered instances of the SQL Server Database Engine role across all managed systems can be configured to exclude particular instances of the database engine.

You can apply overrides to the following discoveries to specify an “Exclude List” (in comma-delimited format) of the names of SQL Server Database Engine instances that the discovery should overlook:

SQL Server 2012: Discover SQL Server 2012 Database Engines (Windows Server)
SQL Server 2008: Discover SQL Server 2008 Database Engines (Windows Server)
SQL Server 2005: Discover SQL Server 2005 Database Engines (Windows Server)

Database discovery and state monitoring

For each managed database engine, the databases on it are discovered and monitored using a number of rules and monitors. Information about monitor-based functionality is provided in other areas of this table.

You can apply overrides to the following discoveries to specify an “Exclude List” (in comma-delimited format) of database names that the discovery should overlook.

SQL Server 2012: Discover Databases for a Database Engine
SQL Server 2008: Discover Databases for a Database Engine
SQL Server 2005: Discover Databases for a Database Engine

Some rules and monitors in this management pack use the script GetSQL2005DBSpace.js. When all rules and monitors that use that script have matching parameters, only one instance of the script runs for the entire group of rules and monitors. When any of the rules or monitors that use that script have different parameter values, a separate instance of the script and scripting host will run at the same time for each rule or monitor that has parameter values that differ from the rest. Because GetSQL2005DBSpace.js scans all SQL databases, performance problems can result when it runs frequently.

To prevent performance problems, ensure that the rules and monitors in the following list use the same parameter values. If you override a parameter for any of these rules or monitors, you should apply the same override to the others.

Rules and monitors for SQL Server 2005 are in the following list:

Rules:
- Microsoft.SQLServer.2005.Database.DBSpaceFree.Collection DISABLED
- Microsoft.SQLServer.2005.Database.DBSpaceFreePercent.Collection DISABLED
- Microsoft.SQLServer.2005.Database.DBSize.Collection ENABLED
- Microsoft.SQLServer.2005.Database.TransactionLogSpaceFree.Collection DISABLED
- Microsoft.SQLServer.2005.Database.TransactionLogSpaceFreePercent.Collection DISABLED
- Microsoft.SQLServer.2005.Database.TransactionLogSize.Collection DISABLED
Monitors:
- Microsoft.SQLServer.2005.Database.DBSizePercentMonitor DISABLED
- Microsoft.SQLServer.2005.Database.DBSizeMegabytesMonitor DISABLED
- Microsoft.SQLServer.2005.Database.TransactionLogSizePercentMonitor DISABLED

If you observe performance problems related to any of these rules or monitors, consider changing the Timeout Seconds parameter for all of these rules and monitors to a less frequent interval.

Restart of DB Engine

The availability of DBEngine is monitored by monitor “SQL Server Windows Service” for the object “SQL DB Engine”. This monitor does not reflect the service restart.

To be notified about each restart of DBEngine, you can enable the rule “SQL Server 2008 DB Engine is restarted”/ “SQL Server 2005 DB Engine is restarted”.

SQL DB Engine CPU monitoring

CPU utilization is monitored by the monitor which provides a measure of how much processors actually working on SQL Server’s process threads and raises an alert if all allocated CPUs are busy processing SQL Server tasks. This monitoring scenario takes into account current affinity mask of SQL DB Engine.

DB Storage latency monitoring

DB Storage performance is monitored by two monitors “Disk Read Latency” and “Disk Write Latency”. In case in significant storage performance degradation alert will be raised. These monitors disabled by default. Enable these monitors only for specific DBs where it is required monitor storage performance. In addition latency can be viewed on Database Dashboard.

Disabled Rules

Some rules in the management pack are disabled by default to avoid noise. Consider enabling rules that can be valuable in your environment. The following rules are disabled by default:

An SQL job failed to complete successfully
An SQL Server Service Broker procedure output results
An SNI call failed during a Service Broker/Database Mirroring transport operation
SQL Server restarted
SQL Server Service Broker Manager has shutdown
SQL Server Service Broker or Database Mirroring is running in FIPS compliance mode
SQL Server Service Broker or Database Mirroring Transport stopped
SQL Server Service Broker transmitter shut down due to an exception or a lack of memory
SQL Server terminating because of system shutdown
Table: Creating statistics for the following columns
The Service Broker or Database Mirroring Transport has started
The SQL Server Service Broker or Database Mirroring transport is disabled or not configured

Share via