Monitoring Scenarios

Published: January 1, 2005

The Microsoft® Systems Management Server (SMS) 2003 Management Pack provides new and improved capabilities for monitoring SMS 2003. The following sections describe the primary scenarios that the SMS 2003 Management Pack supports, along with a description of its features.

On This Page

Service Availability
Service Discovery
Availability Monitoring
State Monitoring
Agentless Monitoring Support
Performance Threshold Monitoring
Performance Monitoring
Critical SMS Status Message Monitoring
Site System Status Monitoring
Executive Crash Dump Monitoring
Backlog Monitoring
General Health Monitoring
SMS Patch Management Monitoring
SMS Site Backup Monitoring
Product Knowledge Content for All Alerts
Computer Groups for All SMS 2003 Server and Client Roles
Expanded Monitoring
Software Distribution Program Run Status

Service Availability

Service availability monitors the availability of SMS 2003 services by using both the Service Availability feature of Microsoft Operations Manager (MOM) 2005 and the Service Control Manager events in the NT Event System Log. Service availability monitoring generates:

  • An alert when SMS-based services stop on SMS site systems.

  • An alert when a critical SMS dependent service, such as Windows Management Instrumentation (WMI), stops or fails to start on SMS site systems.

  • An alert when other SMS dependent services, such as Microsoft SQL Server™, Internet Information Services (IIS), and Background Intelligent Transfer Service (BITS), stop or fail to start on SMS site systems.

  • An alert when the SMS Site Backup service is started and stopped, so you can monitor backup durations and correlate stop events from other SMS services to SMS Site Backup.

Service Discovery

Service discovery is the process of discovering data, server roles, and components on managed computers. Service discovery returns information about SMS servers and SMS Advanced Clients. It also supports the presentation of the SMS servers and SMS client roles in the state monitoring view.

Server-specific service discovery data is collected and begins to display in the MOM Operator console as the Management Pack is deployed. Features that require identifying roles, computer groups, and target computers for specific tasks are not available until after service discovery data is collected for the first time.

The SMS 2003 Management Pack uses the following event processing rules to launch service discovery scripts:

  • SMS Clients\SMS Clients - Advanced\SMS 2003 Service Discovery

  • SMS Servers\SMS Servers - Common\SMS 2003 Service Discovery

The Service Discovery scripts listed below are located in the Scripts node of the MOM Administrator console.

  • SMS 2003 Service Discovery - Client.vbs

  • SMS 2003 Service Discovery - Server.vbs

The scripts use an asynchronous timed provider to collect service discovery data once every 24 hours by default. The provider also runs after the following events:

  • Deployment of the Management Pack

  • Configuration changes to the Management Pack

  • Starting and stopping the MOM agent

Availability Monitoring

In this release of the SMS Management Pack, two methods of determining server availability are used: Site System Status Summarizer-based and component health-based.

A server availability script, based on Site System Status Summarizer “site system down” status, was provided in the SMS 2003 Management Pack for MOM 2000 SP1. The script created a stored procedure in the OnePoint database. An updated version of the SMS 2003 Server Availability script is available in the MOM 2005 Management Pack for SMS 2003. Its algorithm has been updated and incorporated into the Server Availability report.

Also included in this Management Pack release is a component health–based availability report, named the Management Point Availability report. Management point availability is based on a new algorithm, new rules, and the SMS 2003 Monitor Management Point Availability script.

All reports, including the new reports referenced in this section, run against the SystemCenterReporting database, which allows you to specify longer reporting time durations than the prior script, which created the stored procedures in the OnePoint database only.

For more information about server availability, see the Scripts and Reports sections in this guide.

State Monitoring

State monitoring provides a graphic representation of the health of all servers or of servers scoped to specific computer groups. The Management Pack supports this feature with a granular list of computer groups. Server roles and components are managed through service discovery, and component health is managed by a set of rules for service availability, performance thresholds, and management point health.

Table 1 lists the SMS server components whose health is monitored. Each state color maps to a health state: green (successful), yellow (warning), or red (critical).

Table 1   SMS Server State Monitoring

SMS Component | Green (Successful) | Yellow (Warning) | Red (Critical)
Site server (SS)
Management point (MP)
Site database (DB)
Distribution point (DP)
Server locator point (SLP)
Sender
Reporting point (RP)
Provider
Client access point (CAP)
Performance

Note
There are no monitoring rules that will change the state of the distribution point component, so if the component is present, it is reported in a green state only. If a distribution point component is not present, state health is not represented in the console.

Table 2   SMS Client State Monitoring

SMS Component | Green (Successful) | Yellow (Warning) | Red (Critical)
Client service

Agentless Monitoring Support

The SMS 2003 Management Pack supports monitoring on agentless managed computers except for the features listed below:

  • Critical SMS status message monitoring

  • Site system status monitoring

  • Topology

The scripts that support the above features are:

  • SMS 2003 Monitor SMS Status Messages

  • SMS 2003 Monitor Site System Summarizer

  • SMS 2003 Server Topology: Discovery

These scripts must run locally against the SMS Provider to access the SMS namespace. If the scripts cannot locate the SMS namespace, a script error event occurs.

Performance Threshold Monitoring

Reduced performance levels might create delays and affect customer service level agreements (SLAs) in terms of time, performance, and data accuracy. It is important to detect and resolve critical performance problems in the SMS infrastructure. Performance threshold monitoring generates an alert when:

  • General computer-wide health metrics relating to CPU, paging file, and memory on SMS site systems exceed certain thresholds.

  • The total number of Microsoft SQL Server™ user connections exceeds a threshold.

  • The total number of SMS files [such as scheduler jobs; send requests; software metering records; discovery data records (DDRs); and software inventory and hardware inventory records] exceeds certain thresholds in the SMS site server inboxes.

  • The total number of SMS messages relating to status messages, DDRs, software inventory, and hardware inventory in the SMS management point message queues exceeds certain thresholds.
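As a rough illustration of the inbox file-count check described above, the following Python sketch counts the files in each inbox folder and reports any folder whose count exceeds its limit. It is not the Management Pack's implementation. The function names, the folder paths you pass in, and the threshold values are all hypothetical.

```python
import os


def count_files(inbox_dir):
    """Count regular files directly inside inbox_dir (subfolders are not descended)."""
    return sum(1 for entry in os.scandir(inbox_dir) if entry.is_file())


def inbox_alerts(thresholds):
    """Return one alert string per inbox whose file count exceeds its limit.

    thresholds maps an inbox folder path to the maximum acceptable file count.
    Both the paths and the limits are assumptions supplied by the caller.
    """
    alerts = []
    for inbox_dir, limit in thresholds.items():
        count = count_files(inbox_dir)
        if count > limit:
            alerts.append(f"{inbox_dir}: {count} files exceeds threshold {limit}")
    return alerts
```

Calling inbox_alerts with a dictionary of folder paths and limits returns an empty list when every inbox is within its limit.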

Performance Monitoring

Performance monitoring enables you to watch for trends in backlogs, processing rates, and total counts of key SMS objects to help identify and proactively resolve problems. Performance trending monitors the following trends graphically, by using MOM 2005 Views and Reports:

  • Backlog files in inboxes relating to scheduler jobs, send requests, software metering records, DDRs, hardware inventory, software inventory, and status messages on SMS site servers.

  • Processing rates of scheduler jobs, send requests, software metering records, DDRs, hardware inventory, software inventory, and status messages on SMS site servers.

  • Total number of scheduler jobs, send requests, software metering records, DDRs, hardware inventory, software inventory, and status messages on SMS site servers.

  • Backlogs of messages relating to DDRs, hardware inventory, software inventory, software metering, status messages, and other SMS objects on SMS management point server message queues.

  • Processing rates of DDRs, hardware inventory, software inventory, software metering, status messages, and other SMS objects on SMS management point server message queues.

  • Totals of DDRs, hardware inventory, software inventory, relays, and status messages on SMS management point server message queues.

  • Total number of SQL Server user connections on SMS site database servers.

  • Computer-wide health metrics relating to CPU, paging file, and memory on SMS site systems.

  • Total number of SMS Executive threads that are running.

  • Rate of ISAPI extension requests to IIS.

Critical SMS Status Message Monitoring

The SMS infrastructure generates SMS status messages. It is important to identify the critical server-related SMS status messages and resolve them in a timely manner. These typically indicate problems in the availability, performance, configuration, and overall health of the SMS infrastructure.

Critical SMS status message monitoring generates an alert when status messages critical to SMS site role health are detected (such as component installation, database connectivity, and site connectivity). The SMS site database is scanned for critical SMS status messages once every half hour.

Site System Status Monitoring

SMS monitors the basic health of its site systems periodically through its site system status summarizer, which polls SMS site systems once every hour, on the hour. To benefit from this feature, this status must be acted on quickly to prevent degradation of SMS functionality. Site system status monitoring provides the ability to monitor and generate an alert when SMS site system summarizer status critical to the health of the SMS site system is detected. Some of the monitored status messages include an SMS site system being down, a server running SQL Server that is running out of database space or log space, and SMS site systems running out of physical disk space.

Executive Crash Dump Monitoring

SMS Executive service failures affect SMS availability, so it is important to know when crash dumps occur and to investigate their root causes. When the SMS Executive service fails, it creates a crash dump log subdirectory under the SMS\Logs\CrashDumps directory. Creation of this subdirectory is monitored, and an alert is generated for any SMS server role that depends on the SMS Executive service.

The SMS 2003 Monitor SMS Executive Crash Dumps script reports only the most recent crash dump. This applies regardless of how long ago the last crash dump occurred, whether the script is running for the first time or SMS Executive has been uninstalled and later reinstalled.
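The idea of reporting only the most recent crash dump can be sketched in Python as follows. This is an illustration only, not the logic of the actual script, whose internals are not described here. It assumes that "most recent" can be judged by the last-modified time of each subdirectory, and the function name and the root path you pass in are hypothetical.

```python
import os


def latest_crash_dump(crash_dump_root):
    """Return the path of the most recently modified subdirectory, or None.

    Assumption: each crash dump is a subdirectory of crash_dump_root, and the
    newest one is the subdirectory with the latest modification time.
    """
    if not os.path.isdir(crash_dump_root):
        return None
    subdirs = [entry for entry in os.scandir(crash_dump_root) if entry.is_dir()]
    if not subdirs:
        return None
    newest = max(subdirs, key=lambda entry: entry.stat().st_mtime)
    return newest.path
```

The function returns None when the directory is missing or holds no subdirectories, which corresponds to the case where no crash dump has occurred.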

The related rules exist under the Microsoft SMS 2003 Component Servers based on SMS Executive rule group:

  • SMS 2003 Crash Dumps: Monitoring SMS Executive

    SMS 2003 Crash Dumps: Monitoring SMS Executive launches the script using the SMS 2003 Schedule every 60 minutes synchronize at 00:10 timed event provider. Summary information is provided on the Knowledge property page of the rule.

  • SMS 2003 Crash Dumps: Monitoring SMS Executive script error

    SMS 2003 Crash Dumps: Monitoring SMS Executive script error monitors for and alerts on any script error events that might be raised. There is only one script error event, number 1102, which reports all script errors. Information about resolution of the alert is provided on the Knowledge property page of the rule.

  • SMS 2003 Crash Dumps: SMS Executive crash

    SMS 2003 Crash Dumps: SMS Executive crash monitors for and alerts on the crash dump event. Information about resolution of the alert is provided on the Knowledge property page of the rule.

Backlog Monitoring

Backlogs create delays that affect customer service level agreements (SLAs) in terms of time, performance, and data accuracy. It is important to promptly detect and resolve critical performance problems in the SMS infrastructure. An alert is generated when a backlog is above a threshold over a given period of time for:

  • The total number of SMS files [such as scheduler jobs; send requests; software metering records; discovery data records (DDRs); and software inventory and hardware inventory records] in SMS site server inboxes. For more information, see the Inbox Monitoring section in this guide.

  • The total number of SMS objects [such as scheduler jobs; send requests; software metering records; discovery data records (DDRs); and software inventory and hardware inventory records] in SMS site server queues.

  • The total number of SMS messages relating to status messages, DDRs, software inventory, and hardware inventory in SMS management point message queues.

General Health Monitoring

General health monitoring generates an alert when a health metric on an SMS server is above a certain threshold over a period of time. General health monitoring checks the following metrics:

  • Processor Time at 95% over three hours

    • Smsexec

    • Ccmexec

    • Total

  • Paging File Usage at 98% over three hours
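A check for a metric that stays at or above a threshold over a period of time can be sketched in Python as shown below. This is a minimal sketch under stated assumptions, not the Management Pack's rule logic. It assumes that every sample in the window must be at or above the threshold and that the sample history must reach back at least as far as the start of the window. The function name and the sample format are hypothetical.

```python
from datetime import datetime, timedelta


def sustained_breach(samples, threshold, window, now):
    """Return True if the metric stayed at or above threshold for the whole window.

    samples is a list of (datetime, float) pairs in any order.
    Assumption: a breach requires every sample inside the window to be at or
    above the threshold, and at least one sample at or before the window start
    so that the history covers the full window.
    """
    start = now - window
    in_window = [value for ts, value in samples if start <= ts <= now]
    has_full_history = any(ts <= start for ts, _ in samples)
    return (
        bool(in_window)
        and has_full_history
        and all(value >= threshold for value in in_window)
    )
```

For the metrics listed above, the threshold would be 95 or 98 and the window would be timedelta(hours=3).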

SMS Patch Management Monitoring

SMS Administrators depend on SMS patch management features to keep client and server computers in their enterprises current with the most recent security updates. Patch management monitoring detects any failure in the SMS patch management processing and configuration. SMS patch management monitoring generates an alert when:

  • The SMS patch management synchronization process fails.

  • The SMS patch management process fails to update SMS distribution points with the updates.

SMS Site Backup Monitoring

Knowing when SMS 2003 Backup started, completed, or failed can help SMS administrators audit backup activities. You can schedule SMS Backup to run periodically to back up the data, policies, and configuration of SMS. It is important to monitor failures of this backup process and to take corrective action. Even when SMS Backup succeeds, it affects service availability because this process stops and starts the SMS Site Component Manager, SMS Executive, and SMS SQL Monitor services. Monitoring SMS Backup helps track known downtimes in these services because alerts about these services stopping are suppressed during the backup process. The following rules support monitoring SMS Site Backup:

  • SMS 2003 service stopped running: SMS_SITE_BACKUP

  • SMS 2003 service started running: SMS_SITE_BACKUP

  • SMS 2003 Status: Site Backup completed successfully

  • SMS 2003 Status: Site Backup failed
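The relationship between backup start and stop events and the stop events of other services can be sketched in Python as follows. This is an illustration only, not how the rules are implemented. The event tuple format, the action strings "started" and "stopped", and the function names are hypothetical. Only the SMS_SITE_BACKUP service name comes from the rule names above.

```python
from datetime import datetime

BACKUP = "SMS_SITE_BACKUP"


def backup_windows(events):
    """Pair each backup 'started' event with the next backup 'stopped' event.

    events is a list of (datetime, service_name, action) tuples.
    Returns a list of (start, end) pairs, one per completed backup run.
    """
    windows, start = [], None
    for ts, service, action in sorted(events):
        if service != BACKUP:
            continue
        if action == "started":
            start = ts
        elif action == "stopped" and start is not None:
            windows.append((start, ts))
            start = None
    return windows


def stops_outside_backup(events):
    """Return (timestamp, service) for service stops that no backup window explains."""
    windows = backup_windows(events)
    return [
        (ts, service)
        for ts, service, action in sorted(events)
        if service != BACKUP
        and action == "stopped"
        and not any(begin <= ts <= end for begin, end in windows)
    ]
```

Subtracting the start of a window from its end gives the backup duration, and the stops returned by stops_outside_backup are the ones that would still warrant an alert under this sketch.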

Product Knowledge Content for All Alerts

Administrators can reference product knowledge content for all alerts to assist them in specifically identifying a problem and resolving it. To access product knowledge content, right-click any rule name, click Properties, and then click the Product Knowledge tab.

Computer Groups for All SMS 2003 Server and Client Roles

Computer group resolution is now set to specific SMS server and client roles. This provides administrators with greater control in managing rule replication to MOM agents. Administrators can disable the default computer groups or define new computer groups based on the default computer groups and computer attributes.

Expanded Monitoring

The rules provided in the SMS 2003 Management Pack cover basic service, performance measuring, and performance threshold monitoring for the SQL Server, IIS, and Microsoft Windows® 2000 and Windows Server® 2003 Base Operating System Management Packs. You can enable base support in the SMS 2003 Management Pack for these Management Packs, or you can retain the disabled settings and enable the appropriate rules in other Management Packs.

  • Use the SQL Server Management Pack to expand monitoring of SMS site database servers and management points that have SQL Server dependencies.

  • Use the IIS Management Pack to monitor IIS on reporting points, management points, and server locator points. The IIS Management Pack can also monitor distribution points with BITS enabled.

  • Use the Microsoft Windows 2000 and Windows Server 2003 Base Operating System Management Pack to monitor basic Windows components such as CPU, disk, and memory statistics.

Other Management Packs, for which the SMS 2003 Management Pack does not provide rules, can also enhance the monitoring of your SMS 2003 infrastructure.

  • Use the Windows Network Load Balancing Management Pack if you are using Network Load Balancing to support multiple management points in your SMS hierarchy for scalability reasons.

  • Use the Windows Active Directory® Directory Management Pack to monitor the health of your Active Directory infrastructure.

If you deploy multiple Management Packs in your environment, you might receive redundant alerts based on rules that exist in both Management Packs. If you disable redundant monitoring rules, you might be affecting the accuracy of the state view for your deployed Management Packs. You should carefully consider the trade-off between receiving redundant alerts and having state monitoring work for each Management Pack that you deploy. Depending on which additional Management Packs you have deployed, you should review these redundant rules to identify any rules that you might want to enable in your environment. After installing these Management Packs, confirm that they are customized as necessary, configured properly, and have the appropriate rules enabled.

Software Distribution Program Run Status

The SMS Advanced Client can report program failures through the Management Pack.

Administrators can select the Generate MOM alert if this program fails check box in the MOM dialog box of the Program Properties page for a software distribution program.
