Scripts for SMS 2003

The SMS 2003 Management Pack contains Operations Manager 2007 scripts that run automatically on the agent and scripts that need to be run by an administrator. Additional information about some of these scripts follows in this section.

The following scripts are run by the Operations Manager 2007 agent automatically:

  • SMS 2003 Monitor SMS Status Messages

  • SMS 2003 Monitor Site System Summarizer

  • SMS 2003 Service Discovery – Client

  • SMS 2003 Service Discovery – Server

  • SMS 2003 Site Backup Event Suppression

  • SMS 2003 Monitor Management Point Health

  • SMS 2003: Monitor Management Point Availability

  • SMS 2003 Site Hierarchy Discovery

  • SMS 2003 Monitor SMS Inbox

  • SMS 2003 Monitor SMS Executive Crash Dumps

SMS 2003 Monitor SMS Status Messages

Many problems in SMS are detected and reported internally through the SMS Status System. SMS components raise status messages about conditions of interest to an SMS administrator.

The status messages flow into the database for an SMS site and also up the site hierarchy into the parent site’s database. It is possible to monitor the SMS status system on a per-site basis, simply by querying the SMS status message table of a specified site’s database.

A set of status messages that reflect the events most critical to SMS administrators has been chosen for monitoring. When one of these messages is found in the database, an alert is generated.

This script does the following:

  • Checks if it is running under agentless-managed mode. If so, the script will exit without taking any further action.

  • Initializes global variables.

  • Opens and reads the RecordID of the last status message processed for each SMS site database being monitored from the SMS 2003 Monitor SMS Status Messages.VarSet file.

  • Processes status messages found in each SMS Site Database on the local computer.

  • Writes the RecordID of the last status message processed, for each SMS Site Database being monitored, to the SMS 2003 Monitor SMS Status Messages.VarSet file and closes the file.
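The per-database bookkeeping in the steps above can be sketched as follows. This is a minimal illustration, not the management pack's actual script: the input dictionary keys (RecordID, MessageID, SiteCode) and the function name are hypothetical, and only three of the event fields described in this section are filled in.

```python
def process_new_messages(messages, last_record_id, monitored_ids):
    """Return (events, new_last_record_id) for one site database.

    messages: iterable of dicts with the hypothetical keys "RecordID",
    "MessageID", and "SiteCode" (stand-ins for rows of the status
    message table).
    """
    events = []
    newest = last_record_id
    for msg in sorted(messages, key=lambda m: m["RecordID"]):
        if msg["RecordID"] <= last_record_id:
            continue  # already handled on an earlier run
        newest = msg["RecordID"]
        if msg["MessageID"] in monitored_ids:
            # Event ID is set to the status message ID being reported.
            events.append({
                "EventNumber": msg["MessageID"],
                "Category": msg["SiteCode"],
                "Parameter7": msg["RecordID"],
            })
    return events, newest
```

The returned RecordID is the value that would be persisted for the next run, so that each status message is examined only once.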

The SMS 2003 Monitor SMS Status Messages script raises events with an Event ID set to the SMS Status Message ID being reported on. Table S.1 describes the format of the raised event.

Table S.1   Format of Events Raised by SMS 2003 Monitor SMS Status Messages Script

Parameter          Description
EventType          Success, Error, Warning, or Information, depending on the SMS status message severity
EventNumber        SMS Status Message ID
Category           SMS site code
LoggingDomain      Resource domain of computer
LoggingComputer    NetBIOS name of computer
Parameter1         NetBIOS name of computer
Parameter2         SMS module name (for example, SMS Server)
Parameter3         SMS component name
Parameter4         SMS Message ID
Parameter5         Win32 error code
Parameter6         Parent or top-level site code
Parameter7         RecordID of status message in the SMS Status Message Table
Description        "A new monitored SMS status message on machine <NetBIOSName> from component <Component Name> with message ID <Message ID> was found in the site <Site Code> database."

The SMS 2003 Monitor SMS Status Messages script also reports script errors through Status Message ID number 1102:

EventType = Warning

EventNumber = 1102

Description = The script SMS 2003 Monitor SMS Status Messages running under processing rule <ScriptProcessingRuleName> encountered a runtime error. The error message will contain the text “Failed to <ErrorDescription>.”

All SMS 2003 status message monitoring event rules begin with the phrase “SMS 2003 Status:” and are located in the SMS Site Database Servers rule group.

  • SMS 2003 Status: Message Monitoring

SMS 2003 Status: Message Monitoring launches the script using the SMS 2003 Schedule every 30 minutes synchronize at 00:02 timed event provider. Summary information is provided on the Knowledge property page of the rule.

  • SMS 2003 Status: Script error

SMS 2003 Status: Script error monitors for and alerts on any script error events that might be raised. There is only one script error event, number 1102, which reports all script errors. Information about resolution of the alert is provided on the Knowledge property page of the rule.

  • SMS 2003 Status: <status message name>

SMS 2003 Status: <status message name> monitors for and alerts on specific status message events. Information about resolution of the alert is provided on the Knowledge property page of the rule.

All alerts have a description of the event in the following format, where $Description$ is the event description:

$Logging Computer$ - "Rule Name". $Description$

The SMS 2003 Monitor SMS Status Messages VarSet file is used to persist the RecordID of the last status message processed, on a per-site-database basis. The VarSet file is located in the Windows\Temp directory. It is a tab-delimited file with the following format:

LastRecordID_DBName<Tab>RecordID#

For example, LastRecordID_SMS_FLA, with a RecordID of 477860 would indicate that the last RecordID processed from the site database named SMS_FLA was number 477860.

When upgrading the management pack from a previous version, the following will be found in the SMS 2003 Monitor SMS Status Messages VarSet file:

LastRecordID<Tab>476743
LastRecordID_SMS_FLA<Tab>477860

The first line, which does not contain a database name, indicates the original version of status message monitoring that did not support multiple database monitoring. The second line, which does contain a database name, SMS_FLA, indicates that the current version of status message monitoring does support multiple database monitoring.

No action is taken to use the first line of data, because in the multiple site database case it is difficult to determine which database it corresponds to. The first line of data is not removed from the VarSet file, because it can be of value in evaluating the new changes and ensuring that status message monitoring begins with the correct RecordID after the upgrade.
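As a rough illustration of the file format and upgrade behavior described above, the following sketch reads VarSet text into a per-database mapping and skips the legacy entry that has no database name. The function names are hypothetical, and the sketch assumes that every line is a name and a value separated by a single tab character.

```python
PREFIX = "LastRecordID_"


def parse_varset(text):
    """Return {database_name: last_record_id} from VarSet file text.

    Assumes each line is "<name><TAB><value>". The legacy entry named
    plain "LastRecordID" (no database name) is skipped, as are blank
    or malformed lines.
    """
    last_ids = {}
    for line in text.splitlines():
        name, sep, value = line.partition("\t")
        if not sep or not name.startswith(PREFIX):
            continue
        last_ids[name[len(PREFIX):]] = int(value)
    return last_ids


def format_varset_line(database_name, record_id):
    """Build one tab-delimited line for a site database."""
    return f"{PREFIX}{database_name}\t{record_id}"
```

With the upgraded file shown above, only the SMS_FLA entry would be returned; the legacy line stays in the file but is not used.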

SMS 2003 Monitor Site System Summarizer

Site System Summarizer

The Site System Status Summarizer is a component thread of the SMS Executive service that maintains status on all defined SMS site systems. By default, it polls once every hour, on the hour, for current status. This polling interval is defined in the SMS Site Control File. It is recommended that the polling interval not be changed. Status is maintained in a table in the SMS site database.

Site System DownSince Status

The Site System Status Summarizer assumes that a site system is down when it cannot contact the site system during one of its polling intervals. This can be due to one of the following factors:

  • The site system is not turned on, not connected to the network, or not functioning properly.

  • The SMS Site System Status Summarizer does not have a connection because no connections are available.

  • The SMS Site System Status Summarizer does not have sufficient access rights to connect to the site system.

  • Network problems are preventing the SMS Site System Status Summarizer from connecting to the site system.

  • The site system has been permanently taken out of service.

Site System Status Summarizer-Based Availability

Site System Status Summarizer status is used for availability reporting by the SMS 2003 Server Availability Script and SMS 2003 Server Availability Report. SMS Server availability measurements utilize the DownSince registry value for determining whether or not a server is available.

Limitations of Site System Status Summarizer-Based Availability

This section explains the limitations of using Site System Summarizer status for availability reporting by the SMS 2003 Server Availability Script and SMS 2003 Server Availability Report. SMS Server availability measurements utilize "DownSince" status for determining whether or not a server is available.

Site System Status Summarizer–based availability monitoring has the following limitations:

  • The Site System Summarizer polling interval is one hour. This interval allows sufficient time for the Site System Status Summarizer to poll all the site systems. For performance reasons, the interval cannot be changed. The summarizer’s time-out interval is controlled by the Startup Schedule property in the Site Control File, not by the Wakeup Interval property. This means that changing the Wakeup Interval property from its default value of 60 has no effect. In addition, changing the Startup Schedule has no effect. By default, it is set for every 15 minutes, and the Site System Status Summarizer polls site systems once per hour, on the hour, regardless of how the Startup Schedule is set.

  • The availability methodology considers a site unavailable only if two successive SMS 2003 Site System Summarizer: Site System is possibly down events occur over a period of two hours. This is designed to account for temporary network outages. As a result, each such outage is counted as two hours of downtime, even if the site was unavailable for only a short time.

  • SMS can be unavailable in many ways that the Site System Status Summarizer does not monitor. For example, the Site System Status Summarizer does not consider SMS unavailable if SMS services are not running or the SMS site database is unavailable, as when the SQL Server service is stopped.

  • Availability data is impacted for computers hosting multiple SMS server roles. Downtime is calculated based on alerts generated for a site system being down. A single alert has a two-hour downtime weighting. The Site System Status Summarizer writes the current "DownSince" status for each site object and site system role hosted on a computer to the database, in the Site System Status Summarizer table.

The Site System Status Summarizer monitoring script examines each site system role in the Site System Status Summarizer table. If a site system is marked as down, it raises an event. An alert is subsequently generated based on that event. Multiple roles can generate multiple alerts, which will create extended downtime durations. This can be reflected as negative availability percentages, depending upon the interval of interest.

Each SMS server role, except for an SMS distribution point, establishes two SMS server roles by default, one for the specific role and the other for the SMS component server. For example, in the case of a site system hosting both a server locator point and a reporting point, if the server was offline for more than two hours (two Site System Status Summarizer polling intervals), the Site System Status Summarizer would mark both of these roles, and their SMS component server roles, as down.

However, only three alerts, one for each server role and one for the physical component server hosting two server roles, would be raised per polling interval for the two-hour period. This would produce a negative availability result because six hours would be marked as unavailable for the two-hour time period.

  • The 60-minute schedule, offset from the hour by 10 minutes, has been set as default so that the SMS 2003 Monitor Site System Summarizer script runs 10 minutes after the beginning of the Site System Status Summarizer polling interval of 1 hour. If the summarizer is taking longer than 10 minutes to complete its cycle, the timed event provider offset should be increased appropriately. Otherwise, the most current site system status messages will be missed.
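The multiple-role limitation described above comes down to simple arithmetic, sketched here. This is one reading of the example rather than the management pack's own calculation: the role name strings are illustrative labels, and the assumption is that the shared component server role produces a single alert no matter how many roles on the computer reference it.

```python
HOURS_PER_ALERT = 2  # downtime weighting of a single alert


def downtime_hours(down_roles):
    """down_roles: role names marked down for one computer.

    Duplicate names (the shared component server role) count once.
    """
    return len(set(down_roles)) * HOURS_PER_ALERT


def availability_percent(down_hours, window_hours):
    """Availability as a percentage of the window; can go negative."""
    return (window_hours - down_hours) / window_hours * 100


roles = [
    "server locator point", "component server",
    "reporting point", "component server",
]
hours = downtime_hours(roles)
print(hours, availability_percent(hours, 2))
```

Three distinct roles give six hours of recorded downtime against a two-hour window, which is why the computed availability falls below zero.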

SMS 2003: Monitor Management Point Availability

The SMS 2003 Monitor Management Point Availability script supports the Management Point Availability Report. Availability is based on the health state of a management point and leverages the following health state value found in the registry:

DWORD value HKEY_LOCAL_MACHINE\Microsoft\SMS\MP\MPHealthState

Based on a configurable schedule and reporting interval, the script reads and then reports, through an Operations Manager 2007 event, the duration that the state is unhealthy (registry value is 1, not 0).

These availability data events are written to the OperationsManager database and to the OperationsManagerDW database. An Operations Manager 2007 report can then be run that displays the total time that the Management Point was down during a specified interval along with an availability percentage.

The MPHealthState registry value is maintained by the MP Control Manager. It is updated every six minutes.

This script performs the following actions:

  • Checks whether it is running under agentless managed mode. If so, the script exits without taking further action. The script uses a local VarSet file for persisting variables, so the Operations Manager 2007 agent must be installed.

  • Initializes global variables.

  • Obtains the input parameters ScheduledInterval and ReportingInterval, in minutes.

  • Verifies that the input parameters are not empty, 0, or negative. If they are, a script error event is raised, indicating that a non-valid value was specified. The script also verifies that the ReportingInterval value is not less than the ScheduledInterval value. If it is, a script error event is raised indicating that ReportingInterval cannot be less than ScheduledInterval. Both values are of type Long and have a maximum value of 2147483647 (2G - 1).

  • Opens and copies the contents of the SMS 2003 Monitor Management Point Availability.VarSet file into a local collection. If the file does not exist, no script error event is raised. If the file does exist but can’t be opened, a script error event is raised.

  • Checks the availability of the Management Point.

    • If the current SMS version is less than 2.50.3067.0000, the script exits with no error raised because the MPHealthState registry value is not present. The Agent Response log file, AgentResponses-Configuration Group Name.log, will contain the statement "Monitoring Management Point health is not supported on this SMS build." Logging is enabled by setting the EnableActiveDebugging DWORD value, under the HKEY_LOCAL_MACHINE\SOFTWARE\Mission Critical Software\OperationsManager key, to 1.

    • Obtains the current Management Point Health State from registry DWORD value HKEY_LOCAL_MACHINE\Microsoft\SMS\MP\MPHealthState. If the value cannot be obtained, the script exits with no error raised because the MP Control Manager has not yet created it.

    • Obtains the Start Date and Time from the VarSet collection. If this is the first time the script has run and it is the start of a new reporting period, there is no entry. The script initializes the duration and the start date, in the VarSet collection, before exiting with no error raised. It sets the duration to 0 and the start date to the current date and time.

    • For the current reporting interval, if the MPHealthState registry value is 1, the script increments the unavailable duration in the VarSet collection, based on the specified schedule interval.

    • Reports, by creating an availability data event, if the reporting interval has elapsed and the management point has been unavailable. The script resets the duration to 0 and resets the start date in the VarSet collection to the current local time for the next reporting period.

    • If an error occurs while checking the availability of the Management Point, a script error is raised.

Event Number = 1102

Event Type = Warning

Message = "The script 'SMS 2003 Monitor Management Availability' running under processing rule <ProcessingRuleName> encountered a runtime error." CrLf "Failed to check Management Point availability." <ErrorString>

  • Writes the contents of the VarSet collection to the local SMS 2003 Monitor Management Point Availability.VarSet file and closes the file. If the file does not exist, it is created. If no VarSet variables were set, no entries for them appear in the file. If an error occurs writing the file, the script raises a script error.
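The validation and accumulation logic in the steps above might look roughly like the following sketch. It is an approximation under stated assumptions: the state dictionary stands in for the VarSet collection, the key names and function names are hypothetical, and the sketch resets the reporting period whenever the reporting interval has elapsed, which is one possible reading of the description.

```python
from datetime import datetime, timedelta

MAX_LONG = 2147483647


def validate_intervals(scheduled, reporting):
    """Raise ValueError for the invalid parameter cases listed above."""
    for name, value in (("ScheduledInterval", scheduled),
                        ("ReportingInterval", reporting)):
        if value is None or not 0 < value <= MAX_LONG:
            raise ValueError(f"non-valid value specified for {name}")
    if reporting < scheduled:
        raise ValueError(
            "ReportingInterval cannot be less than ScheduledInterval")


def check_availability(state, mp_health_state, now, scheduled, reporting):
    """Update state; return unavailable minutes when a report is due."""
    validate_intervals(scheduled, reporting)
    if "start" not in state:
        # First run: start a new reporting period and exit.
        state["start"] = now
        state["down_minutes"] = 0
        return None
    if mp_health_state == 1:  # 1 means unhealthy
        state["down_minutes"] += scheduled
    report = None
    if now - state["start"] >= timedelta(minutes=reporting):
        if state["down_minutes"] > 0:
            report = state["down_minutes"]
        state["start"] = now
        state["down_minutes"] = 0
    return report
```

Because downtime is accumulated in whole schedule intervals, the reported duration is only as precise as the ScheduledInterval value.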

SMS 2003: Monitor SMS Inbox

SMS does most of its key server-side activities by reading from and writing to files in its inboxes. When SMS inboxes get unusually full, it’s a strong indication that SMS processing is falling behind. This could be for a variety of reasons, including but not limited to bad files, stopped processes, lack of disk space, or unusual software distribution activity level. Such problems are fairly rare, but serious for customers with a substantial number of sites.

A number of the performance counters for SMS inboxes measure the number of items queued by the service component and not the actual number of items waiting to be processed on disk. The SMS service component updates the performance counter only after working through its current queue. As a result, the performance counters might be updated very infrequently, preventing timely detection of serious backlogs. The new SMS 2003: Monitor SMS Inbox script adds direct monitoring by counting files in critical SMS server inboxes, and it alerts the administrator through Operations Manager 2007 when the number of files in a particular inbox exceeds a user-configurable threshold value.

All rules are located in the SMS Site Servers – Common processing rule group, which targets these rules at central, primary, and secondary sites. The threshold passed to the script can be configured by means of an override.

Due to performance considerations, these rules are not targeted at management points or any other SMS site system roles.

Schedule and Optional Configuration

To minimize the performance impact of inbox monitoring, the pertinent rules are synchronized to run once per day. The rules are staggered at 15-minute intervals, beginning at 01:00 AM, so that only one rule runs at a time. For example, the first rule counts the files present in the first inbox at 01:00 AM. The next rule counts the files present in the second inbox at 01:15 AM, and so forth, until all inboxes have been counted.
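The staggering described above can be illustrated with a short sketch that assigns one start time per inbox. The function name and the inbox labels in the usage example are placeholders, not names taken from the management pack.

```python
from datetime import datetime, timedelta


def staggered_start_times(inbox_names, first="01:00", step_minutes=15):
    """Map each inbox to a daily start time, one step apart."""
    start = datetime.strptime(first, "%H:%M")
    return {
        name: (start + timedelta(minutes=i * step_minutes)).strftime("%H:%M")
        for i, name in enumerate(inbox_names)
    }


schedule = staggered_start_times(["inbox-1", "inbox-2", "inbox-3"])
print(schedule)
```

Here the three placeholder inboxes are checked at 01:00, 01:15, and 01:30.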

You can modify this schedule following the usual steps you take in scheduling rule script execution in Operations Manager 2007. No initial configuration is required to activate this feature. Following are optional configuration changes you can perform:

  • Change the default threshold value for each inbox.

  • Change the schedule for individual inbox rules.

These optional configuration changes can be made by following the usual steps to configure threshold event rules in any management pack.

Threshold Alerts and Values

An alert is generated each time a monitoring rule is processed and the file count for its inbox reaches the configured threshold value. A script error alert is generated whenever the script encounters a non-valid threshold value or an error while counting the files in an inbox.

To edit the performance threshold values

  1. In the Operations Manager 2007 Administrator console, click Authoring.

  2. In the Authoring pane, navigate Authoring / Management Pack Objects / Monitors.

  3. In the Monitors pane, change the scope to SMS server.

  4. Navigate SMS Server / Entity Health / MOM 2005 Computer Role Health, and then expand the server role to be edited.

    • DB - Database server

    • MP - Management point server

    • SS - Site server

  5. Double-click the performance threshold monitor rule to be edited. The Performance Threshold Properties dialog box opens.

  6. Select the Configuration tab.

  7. Click Edit…. The Xml Configuration dialog box opens.

  8. In the XML text, edit the threshold value to be modified.

    • Specify the <ValueExpression> nested in <CriticalErrorExpression> to change the critical error threshold.

    • Specify the <ValueExpression> nested in <WarningExpression> to change the warning threshold.

  9. Click OK, and then click OK again. The modified threshold values are distributed to the servers.

The default threshold value for each inbox monitoring rule is set to 10,000. You should set threshold values for the FileCountThreshold parameter of the SMS 2003: Monitor SMS Inbox script as appropriate in your specific environment. Valid threshold values can be from 1 to 2147483647 (0x7FFFFFFF), inclusive. Invalid threshold values are zero (0) or numbers greater than 2147483647, as these will produce negative numbers in the 32-bit range.
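The threshold rules and the file count can be sketched as follows. This is an illustrative approximation, not the script shipped with the management pack: the function names are hypothetical, only files directly inside the folder are counted, and the comparison treats a count strictly greater than the threshold as a breach.

```python
import os

MAX_THRESHOLD = 2147483647  # 0x7FFFFFFF
DEFAULT_THRESHOLD = 10000


def is_valid_threshold(value):
    """Valid thresholds are integers from 1 to 2147483647, inclusive."""
    return isinstance(value, int) and 1 <= value <= MAX_THRESHOLD


def count_files(inbox_path):
    """Count regular files directly inside the inbox folder."""
    with os.scandir(inbox_path) as entries:
        return sum(1 for entry in entries if entry.is_file())


def check_inbox(inbox_path, threshold=DEFAULT_THRESHOLD):
    """Return (file_count, threshold_exceeded) for one inbox."""
    if not is_valid_threshold(threshold):
        raise ValueError(f"non-valid threshold value: {threshold!r}")
    count = count_files(inbox_path)
    return count, count > threshold
```

Rejecting the threshold before touching the file system keeps a configuration mistake from being reported as a backlog.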

For a description of inbox folders and how they are used in Systems Management Server, see the relevant topic at Microsoft Help and Support.

SMS 2003 Monitor SMS Executive Crash Dumps Script

This script is located in the Scripts folder of the SMS Administrator console and is named SMS 2003 Monitor SMS Executive Crash Dumps. The following actions are taken by this script:

  • Determines whether the target computer is running in agentless mode. If so, the script exits without further action. This script uses a local VarSet file for persisting variables, so the Operations Manager 2007 agent must be installed on all target computers.

  • Opens and reads the contents of the local SMS 2003 Monitor SMS Executive Crash Dumps.VarSet file into a collection. If the file does not exist, no script error event is raised. If the file does exist but can’t be opened, a script error event is raised, with a message containing the following text:

Failed to load script variables.

  • Checks whether a new crash dump has occurred since the last time a check was performed. If so, the script raises an event to report the new crash dump. To perform this check, the script does the following:

    • Obtains the SMS Installation Directory from the registry under the SMS Identification key, registry value "HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\SMS\Identification\Installation Directory". If the path cannot be obtained, the script raises a script error event message with the following text and exits:

Failed to read registry value.

  • Constructs the full path for the CrashDumps directory by appending "\Logs\CrashDumps" to the retrieved SMS Installation Directory.

  • Checks whether the CrashDumps directory exists. If it doesn’t, the script logs that the folder’s existence could not be verified and exits without further action. The Agent Response log file, AgentResponses-Configuration Group Name.log, will contain the statement:

Failed to verify existence of crash dump directory: <CrashDumpsPath>.

Logging is enabled by setting the EnableActiveDebugging DWORD value under the HKEY_LOCAL_MACHINE\SOFTWARE\Mission Critical Software\OperationsManager key to a value of 1.

  • Obtains the last crash dump folder creation date from the VarSet collection.

  • Checks whether new crash dumps were created and acts only on the most recent one. To determine which is most recent, the saved folder creation date of the last crash dump is used, if available. Date and time comparison is based on folder creation to a resolution of one second.

  • If a new crash dump occurred or if it is the first time that the script has run, the script logs and creates event 1710, specifying the new crash dump folder and path. The script saves the folder creation date of the crash dump in the VarSet collection. If a crash dump did not occur, that is logged.

  • Writes the contents of the VarSet collection to the local SMS 2003 Monitor SMS Executive Crash Dumps.VarSet file and closes the file. If the file does not exist, it is created. If no VarSet variables were set, no entries for them appear in the file. If an error occurs writing the file, a script error event message with the following text is raised: Failed to save script variables.
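The comparison at the heart of the steps above can be sketched without touching the file system. In this illustration the caller supplies the crash dump folders as (path, creation time) pairs; the function name and that input shape are assumptions, and timestamps are truncated to whole seconds to mirror the one-second resolution mentioned above.

```python
from datetime import datetime


def newest_new_dump(folders, last_seen=None):
    """Return the most recent (path, created) pair newer than last_seen.

    folders: iterable of (path, created) pairs, created as datetime.
    last_seen: saved creation time of the last reported dump, or None
    on the first run. Returns None when nothing new was found.
    """
    def to_second(timestamp):
        return timestamp.replace(microsecond=0)

    candidates = [
        (path, to_second(created))
        for path, created in folders
        if last_seen is None or to_second(created) > to_second(last_seen)
    ]
    if not candidates:
        return None
    # Only the most recent crash dump is acted on.
    return max(candidates, key=lambda pair: pair[1])
```

The creation time in the returned pair is the value that would be saved for the next run.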

SMS 2003 Monitoring SMS Executive Crash Dump Event

The SMS 2003 Monitor SMS Executive Crash Dumps script reports through event 1710 that a crash has occurred, as described below:

EventType = Error

EventNumber = 1710

Category = SMS Site Code

Description = SMS Executive in site “XXX” has crashed. For details, see crash dump information under <Drive>\SMS\Logs\CrashDumps\NewFolder.

SMS 2003 Monitoring SMS Executive Crash Dumps Script Error Event

The “SMS 2003 Monitor SMS Executive Crash Dumps” script reports runtime script errors through event 1102.

EventType = Warning

EventNumber = 1102

Description = The script “SMS 2003 Monitor SMS Executive Crash Dumps” running under processing rule SMS 2003 Crash Dumps: Monitoring SMS Executive encountered a runtime error. CrLf “Failed to <Message>.” ErrorString

SMS 2003 Server Availability Script

SMS is a critical enterprise application for desktop configuration management. Customers need to track its service availability. The Management Pack provides a sample Microsoft SQL Server™ script for computing SMS site system availability based on alerts that are raised in response to a down status from the SMS Site System Status Summarizer. Availability can be determined over any number of specified days. This metric can be used in support of established service level agreements (SLAs).

Limitations of the SMS 2003 Server Availability Script

The SMS 2003 Server Availability Script has the following limitations.

The displayed data is limited by the Operations Manager 2007 database grooming interval. By default, the script returns data for the last seven days and for the last 30 days. However, the OperationsManager database has a grooming interval of 4 days, so running the script with the seven-day and 30-day intervals returns the same results. If reports covering longer periods are required, either increase the grooming interval or use the SMS 2003 Server Availability Report, which runs against the OperationsManagerDW database, the long-term store for operational data. To report on shorter periods, change the 7-day and 30-day intervals in the script.

Script Algorithm

Once every hour, the site system summarizer attempts to contact each site system. If it is unable to contact a site system, it lists that computer as down in the Site System Status Summarizer table. To reduce the number of false reports for this condition, the Management Pack consolidates events and raises an alert only if there are two events in a two-hour period. When you run the stored procedure, it calculates the number of “Site System is possibly down” alerts that are raised during the specified number of days, subtracts the downtime from the total time available, and returns a percentage of availability for that period. If a computer has not logged any “Site System is possibly down” alerts during the specified interval, its calculated availability is 100%.

You do not need to run the script more frequently than every seven days, unless you want to see a daily trend.

By default, the script returns data for the last seven days and for the last 30 days. By specifying a longer data interval, a more accurate estimate of the server uptime is provided. Do not set the intervals to a time period longer than the Operations Manager 2007 database grooming cycle.

The following examples show how the server availability goes up as the data is averaged over a longer time period:

  • If there are two events generated for a computer within two hours, one alert is generated. If one alert is generated within the last seven days, the calculations are as follows:

    • Total number of hours in seven days = 7 * 24 = 168

    • Total number of hours down = 1 * 2 = 2

    • Total number of hours up = (168 - 2) = 166

    • Percentage availability = (166 / 168) * 100 = 98.81%

  • If one alert is generated within the last 30 days, the calculations are as follows:

    • Total number of hours in 30 days = 30 * 24 = 720

    • Total number of hours down = 1 * 2 = 2

    • Total number of hours up = (720 - 2) = 718

    • Percentage availability = (718 / 720) * 100 = 99.72%

  • If one alert is generated within the last 84 days, the calculations are as follows:

    • Total number of hours in 84 days = 84 * 24 = 2016

    • Total number of hours down = 1 * 2 = 2

    • Total number of hours up = (2016 - 2) = 2014

    • Percentage availability = (2014 / 2016) * 100 = 99.9%

The following examples show how the server availability changes as more alerts are logged over the same time period:

  • If two alerts are generated within the last 30 days, the calculations are as follows:

    • Total number of hours in 30 days = 30 * 24 = 720

    • Total number of hours down = 2 * 2 = 4

    • Total number of hours up = (720 - 4) = 716

    • Percentage availability = (716 / 720) * 100 = 99.44%

  • If three alerts are generated within the last 30 days, the calculations are as follows:

    • Total number of hours in 30 days = 30 * 24 = 720

    • Total number of hours down = 3 * 2 = 6

    • Total number of hours up = (720 - 6) = 714

    • Percentage availability = (714 / 720) * 100 = 99.17%
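The worked examples above all follow one formula, which the following sketch reproduces. It is written in Python for illustration only and is not the SQL Server script supplied with the Management Pack; the function name is hypothetical, and the two-hour weighting per alert is taken from the text.

```python
def availability_percent(alert_count, days, hours_per_alert=2):
    """Percentage of the period not covered by alert-based downtime."""
    total_hours = days * 24
    down_hours = alert_count * hours_per_alert
    return (total_hours - down_hours) / total_hours * 100


for alerts, days in [(1, 7), (1, 30), (1, 84), (2, 30), (3, 30)]:
    percent = availability_percent(alerts, days)
    print(f"{alerts} alert(s) in {days} days: {percent:.2f}%")
```

Holding the alert count fixed while lengthening the period raises the result, and adding alerts over a fixed period lowers it by two hours per alert.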

The polling interval of the Site System Status Summarizer is fixed at one hour and cannot be changed through the SMS Administrator console. Modifying this interval does not improve the accuracy of this script. Changing the event consolidation interval is possible, but not recommended because it does not improve the accuracy of the result. In fact, reducing the consolidation interval makes the results less reliable because the two-hour consolidation interval reduces false results due to a momentary network outage.
