Part 4 - Best Practices

By Brian Wren, Senior Consultant, Microsoft Consulting Services (Southern California)

This article is the final part of a four-part series on writing scripts in Microsoft Operations Manager (MOM). The theme of the series is to compare scripts in MOM to those written for Windows Script Host in order to leverage the large amount of knowledge and materials already focused on scripts for Windows Script Host.

Part 1 - The Basics

Introduces the concepts behind MOM scripts compared to those in Windows Script Host and the common and different objects used by each. You will learn how to get output data from a script into the MOM workflow.

Part 2 - Getting Data into a Script

Focuses on getting data into a MOM script. This includes working with parameters and retrieving information from the MOM object that launches the script.

Part 3 - Writing and Debugging

Insight into the logistics of writing and debugging a script in MOM. This article covers the use of different editors and utilities for performing these functions.

Part 4 - Best Practices

Discussion of best practices and answers to general questions, such as: When should you generate an alert as opposed to an event with a MOM script? How complex should a MOM script be before you break it into multiple parts? What about security?

The first three parts of this series provided an overview of scripting in MOM, focusing on the core information to get you started. This last part will provide some best practices around using the methods that we have already discussed and will also discuss advanced topics that just didn’t fit anywhere else in the series.

Create Events or Alerts

The most common use of scripts in MOM is to identify an issue on an agent computer and generate an alert. At a high level there are two strategies for performing this function: you can issue an alert directly from the script using ScriptContext.CreateAlert, or you can have the script generate an event with ScriptContext.CreateEvent and rely on a rule to detect that event and generate an alert.

While it may seem like the more complex approach, I recommend that you have the script generate an event. Back in the MOM 2000 days, this was an easy recommendation since alert suppression did not work on alerts generated from a script. MOM 2005 has eliminated this limitation, but for several reasons the recommendation still stands.

The first reason is that you are limiting the amount of logic being performed in the script. As I will argue in the next topic on keeping your scripts small and targeted, the script should be no more complex than necessary. It should simply perform any required actions, collect any required data, and then return what it found without making any attempt to interpret that data.

When you generate an alert, you are interpreting the results of collected data. This includes determining whether the information is important enough to warrant the generation of an alert in the first place and then what severity level the alert should be assigned. Interpretation of data is generally performed by processing rules in MOM.

The best strategy is to have your script create an event with a unique event code for each type of output that it may generate. Then create a processing rule to look for each event code and generate an appropriate alert. If that logic changes, any MOM Author can change it without having to modify the script. It’s much easier to modify a rule than a script, and most companies have limited resources for scripting.

The other reason for this strategy is that it makes your scripts more flexible. For example, in Part 1 we presented a script that collects the size of a file as performance data. Such a script might be created in response to a requirement to detect whether a particular file exceeded a given size. This would often be implemented by checking the value of the file size against a specified size (which might be accepted as a parameter) and then immediately generating an alert if the size were exceeded. The problem with this implementation is that it would lock the script into that specific scenario. If implemented to create events rather than to generate alerts, the simple script that we wrote could be used for multiple scenarios: tracking the size of a file, reporting if a file exceeded a particular size, or even reporting if a file is less than a certain size. Heck - with a little more code, we could even have it report that the file is missing. That one script could be leveraged for a variety of scenarios, as opposed to creating a new script for subtly different requirements. This concept is illustrated in Figure 1.

Figure 1 - Recommended Method to Generate Alert from a Script
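To make the idea concrete, here is a minimal sketch of a file check that reports what it finds as events and leaves the interpretation to processing rules. The event numbers (200 and 201), the FilePath parameter name, and the "File Monitoring" source are all assumptions chosen for illustration; they are not taken from the Part 1 script. Because ScriptContext is supplied by MOM, the sketch is meant to run as a script response rather than from the command line.

```vbscript
Const EVENT_TYPE_INFORMATION = 4

' Hypothetical event numbers; pick values that are unused in your environment.
Const EVENT_FILE_SIZE    = 200
Const EVENT_FILE_MISSING = 201

' "FilePath" is a hypothetical script parameter name.
strFilePath = ScriptContext.Parameters.Get("FilePath")

Set objFSO = CreateObject("Scripting.FileSystemObject")

' Report what was found; do not decide here whether it is a problem.
If objFSO.FileExists(strFilePath) Then
    Set objFile = objFSO.GetFile(strFilePath)
    SubmitEvent EVENT_FILE_SIZE, strFilePath & " is " & objFile.Size & " bytes."
Else
    SubmitEvent EVENT_FILE_MISSING, strFilePath & " was not found."
End If

Sub SubmitEvent(intEventNumber, strEventMessage)
    Set objEvent = ScriptContext.CreateEvent()
    objEvent.EventNumber = intEventNumber
    objEvent.EventType = EVENT_TYPE_INFORMATION
    objEvent.EventSource = "File Monitoring"
    objEvent.Message = strEventMessage
    ScriptContext.Submit objEvent
End Sub
```

One rule could then alert on event 201, while another evaluates the size reported in event 200. Neither change would require touching the script.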

Small Scripts or Big Ones

There is a tendency to continually add to a script as requirements are extended. When you do this, it doesn’t take long before you end up with multiple activities being performed in a single long and complex script. There’s nothing particularly wrong with a long script - you’re probably never going to hit the MOM limit for script size (not in MOM 2005 anyway). Continuing with the message emphasized in the previous topic, the more concerning factor is the flexibility of that script in MOM.

Keep in mind that we are allowed to have multiple responses to a single MOM rule, and these responses will be executed in succession. Rather than a single long script, you could have multiple short scripts that execute one after the other. The advantages to this strategy are that the scripts are easier to maintain and are more flexible for use with other rules.

In Part 2, we wrote a script that would retrieve a user account and then disable that account. I mentioned that it would be useful to extend that script to log off the current user as well. Rather than extend the existing script, it would be more effective to write a separate script that simply logs off the current user, then launch that as a second response to the same rule.

The following script, found in the Windows 2000 Scripting Guide, shuts down the current computer:

Const SHUTDOWN = 1
strComputer = "."
' The Shutdown privilege is requested inside the same braces as the impersonation level.
Set objWMIService = GetObject("winmgmts:" _
 & "{impersonationLevel=impersonate,(Shutdown)}!\\" & strComputer & "\root\cimv2")
Set colOperatingSystems = objWMIService.ExecQuery _
 ("SELECT * FROM Win32_OperatingSystem")
For Each objOperatingSystem in colOperatingSystems
 objOperatingSystem.Win32Shutdown(SHUTDOWN)
Next

We can easily modify this script to log off the current user. First we need to change the constant that is passed to the Win32Shutdown method to do a logoff instead of a shutdown. Also, we don’t need to specify the shutdown privilege on the GetObject moniker since we are only doing a logoff. These slight modifications result in the following:

Const FORCE_LOGOFF = 4
strComputer = "."
Set objWMIService = GetObject("winmgmts:" _
 & "{impersonationLevel=impersonate}!\\" & strComputer & "\root\cimv2")
Set colOperatingSystems = objWMIService.ExecQuery _
 ("SELECT * FROM Win32_OperatingSystem")
For Each objOperatingSystem in colOperatingSystems
 objOperatingSystem.Win32Shutdown(FORCE_LOGOFF)
Next

Rather than paste this code at the end of our Disable a User script, let’s save it as a separate script and then launch it as a second response. There may be other scenarios where we might want to force a logoff of the current user, and this strategy allows the script to be reused in those scenarios.

Figure 2 shows the two scripts being launched from a single rule.

Figure 2 - Calling Multiple Scripts from One Rule

State Variables

One challenge you might run into if you follow the previous section’s recommendation for keeping scripts small is passing data between scripts. If you use multiple scripts together with a single processing rule, the second script may need data generated by the first. In that case you might be tempted to merge the scripts together into a single script so you can freely pass variables around. It’s not as if one script can access the variables from another script…or can it?

Scripts in MOM have a limited life. When a script completes execution, all of its variables are cleared from memory, and other than any MOM objects that the script created (events, alerts, and performance data) all indications that the script executed disappear. Even if the same script executes again at a later time, it has no ready means of retrieving information from its previous execution. There are scenarios where it would be nice to store one or more values that the script can access on its next execution or that might be available to other scripts. Fortunately, there is a way to do this: use state variables.

Whereas regular variables in a script are cleared from memory as soon as the script completes execution, state variables are stored in memory for the life of an agent. State variables can be created and have their value changed either from a rule response or from a script, but only a script can actually use a state variable value.

Figure 3 - State Variables

Let’s illustrate the core concepts and use of state variables using our Terminate a Process script from Part 2. Suppose you wanted to track how many times a particular process had been terminated. We can store the number of times that the process was terminated in a state variable and increment it each time the script runs. That way, we can report that number in our event and potentially take different action based on the running count.

State variables come in variable sets. The first step in creating a variable set is to create a Script State object using ScriptContext.GetScriptState. Next, create the variable set with GetSet (which gets an existing variable set or creates it if it doesn’t already exist) and then use the Get and Put methods to access each variable within the set. I’ll demonstrate each of these steps for our example scenario.

The following code will create a variable set for our example. I’ll use the name of the script for the name of the variable set in order to clearly distinguish it from sets used by other scripts.

strVarSetName = "Terminate A Process"
Set objScriptState = ScriptContext.GetScriptState
Set objVarSet = objScriptState.GetSet(strVarSetName)

Once we have a variable set object (objVarSet in our example), we can use the Get and Put methods to use state variables within the set. In this example, we want to track the number of times that we have detected and terminated a specified process. It would be a good idea to create a variable name based on the name of the process. We modified the script in Part 2 to allow the process name to be specified in a script parameter. By using that process name as the variable name, we can ensure that the script maintains a distinct count for each process the script may be used for. There is no need to worry about actually creating the variable. If we try to Get it before it has been created, it will return a null value which will act as zero in a mathematical equation. When we Put to it the first time, it will be created.

The following code shows how we can use our variable set object to create a new state variable using the process name and then increment it by one. Remember that strProcessName holds the name of the process, which we use as the name of the state variable.

intCount = objVarSet.Get(strProcessName)
intCount = intCount + 1
objVarSet.Put strProcessName,intCount

Keep in mind that this state variable will retain its value for the life of the agent. If the computer is rebooted or the MOM agent is restarted, all state variables are cleared. In that case, our count would reset to zero.
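If you would rather not depend on how a missing variable behaves in arithmetic, you can normalize the value before incrementing it. The following is a minimal sketch under that assumption. It uses only the state variable methods already shown, plus the standard VBScript IsNumeric and CLng functions, and it treats anything that is not numeric as a count of zero.

```vbscript
strProcessName = ScriptContext.Parameters.Get("ProcessName")

Set objScriptState = ScriptContext.GetScriptState
Set objVarSet = objScriptState.GetSet("Terminate A Process")

' Read the stored value into a variant first so it can be inspected.
varStored = objVarSet.Get(strProcessName)

' IsNumeric is True for numbers and for an uninitialized (Empty) value,
' and False for Null or arbitrary text, so this covers the first run
' and the first run after an agent restart.
If IsNumeric(varStored) Then
    intCount = CLng(varStored)
Else
    intCount = 0
End If

intCount = intCount + 1
objVarSet.Put strProcessName, intCount
```

The extra lines cost very little and make the starting value explicit to the next person who reads the script.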

Let’s put this new code into our Terminate a Process script. For the purposes of our example, we’ll simply use the count value in our event message. We could, of course, use it like any other variable to launch alternative code or even store it as performance data.

Const EVENT_TYPE_SUCCESS = 0
Const EVENT_TYPE_ERROR   = 1
Const EVENT_TYPE_WARNING = 2
Const EVENT_TYPE_INFORMATION = 4
Const EVENT_TYPE_AUDITSUCCESS = 8
Const EVENT_TYPE_AUDITFAILURE = 16

strProcessName = ScriptContext.Parameters.Get("ProcessName")
bolGenerateEvent = CBool(ScriptContext.Parameters.Get("GenerateEvent"))

strVarSetName = "Terminate A Process"
Set objScriptState = ScriptContext.GetScriptState
Set objVarSet = objScriptState.GetSet(strVarSetName)

strComputer = "."
Set objWMIService = GetObject("winmgmts:" _
    & "{impersonationLevel=impersonate}!\\" & strComputer & "\root\cimv2")

Set colProcessList = objWMIService.ExecQuery _
    ("Select * from Win32_Process Where Name = '" & strProcessName & "'")

For Each objProcess in colProcessList
        objProcess.Terminate()
Next

intCount = objVarSet.Get(strProcessName)
intCount = intCount + 1
objVarSet.Put strProcessName,intCount

If bolGenerateEvent = True Then
        CreateEvent 100, EVENT_TYPE_INFORMATION, "Process Monitoring", _
            "Terminated " & colProcessList.Count & " instances of process " _
            & strProcessName & "." & vbCrLf _
            & "This process has been detected and terminated " _
            & intCount & " times since the last restart."
End If

Sub CreateEvent(intEventNumber,intEventType,strEventSource,strEventMessage)
        Set objEvent = ScriptContext.CreateEvent()
        objEvent.EventNumber = intEventNumber
        objEvent.EventType = intEventType 
        objEvent.EventSource = strEventSource
        objEvent.Message = strEventMessage
        ScriptContext.Submit objEvent
End Sub

This script would be executed identically to the method described in Part 2. The only difference with this one is that it will keep a running count of how many times the script has been executed for each process.

This example illustrates a script accessing its own variable from previous script executions. There is no difference in the code for an entirely different script to access this script’s variables. The only requirement is that the other script know the name of the variable and its containing state variable set.
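As an illustration, here is a minimal sketch of a second, separate script that reads the count maintained by Terminate a Process and reports it in an event of its own. The variable set name and the ProcessName parameter match the example above; the event number (101) is an assumption chosen for illustration. Note that the script only reports the value and leaves any decision about it to a rule.

```vbscript
Const EVENT_TYPE_INFORMATION = 4
Const EVENT_TERMINATE_COUNT = 101   ' hypothetical event number

strProcessName = ScriptContext.Parameters.Get("ProcessName")

' Open the same variable set, by name, that the other script writes to.
Set objScriptState = ScriptContext.GetScriptState
Set objVarSet = objScriptState.GetSet("Terminate A Process")

' Treat a variable that has not been written yet as a count of zero.
varStored = objVarSet.Get(strProcessName)
If IsNumeric(varStored) Then intCount = CLng(varStored) Else intCount = 0

Set objEvent = ScriptContext.CreateEvent()
objEvent.EventNumber = EVENT_TERMINATE_COUNT
objEvent.EventType = EVENT_TYPE_INFORMATION
objEvent.EventSource = "Process Monitoring"
objEvent.Message = "Process " & strProcessName & " has been terminated " _
    & intCount & " times since the last restart."
ScriptContext.Submit objEvent
```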

As valuable as state variables are, don’t get too carried away with them. If you want to simply issue an alert in response to a particular event occurring a number of times within a certain time interval, then you should consider using a Consolidation Event Rule. This will be simpler and more reliable, and it places the logic in a rule instead of a script, which follows my previous recommendation.

The other factor to consider is that state variables will exist only for the life of the agent. If you have data that needs to exist beyond that life, then you should determine an alternate storage location such as the registry or a text file - basically the same options you would have with a script using Windows Script Host. You can, of course, place values in MOM objects such as events and performance data, but those objects really aren’t accessible to subsequent scripts running on agents. You can use the SDK to access MOM objects from a script, but this will need to access a Management Server in order to locate the appropriate object and will typically not be as efficient as a simple storage solution on the local agent.
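For the text file option, a minimal sketch might look like the following. The file path is purely hypothetical, the folder is assumed to exist already, and the Action Account is assumed to have permission to write to it. The code uses only the standard Scripting.FileSystemObject, so the same functions work under Windows Script Host as well.

```vbscript
Option Explicit

Const FOR_READING = 1
Const FOR_WRITING = 2

' Hypothetical location; the folder must already exist and be writable
' by the account the script runs under.
Const COUNT_FILE = "C:\MOMScriptData\TerminateCount.txt"

' Returns the number stored in the file, or 0 if the file is missing,
' empty, or does not contain a number.
Function ReadCount(strPath)
    Dim objFSO, objFile, strText
    ReadCount = 0
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    If objFSO.FileExists(strPath) Then
        Set objFile = objFSO.OpenTextFile(strPath, FOR_READING)
        If Not objFile.AtEndOfStream Then
            strText = Trim(objFile.ReadLine)
            If IsNumeric(strText) Then ReadCount = CLng(strText)
        End If
        objFile.Close
    End If
End Function

' Overwrites the file with the new count, creating the file if needed.
Sub WriteCount(strPath, lngCount)
    Dim objFSO, objFile
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objFile = objFSO.OpenTextFile(strPath, FOR_WRITING, True)
    objFile.WriteLine lngCount
    objFile.Close
End Sub

Dim lngCount
lngCount = ReadCount(COUNT_FILE) + 1
WriteCount COUNT_FILE, lngCount
```

Unlike a state variable, this count survives an agent restart or a reboot, at the cost of a file that you now have to secure and clean up yourself.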

Where Should the Script Run?

When a MOM rule executes a script, that rule must specify whether the script will be executed on the agent itself or on the Management Server. (See Figure 4.) Which one should you pick? Like most answers in the world of computing - it depends.

Figure 4 - Script Run Location

Choosing to run a script on the agent computer is certainly the most common scenario. One of the advantages of having a locally installed agent is that we can execute local actions. This gives us the ability to execute under the LocalSystem account as opposed to requiring the authority for a centralized service to access a variety of remote servers. This is also far more scalable than remotely accessing the managed computer since the Management Server doesn’t have to execute scripts against potentially hundreds of clients. The savings on bandwidth should be obvious.

There may be circumstances though where you want a script to execute on the Management Server. For example, maybe you want a script to execute some action against an agent computer from another computer. Rather than simply testing locally, you can leverage the Management Server to connect to the agent computer over the network, testing the network interface in addition to application functionality.

Another example of when you might want to execute a script on the Management Server is when you need a response to a certain incident to initiate the execution of a script against another server or network device. For example, maybe an alert is generated on a server that indicates a potential problem on an external device. We might have a script that can perform additional analysis but will not be able to access this device without domain credentials. A simple solution is to launch that script in response to the alert but specify that it be executed on the Management Server, which will use the Management Server Action account. This is typically a domain account and should have sufficient privileges to perform the required operation. We will discuss security more in a moment.

Scripts on Agentless Computers

If you have an agentless computer, then all scripts executed against that computer will run on the Management Server. It should be obvious why the script won’t execute on the agent, since there is no agent in this scenario. If you are scripting with agentless computers in your environment, then there are some things you need to consider. We typically assume that a script will be executing against the local computer and will have access to all local resources. If the script is running against an agentless computer though, any local resources will belong to the Management Server where the script will be running.

You can check if a script is executing against an agentless computer with the aptly named IsTargetAgentless property of ScriptContext. It should be pretty obvious what that does - it returns True if the target of the script is agentless and False if it’s not. You may decide not to allow the script to execute if the target is agentless, or you may need to use alternate code to access the target computer remotely. Here is some pseudocode representing this concept:

If Not ScriptContext.IsTargetAgentless Then
    'Execute normal code
Else
    'Execute code for remote connection or simply exit.
End If

You can also take advantage of the TargetComputer property of ScriptContext, which returns the computer name of the target computer whether it’s agentless or not. Several of the examples in this series use WMI connections to the local computer. These scripts could be easily modified to execute against an agentless computer by using the target computer instead of assuming the script is executing on the local computer. One of the advantages of WMI is that it can be executed against a remote computer as easily as the local one.

For example, our List Process Owners script from Part 1 works against the local computer:

strComputer = "."
Set objWMIService = GetObject("winmgmts:" _
    & "{impersonationLevel=impersonate}!\\" & strComputer & "\root\cimv2")

Set colProcessList = objWMIService.ExecQuery("Select * from Win32_Process")

For Each objProcess in colProcessList
    colProperties = objProcess.GetOwner(strNameOfUser,strUserDomain)
    ScriptContext.Echo "Process " & objProcess.Name & " is owned by " _
        & strUserDomain & "\" & strNameOfUser & "."
Next

We could modify this to work against any computer, agentless or not, by simply changing the first line as follows:

strComputer = ScriptContext.TargetComputer
Set objWMIService = GetObject("winmgmts:" _
    & "{impersonationLevel=impersonate}!\\" & strComputer & "\root\cimv2")

Set colProcessList = objWMIService.ExecQuery("Select * from Win32_Process")

For Each objProcess in colProcessList
    colProperties = objProcess.GetOwner(strNameOfUser,strUserDomain)
    ScriptContext.Echo "Process " & objProcess.Name & " is owned by " _
        & strUserDomain & "\" & strNameOfUser & "."
Next
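If you prefer to keep the local connection whenever an agent is present and connect by name only when the target is agentless, the two properties can be combined. This is just a sketch of one possible arrangement; it relies only on the IsTargetAgentless and TargetComputer properties described above.

```vbscript
' Connect locally when an agent is present; otherwise connect to the
' agentless target by name from the Management Server.
If ScriptContext.IsTargetAgentless Then
    strComputer = ScriptContext.TargetComputer
Else
    strComputer = "."
End If

Set objWMIService = GetObject("winmgmts:" _
    & "{impersonationLevel=impersonate}!\\" & strComputer & "\root\cimv2")

Set colProcessList = objWMIService.ExecQuery("Select * from Win32_Process")

ScriptContext.Echo "Found " & colProcessList.Count & " processes on " _
    & ScriptContext.TargetComputer & "."
```

Keep in mind that the remote connection is made under the Management Server Action Account, so that account needs WMI access to the agentless computer.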

Security

It’s important to understand the security context under which a script is executing. This isn’t a particularly complicated concept, but some of the most common and frustrating errors we encounter with MOM scripts are related to security. Everything looks fine when you’re running your tests with ResponseTest using the context of your user account, but things don’t go as well when you get that script into production. Detailed information on MOM security is included in the MOM 2005 Security Guide, but I’ll give a brief overview as it relates to scripting.

Scripts in MOM are run in the MOMHost process under the context of the Action Account. A MOMHost process is spawned by the MOM process when a response (including a script) is executed. The Action Account is critical since this is going to define the authority that your script is going to have. This account may be different for each agent. It is set during the agent installation and may be modified through the Administrator Console by viewing the properties of the agent, as shown in Figure 5.

Figure 5 - Dialog Box Specifying Agent Action Account

The default for the Action Account on agent computers is LocalSystem, and this is typically the account that should be used. LocalSystem has full privileges on the local computer but no authority when connecting to another computer. This is usually sufficient since most scripts are intended to operate against the agent computer on which the script is executing.

The general strategy for virtually all operations in MOM (including scripts) is to have the agent work against the local computer. That is the strategy that keeps MOM scalable - cutting down on network traffic and allowing a single Management Server to support many agents. There are circumstances though where it makes sense to deviate from this standard strategy, such as when a script is going to access computers outside of the local one.

For example, maybe you want a script to test the network access to a particular application. You want a script on one agent to programmatically access another to determine the application response across the network. Another common example is access to a database. In addition to performing tests against the database itself, you may need a script to log information to it.

I’m certainly not recommending that you use a domain account as your default for Agent Action Accounts. The general rule is simple: use LocalSystem for the Agent Action Account unless you have a script that requires different authority. In that case, use a domain account only for those agents that will be running that script, and provide that domain account with the minimum authority that it requires.

Keep in mind that the Action Account on Management Servers typically runs under domain credentials - often with significant authority on the network. Rather than provide domain credentials to a regular MOM agent for a specific task, you might consider launching the script back on the Management Server as discussed in the previous section.

Part 3 discussed the use of ResponseTest for testing your scripts before deploying them in MOM. Keep in mind that a script in ResponseTest is going to be executing under the context of the user account that you are logged in with when you execute the test. This will probably be your personal domain user account. In the case of MOM administrators writing scripts, this is usually an account with a wide range of access and could be deceiving. Ensure that you perform a final test of the script executing in the MOMHost process under the credentials of the appropriate Agent Action Account before you consider that script production-quality.
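One way to confirm which account a script is really running under is to have the script report it. The following is a minimal troubleshooting sketch, not something to leave in a production script. It uses the standard WScript.Network object to read the current user, domain, and computer name; the event number (300) and the "Script Diagnostics" source are assumptions chosen for illustration.

```vbscript
Const EVENT_TYPE_INFORMATION = 4
Const EVENT_SECURITY_CONTEXT = 300   ' hypothetical event number

' WScript.Network reports the account of the process hosting the script.
Set objNetwork = CreateObject("WScript.Network")

Set objEvent = ScriptContext.CreateEvent()
objEvent.EventNumber = EVENT_SECURITY_CONTEXT
objEvent.EventType = EVENT_TYPE_INFORMATION
objEvent.EventSource = "Script Diagnostics"
objEvent.Message = "Script is running as " & objNetwork.UserDomain & "\" _
    & objNetwork.UserName & " on " & objNetwork.ComputerName & "."
ScriptContext.Submit objEvent
```

Comparing the account reported under ResponseTest with the one reported when the rule fires on an agent shows quickly whether your test reflected the production security context.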

Timeouts and Concurrent Execution

Most of the time you can assume that a single instance of a script will execute on an agent with a relatively short execution time. It’s important, though, to understand how MOM handles situations such as scripts with long execution times and the possibility of multiple instances of the same script.

Concurrent Execution

Multiple scripts can run simultaneously on the same agent, and this includes multiple instances of the same script. For example, suppose you have a script that takes on average three minutes to execute. Maybe it’s executing a series of complex test processes in a particular application and just takes that long (or it might be a poorly written script). Now suppose you schedule that script to execute every two minutes. The script will still be executing when another instance of it is launched. In this case, both instances will be executing at the same time. This is an important consideration because you may have both instances competing for common resources. They will maintain their own local variables but could be accessing common machine resources or state variables. You typically write a script assuming that only a single instance of it will be running, so you don’t include code to account for multiple instances.

In MOM 2000, only a single instance of a particular script could execute at one time, meaning you wouldn’t have to worry about multiple instances of a script fighting with each other. When that restriction was removed in MOM 2005, you gained flexibility at the expense of some extra responsibility. While it’s actually rare to have a script with such a long execution time coupled with such a frequent launch interval that the script has a chance to step on itself, it can happen. This potential though is counteracted by the script timeout.

Timeouts

A new feature in MOM 2005 is the timeout of a script. With MOM 2000, a misbehaved script could run indefinitely and could only be terminated by stopping the agent. In addition to taking up unnecessary resources on the agent computer, this would block subsequent executions of the script.

With MOM 2005, every rule executing a script has a script timeout value. The default for this value is five minutes, meaning that if the script is still running five minutes after it was initiated, it will automatically be terminated.

Figure 6 - Script Timeout Value

You’ll rarely need to increase this value since it’s unusual to have a script that runs for more than five minutes. If you do have one still executing after this interval, chances are pretty good that it should be terminated. If you have a script that may be launched at intervals shorter than five minutes, you may want to decrease this value to ensure that you don’t accidentally get multiple instances of the same script. If you have a script that may execute longer than five minutes, then you should probably review the script and break it into smaller chunks before you think about increasing the timeout value. If you simply ensure that the script timeout value is shorter than the interval at which the script is launched, then you won’t have to account for multiple instances.

Miscellaneous

We’ve covered the more complex topics, but I still have a few miscellaneous recommendations to cover. Nothing too profound or complicated here, but these topics should answer common questions that I tend to receive.

Finding Rules Related to Scripts

A common question is how to locate all the rules that execute a particular script, and there is a simple answer.

In the Administrator Console, right-click on Management Packs and select Find Rules. Click Next a few times, leaving the default values in the wizard. You will end up on the Responses dialog box that will allow you to select Launch Script. Once this is selected, you will be presented with a drop-down list box listing each script in the MOM installation. Go ahead and pick the one you’re interested in, and after a couple of seconds you should have a listing of all rules using it.

Modifying Management Pack Scripts

Don’t. That was simple enough, right?

Okay, let’s provide some more discussion around this. I’m not going to back down though - you should never modify a script that was provided with a management pack. Those scripts were provided for a specific purpose and there may be other rules and scripts relying on these scripts to execute in a specific manner. In addition, any update to that management pack is probably going to overwrite that script - along with your changes.

That doesn’t mean that you aren’t allowed to change functionality if you need to, though. If you really want to change a management pack script, just copy it and change the copy. Of course if you do that, any rules that are configured to execute that script are going to execute the original version. If you want them to execute your modification, perform a search for any rules executing the original script (using the Find Rules method just discussed). From the results of that search, modify the rules to execute your modification.

Naming a Script

This seems like a pretty trivial topic since you can name a MOM script anything you want. What’s in a name, anyway?

There is no easy means in MOM 2005 to separate your custom scripts from scripts provided by Management Packs. I just recommended that you don’t modify the Management Pack scripts, so how are you supposed to tell them apart?

I always recommend to my customers that they use some common prefix to distinguish the scripts they have created on their own. That way, they can quickly identify the scripts that they wrote (and that they are allowed to modify) and have those scripts conveniently grouped together when the list of scripts is sorted by Name. You’ll notice that the sample scripts that I created for this series start with a prefix of “TechNet: Scripting MOM: ”. You might want to use something shorter like your company name or abbreviation. Anything will work as long as it lets you distinguish your scripts from the ones you aren’t supposed to touch.

Conclusion

Assuming you’ve made your way through this entire four-part series, you should now have a pretty thorough understanding of writing scripts in MOM. We certainly haven’t covered every conceivable topic related to scripting in MOM, but I will argue that we’ve covered an awful lot.

The primary goal of this series was to make available to MOM users the huge amount of technical information related to general Windows Scripting. MOM provides a delivery and data collection engine that Windows Script Host simply does not have, and a script in MOM firing at regular intervals or in response to a defined event in your environment provides far more value than a script that can only be executed interactively from a command line. Armed with the ability to adapt these scripts to MOM, you can open up for yourself a complete library of sample code and leverage existing skills in providing custom management solutions.

My final recommendation would be to get out there and write some scripts. We are planning on providing sample solutions for MOM scripts that go beyond the sophistication of the samples in this series and welcome any recommendations. Please send any questions you have or suggestions for samples you would like to see to scripter@microsoft.com (in English, if possible).