MOM 2000 SP1 - Performance and Sizing

Article
02/20/2014

Archived content. No warranty is made as to technical accuracy. Content may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist.

Event and Performance Management for Windows®-based Systems

Microsoft Corporation

September 2003


Abstract

This technical paper describes a process for testing Microsoft® Operations Manager 2000 (MOM) Service Pack 1 (SP1) and recommends a suitable computer system size with enough reserve capacity to monitor a specific number of managed computers. It also provides information about the expected performance of that computer system while managing these computers.

Prefatory Note
Introduction
MOM SP1 Test Parameters
MOM SP1 Test Results
MOM SP1 Test Results I: Small Configuration - Single DDCAM (MOM Database and DCAM)
MOM/SQL Server Disk Requirements
Database and Data Workload Sizing
Microsoft Operations Manager/SQL Server Test Results
Best Practice: Capacity/Performance Recommendation
MOM SP1 Test Results II: Large Configuration - Separate Database Server and Single DCAM
The MOM/SQL Server Disk Requirements
Database and Data Workload Sizing
Microsoft Operations Manager/SQL Server Test Results
Best Practice: Capacity/Performance Recommendation
MOM SP1 Test Results III: Enterprise Configuration - Separate Database Server and Two DCAMs
The MOM/SQL Server Disk Requirements
Database and Data Workload Sizing
Microsoft Operations Manager/SQL Server Test Results
Best Practice: Capacity/Performance Recommendation
MOM SP1 Management Packs
Appendix A: Test Results For Microsoft Operations Manager 2000 RTM
Appendix B: MOM SP1 Management Sizer
Appendix C: SQL Server Installation for Microsoft Operations Manager Usage
Appendix D: Counter Definitions

Prefatory Note

All tests referred to in this report were designed to determine the minimum computer hardware required for a management server to perform various Microsoft Operations Manager 2000 (MOM) tasks. The MOM test team conducted these tests in June 2003 using MOM Service Pack 1 (MOM SP1).

Note: The MOM test team originally conducted tests in June 2001 using Microsoft Operations Manager 2000 RTM version (MOM RTM). With MOM SP1, data is processed to the MOM database differently; therefore, it is not possible to make direct comparisons of the test results. For the results of the original MOM RTM tests, see Appendix A: Test Results For Microsoft Operations Manager 2000 RTM later in this paper.

The computer systems described herein might not necessarily represent the ideal configuration. The intent is to provide a starting point from which to specify the management server, with the knowledge that the base system you are specifying has been tested and found to be able to perform a given level of tasks.

This report in no way represents or is meant to define an absolute system configuration for any number of managed computers. Instead, this report is meant to show findings and a possible starting point for you to specify the management server. Calculators are provided in the appendices to help you to calculate the database size, and to show the expected input/output activity that might be found on a management server. For more information, refer to the appendices later in this paper.

Testing took into account all events, alerts, and performance counters that occurred during the peak operation of managed computers. This testing did not take into account any Application Management Packs that you might place into service, or the use of any services or scripts to correct certain situations. Although Management Packs were not used in the testing, processing rules were used to generate events, alerts, and performance counters per day at a level that is higher than any reported by MOM enterprise customers.

For the MOM SP1 testing, the test workload was determined by collecting data from approximately 20 enterprise customers and from the Microsoft Operations and Technologies Group (OTG). The data collected showed a significant drop in the number and rates of events, alerts, and performance counters for MOM SP1. This is due to the tuning of Management Packs and increased database efficiencies. The rate of simulated network line and database usage the MOM test team used to test MOM SP1 far exceeded the actual rates collected from any of the external or internal users. For MOM SP1, actual managed computers were used to generate the network line and database usage workloads, rather than being simulated as was the case with the original MOM RTM testing.

Introduction

MOM is a management system for monitoring managed computers in an organization. MOM SP1 has been tested, and experience with many customers has shown that it scales to the published supported numbers. However, the limits for MOM can vary depending on many variables that are discussed in this paper. This paper describes the testing process used to recommend a suitable management server size with enough reserve capacity to smoothly manage a specific number of computers without putting the MOM SP1 computer systems at risk. It also provides information about the expected performance of the MOM SP1 computer system while managing these computers. Specifically, this paper answers questions such as:

How large must the management server be in terms of hardware resources?
How large is the overall footprint of MOM SP1?
How large should the MOM SP1 database be?
What are the system requirements needed to run MOM SP1 effectively?
What is the expected disk activity on the MOM SP1 database and database server?
What is the expected CPU usage of the MOM SP1 agent on a managed computer?

How might the recommendations contained in this paper be useful? Consider the performance and sizing considerations presented in the following scenarios.

Scenario 1

The systems managers of an online-order-entry environment decide to license MOM SP1 to manage 150 servers worldwide. They determine how large the management server computer system should be and decide to use a single computer for the task. With no experience in MOM capacity planning, it is difficult for them to determine the correct size for the management server. They order a computer system that is much too small for the job. They also learn that they need a much larger-capacity network to accommodate the MOM workload traffic. They will now lose time ordering additional system hardware to rectify this situation.

Scenario 2

The systems managers of an online-order-entry environment decide to license MOM SP1 to manage 1,000 servers worldwide. They determine how large the management server computer system should be and decide to use a series of tiered management systems (alert forwarding) for the task. Unlike in Scenario 1, this company will lose large amounts of money if the servers are not managed correctly, or if they go offline for any reason.

In such a large environment, the sizing considerations are compounded by the need to decide both how large the management servers in the first-tier configuration groups should be and how large the management server in the master configuration group should be. Again, with limited or no experience in MOM capacity planning, it is very difficult for the systems managers to determine the correct size for the tier one and tier two management servers. As a result, they order computer systems that are too small for the job. In the process, they also find that they need a separate management network to accommodate the MOM workload traffic. They will now lose time ordering the additional system hardware to rectify the problems.

Conclusion

Careful consideration of the performance and sizing of the hardware systems that support MOM SP1 is critical to the successful implementation of MOM to manage computers in your organization. Although there is no absolute system configuration for any number of managed computers, this technical paper presents the results of performance and sizing testing for MOM SP1 in environments of various sizes. You can use the findings in this paper and the MOM SP1 Management Server Sizer as a starting point to help you to determine the appropriate performance and sizing considerations for MOM in your organization. For more information about the MOM SP1 Management Server Sizer, see “Appendix B: MOM SP1 Management Sizer” later in this paper.

MOM SP1 Test Parameters

This section presents the key factors for the MOM SP1 performance and sizing testing — the hardware used, the scope and goals for the testing, the tools used, and how the test workload was calculated. Later sections present the results of testing.

Hardware Test Environment

This series of tests included three different MOM configuration scenarios, each with an appropriate range of managed computers.

Small configuration - Single DDCAM (MOM database and DCAM), with 20, 50, 85, 140, and 200 managed computers.
Large configuration - Separate database server, single DCAM, with 250, 500, 700, and 1000 managed computers.
Enterprise configuration - Separate database server; two DCAMs, with 700 and 1000 managed computers.

The detailed systems information for each of these scenarios, along with the test results, is described in later sections.

The network used in all tests had a line capacity of 100 Mbps, which represents the highest available bandwidth for most organizations' production environments. Lines with greater capacity, such as Gigabit Ethernet, are not widely used by a large part of the user community. The hardware was set up in a single-tier configuration. Multitiered configurations were not tested.

For all test scenarios, the configuration for the managed computers was the same, as described in Table 1.

Table 1 Managed Computer Configuration

System component	Description
Processor count	1
Processor type	1000 MHz Pentium 4
Memory	512 MB
Disk count	1
Disk designation OS	Drive C
Disk size	7.85 GB (6 GB free space)
Network capacity	100 Mbps (12.5 MB/sec)

Workload Environment

The data workloads used in testing each of the configurations for MOM SP1 were consistently higher than the workloads used in MOM RTM testing and higher than the actual workloads reported by the largest enterprise customers. For example, alerts delivered to the database for MOM SP1 were 0.00833 alerts per minute per computer at the 1000-managed computer level for the enterprise configuration. This compares to a rate of 0.00445 per minute per computer in MOM RTM testing, which means that the rate was approximately twice as high (about 1.9 times) for MOM SP1. Table 2 and Table 3 show the workload levels used in testing MOM SP1.

Table 2 Data Workload Levels per Day Used for Testing MOM SP1

Managed computer count	Alerts per day	Events per day	Performance counters per day
20	2,250	100,000	250,000
50	2,250	100,000	250,000
85	2,250	100,000	250,000
140	2,250	100,000	250,000
200	2,250	100,000	250,000
250	9,000	400,000	400,000
500	9,000	400,000	400,000
700	9,000	400,000	400,000
1000	12,000	600,000	600,000

Note: The values in Table 2 far exceed any numbers reported by the largest enterprise customers for MOM SP1.

Table 3 Data Workload Rates per Minute per Managed Computer Used for Testing MOM SP1

Managed computer count	Alerts per minute per managed computer	Events per minute per managed computer	Performance counters per minute per managed computer
20	.078125	3.470	8.680
50	.031200	1.380	3.472
85	.018350	0.816	2.042
140	.011100	0.496	1.240
200	.007810	0.347	0.868
250	.025000	1.111	1.111
500	.012500	0.555	0.555
700	.008930	0.397	0.397
1000	.008330	0.417	0.417

Note: The values in Table 3 far exceed any numbers reported by the largest enterprise customers. The values for events and performance counters used for MOM RTM testing were higher than those used in the MOM SP1 test scenarios. The lower MOM SP1 values are a result of fine-tuning the Management Packs to reduce the volume of events and performance counter traffic for MOM SP1.
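The per-minute rates in Table 3 follow from the daily totals in Table 2 by dividing by the 1,440 minutes in a day and then by the number of managed computers. The following is a minimal sketch of that arithmetic; the function and variable names are illustrative and are not part of MOM or the Sizer, and only a subset of the Table 2 rows is included.

```python
MINUTES_PER_DAY = 24 * 60  # 1,440 minutes

def per_minute_per_computer(per_day, computers):
    """Convert a configuration-wide daily total to a per-minute, per-computer rate."""
    return per_day / MINUTES_PER_DAY / computers

# Daily totals copied from Table 2: (alerts, events, performance counters).
DAILY_WORKLOAD = {
    20: (2_250, 100_000, 250_000),
    200: (2_250, 100_000, 250_000),
    250: (9_000, 400_000, 400_000),
    1000: (12_000, 600_000, 600_000),
}

def workload_rates(computers):
    """Return (alert, event, counter) rates per minute per managed computer."""
    return tuple(
        per_minute_per_computer(total, computers)
        for total in DAILY_WORKLOAD[computers]
    )

if __name__ == "__main__":
    for count in sorted(DAILY_WORKLOAD):
        alerts, events, counters = workload_rates(count)
        print(f"{count:>5}  {alerts:.6f}  {events:.3f}  {counters:.3f}")
```

Because the daily totals are held constant within each configuration, the per-computer rate falls as managed computers are added, which is why the smallest computer counts show the highest per-computer rates.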

Integrated Grooming

MOM SP1 uses an integrated grooming feature, which means that each time MOM SP1 performs a database insert for an event, alert, or a performance counter, it also deletes up to 4,000 records by default according to the grooming parameters that you have established. As a result, the need to periodically groom the MOM database is substantially reduced. Another result of integrated grooming is that ongoing CPU utilization and total I/Os are higher with MOM SP1 than with MOM RTM. However, when you groom the MOM database, you do not experience high CPU utilization, which often reached 100 percent for an extended period of time with MOM RTM. Table 4 shows the alert latency and grooming data for each level of managed computers tested.
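The idea of grooming on every insert, with a cap on how many records a single insert may delete, can be illustrated with a toy in-memory model. This is only a sketch of the general technique under stated assumptions: the class name, the retention parameter, and the oldest-first deletion policy are hypothetical, and it does not represent how MOM SP1 or SQL Server implements grooming.

```python
from collections import deque

GROOM_BATCH_LIMIT = 4_000  # default per-insert delete cap described in the text

class GroomedTable:
    """Toy table that deletes expired rows each time a row is inserted.

    Assumes rows arrive with nondecreasing timestamps (hypothetical model).
    """

    def __init__(self, retention_seconds, batch_limit=GROOM_BATCH_LIMIT):
        self.retention_seconds = retention_seconds
        self.batch_limit = batch_limit
        self.rows = deque()  # (timestamp, payload), oldest first

    def insert(self, timestamp, payload):
        """Insert one row, then groom up to batch_limit expired rows.

        Returns the number of rows groomed by this insert.
        """
        self.rows.append((timestamp, payload))
        cutoff = timestamp - self.retention_seconds
        groomed = 0
        while (self.rows and groomed < self.batch_limit
               and self.rows[0][0] < cutoff):
            self.rows.popleft()
            groomed += 1
        return groomed

if __name__ == "__main__":
    table = GroomedTable(retention_seconds=10, batch_limit=3)
    for second in range(5):
        table.insert(second, "event")
    print(table.insert(100, "event"), len(table.rows))
```

The cap spreads the deletion work across many inserts instead of concentrating it in one long grooming run, which matches the trade-off the text describes: steadier but higher ongoing CPU and I/O.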

Table 4 Alert Latency and Grooming Data

Managed computer count	Alert latency (seconds)	Events groomed per day	Performance counters groomed per day
20	42.32	3,960,000	21,144,000
50	47.37	3,936,000	20,760,108
85	44.66	4,656,000	19,632,108
140	49.98	3,816,000	19,632,108
200	51.27	4,752,000	21,456,108
250	83.35	2,152,110	13,174,932
500	121.18	2,280,000	13,294,932
700	119.19	2,160,000	15,190,932
1000	121.23	2,312,400	19,992,264

Note: For all test results shown in Table 4, the duration of testing was four hours. Alerts were not groomed during this testing because they did not accumulate fast enough to require grooming.

When the workload was increased for the 250 to 1000 managed-computers level, grooming rates dropped off. This is because the DCAM is performing more database inserts, and therefore it performs integrated grooming at a slightly lower percentage to prevent insert latency. For each insert, the DCAM uses an algorithm to calculate the level of integrated grooming, depending on a number of factors, such as how many inserts are in the queue.

Scope of Testing

The scope of testing determines how well MOM SP1 scales, what management server configuration best manages the computers, and what the maximum number of computers is that a single management server can manage.

For each test configuration, the test procedure was as follows:

Set up the MOM database server by using the required database backup and set up the DCAM(s).
Set up the required number of managed computers for the DCAM(s). Flushed the queues if MOM agents had already been installed on the managed computers.
Stopped and restarted the OnePoint service and the MOM database services, in the following order:
- Stopped OnePoint service on the DCAM(s)
- Stopped MSSQLSERVER and SQLSERVERAGENT services on the MOM database
- Started MSSQLSERVER and SQLSERVERAGENT services on the MOM database
- Flushed the queues on the DCAM(s)
- Started the OnePoint service on the DCAM(s)
Started collecting the performance counters on the MOM database server and DCAM(s).
Started the specified alert, event, and performance counter workload on each of the managed computers, which began the test.
Set up the grooming jobs to run once each hour for the last 2 hours of the test.
Ran the test for 4 hours.
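The service stop and restart order in the procedure above can be written down as an ordered plan. The sketch below only builds the list of steps and does not execute anything. It assumes the Windows `net stop` and `net start` commands and the service names exactly as they appear in the text; the host names are hypothetical, and because the paper does not say how the queues were flushed, that step is left as a manual placeholder. Stopping SQLSERVERAGENT before MSSQLSERVER is an assumption based on the agent service depending on the database engine service.

```python
def restart_plan(dcams, database_server):
    """Return ordered (host, action) steps for the restart sequence.

    dcams: list of DCAM host names (hypothetical names supplied by the caller).
    database_server: host name of the MOM database server.
    """
    steps = []
    # 1. Stop the OnePoint service on every DCAM.
    for dcam in dcams:
        steps.append((dcam, "net stop OnePoint"))
    # 2. Stop the SQL Server services (agent first; an assumption, see lead-in).
    steps.append((database_server, "net stop SQLSERVERAGENT"))
    steps.append((database_server, "net stop MSSQLSERVER"))
    # 3. Start the SQL Server services again.
    steps.append((database_server, "net start MSSQLSERVER"))
    steps.append((database_server, "net start SQLSERVERAGENT"))
    # 4. Flush the queues on every DCAM (method not given in the paper).
    for dcam in dcams:
        steps.append((dcam, "MANUAL: flush the queues"))
    # 5. Start the OnePoint service on every DCAM.
    for dcam in dcams:
        steps.append((dcam, "net start OnePoint"))
    return steps

if __name__ == "__main__":
    for host, action in restart_plan(["DCAM1", "DCAM2"], "MOMDB"):
        print(f"{host}: {action}")
```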

Note: MOM SP1 Build 1300 (RTM) was used for all test scenarios.

Performance Monitor Counter Metrics

Table 5 lists the primary performance counters that were collected and used for this analysis. For a complete list and description of the counter functions, see “Appendix D: Counter Definitions” later in this paper.

Table 5 Primary Counters Used in Testing

Counter object	Counter property	Instances
Processor	% Processor Time average	Total
	% Processor Time peak
	Interrupts/sec
Process	% Processor Time	OnePoint process
	Working Set	SQL Server processes
	Thread Count
	IO Read Operations/sec
	IO Write Operations/sec
Memory	Available Bytes	Total
	Page Faults/sec
	% Committed Bytes In Use
Network Interface	Bytes Total/sec	100 Mbps network adapter card
	Current Bandwidth
Physical Disk	Disk Reads/sec	Drive C
	Disk Writes/sec	Database disk drives
	Avg. Disk Queue Length
System	Processor Queue Length	Total
SQL Server:Databases	Transactions/sec	OnePoint database
SQL Server:Buffer Manager	Buffer Cache Hit Ratio	OnePoint database
Calculated counters	Counter property	Calculations
% Network Busy	Bytes Total/sec/Current Bandwidth Bytes	Bytes Total/sec = Bytes Sent/sec + Bytes Received/sec; Current Bandwidth Bytes = Current Bandwidth/8
Memory Free Space	Available KBytes/Total Physical Memory

Note: Table 5 establishes the core performance-counter collection metrics. Other counters might be used for further analysis. The Physical Disk, % Disk Time counter was not used because it gives false readings on Redundant Array of Independent Disks (RAID) arrays. All the database disk arrays used for these tests were RAID 10.
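The two calculated counters in Table 5 can be expressed as short functions. This is a minimal sketch, assuming that Current Bandwidth is reported in bits per second (which is why it is divided by 8) and that both memory figures are supplied in the same unit; the function names are illustrative only.

```python
def percent_network_busy(bytes_sent_per_sec, bytes_received_per_sec,
                         current_bandwidth_bits_per_sec):
    """% Network Busy = Bytes Total/sec divided by Current Bandwidth in bytes."""
    bytes_total_per_sec = bytes_sent_per_sec + bytes_received_per_sec
    bandwidth_bytes_per_sec = current_bandwidth_bits_per_sec / 8
    return 100.0 * bytes_total_per_sec / bandwidth_bytes_per_sec

def percent_memory_free(available_kbytes, total_physical_kbytes):
    """Memory Free Space = Available KBytes divided by total physical memory."""
    return 100.0 * available_kbytes / total_physical_kbytes

if __name__ == "__main__":
    # Example inputs are made up for illustration, not measured values.
    print(percent_network_busy(30_000, 20_000, 100_000_000))
    print(percent_memory_free(393_216, 786_432))  # 768 MB = 786,432 KB
```

On a 100 Mbps adapter, the divisor is 12,500,000 bytes per second, so a combined 50,000 bytes per second corresponds to 0.4 percent network busy.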

MOM SP1 Test Results

The following sections show test results for a range of managed computers in different-sized MOM configurations.

MOM SP1 Test Results I: Small Configuration - Single DDCAM (MOM Database and DCAM)

The first series of tests was performed on a small management server. This system includes the basic management server that manages a few computers. This section shows the capacity of the system. The test results also show maximum capacity, in terms of the upper bounds of managed computers that can be adequately controlled by this server configuration.

Hardware Test Environment - Small Configuration, Single DDCAM

Table 6 DDCAM System Configuration

System Component	Description
Processor count	4
Processor type	550 MHz Pentium 3
Memory	768 MB
Disk count OS	1
Disk count DB	6 (RAID 10)
Disk count log file	1
Disk designation OS	C drive (8.46 GB)
Disk designation DB	D drive (101.6 GB, 37.1 GB free space)
Disk designation log file	E drive (26 GB)
Disk I/O capacity - Reads	750 read operations per second
Disk I/O capacity - Writes	375 write operations per second
Network capacity	100 Mbps (12.5 MB/sec)
MOM Build	MOM SP1 Build 1300 (RTM)

MOM/SQL Server Disk Requirements

Table 7 shows the resources needed to install the management server, along with the SQL Server database. Microsoft® SQL Server™ 2000 Standard was used for these tests.

Table 7 Disk Requirements for Small Configuration - Single DDCAM

MOM disk space requirement total	230 MB
MOM OnePoint working set average memory	53.81 MB-79.87 MB
OnePoint threads (avg.)	71
SQL Server working set memory	604.57 MB-645.37 MB
SQL Server database size (disk space)	6.63 GB
Database log disk space	1 GB
MS DTC log size disk space	512 MB

Database and Data Workload Sizing

Table 8 shows the number of rows in the MOM database tables prior to running each test for a specific number of managed computers (from 20 to 200). Tables 9 and 10 show the data workloads used in the tests for this configuration.

Table 8 Pre-test Database Table Sizes for the Small Configuration

Rows in Alert table	88,574
Rows in Event table	2,252,322
Rows in SampledNumericData table	3,720,018

Table 9 Data Workload Levels per Day - Small Configuration (for 20 to 200 managed computers)

Alerts per day	2,250
Events per day	100,000
Performance counters per day	250,000

Note: The data workload shown in Table 9 was held constant for each level of managed computers (from 20 to 200). This workload represents higher workloads than the levels reported by any of the enterprise customers during their testing of MOM SP1 Build 1300 (RTM). The MOM SP1 testing workload values for alerts, events, and performance counters were based on the results of surveys taken by the largest enterprise customers for workload traffic, and then inflated to represent peak load situations.

Table 10 Data Workload Rates per Minute per Managed Computer Used for the Small Configuration

Managed computer count	Alerts per minute per managed computer	Events per minute per managed computer	Performance counters per minute per managed computer
20	.078125	3.470	8.680
50	.031200	1.380	3.472
85	.018350	0.816	2.042
140	.011100	0.496	1.240
200	.007810	0.347	0.868

Note: The values in Table 10 far exceed any numbers reported by the largest enterprise customers. The values for events and performance counters used for MOM RTM testing were higher than those used in the MOM SP1 test scenarios. The lower MOM SP1 values are a result of fine-tuning the Management Packs to reduce the volume of events and performance counter traffic for MOM SP1.

In the original MOM RTM testing, the average alerts delivered to the database were equal to 0.00445 per minute per computer. For the MOM SP1 workload used in testing this configuration, the average alerts delivered to the database were 0.00781 per minute per computer at the 200-managed computer level. This is approximately twice as high as the MOM RTM test workload levels. At the 20-managed computer level for MOM SP1, the average alerts delivered to the database were 0.0781 alerts per minute per computer. This represents a workload over 17 times higher than the MOM RTM testing workload. This means that the alert workloads used for the MOM SP1 performance testing of this configuration range from 2 times to 17 times as high as the MOM RTM testing levels.
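The multiples quoted above are simple ratios of the MOM SP1 alert rates to the MOM RTM alert rate. The following sketch reproduces that comparison using only the rates stated in the text; the names are illustrative.

```python
RTM_ALERT_RATE = 0.00445  # alerts per minute per computer in MOM RTM testing

def ratio_to_rtm(sp1_rate, rtm_rate=RTM_ALERT_RATE):
    """Return how many times higher the SP1 alert rate is than the RTM rate."""
    return sp1_rate / rtm_rate

if __name__ == "__main__":
    # SP1 alert rates for the small configuration, as quoted in the text.
    for computers, sp1_rate in ((200, 0.00781), (20, 0.0781)):
        multiple = ratio_to_rtm(sp1_rate)
        print(f"{computers} managed computers: {multiple:.2f} times the RTM rate")
```

The 200-computer ratio comes out somewhat below 2, so "approximately twice" is a rounded figure, while the 20-computer ratio is a little above 17.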

Microsoft Operations Manager/SQL Server Test Results

These tests were performed to find what size the management server should be, in terms of hardware, to perform a set level of work. Testing was started from 20 managed computers to find the upper limit. In this test series, the MOM DDCAM was monitored while managing 20 computers at the low end. These findings were used to establish a baseline for the DDCAM operation. Table 11 depicts the growth rate as more managed computers are added.

Table 11 Effects on DDCAM of Additional Managed Computers - Small Configuration

Managed computer count	% CPU utilization	OnePoint service utilization	OnePoint working set peak	Disk reads/sec	Disk writes/sec	Memory free space	Network busy
20	22.56%	9.00%	53,809,957	288.92	268.84	57.02%	0.38%
50	27.64%	20.22%	54,349,396	301.69	273.15	55.48%	0.38%
85	30.91%	25.91%	56,505,485	306.24	258.50	55.19%	0.38%
140	38.19%	36.76%	70,732,436	276.75	259.49	55.18%	0.39%
200	42.31%	44.01%	79,874,416	253.74	242.19	54.64%	0.40%

Figures 1 through 5 graphically present information from Table 11.
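One rough way to read the CPU growth in Table 11 is to compute the average increase per added managed computer and to interpolate between measured points. The sketch below does only that. It is not the Sizer's method, it assumes growth is roughly linear between adjacent measurements, and it refuses to extrapolate beyond the tested range.

```python
# (managed computer count, % CPU utilization) copied from Table 11.
CPU_BY_COUNT = [(20, 22.56), (50, 27.64), (85, 30.91), (140, 38.19), (200, 42.31)]

def average_slope(points):
    """Average change in the measured value per added managed computer."""
    (x_first, y_first), (x_last, y_last) = points[0], points[-1]
    return (y_last - y_first) / (x_last - x_first)

def interpolate(points, count):
    """Linearly interpolate between neighboring measured points."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= count <= x1:
            return y0 + (y1 - y0) * (count - x0) / (x1 - x0)
    raise ValueError("count is outside the measured range")

if __name__ == "__main__":
    print(f"{average_slope(CPU_BY_COUNT):.4f} CPU percentage points per computer")
    print(f"{interpolate(CPU_BY_COUNT, 100):.2f} percent CPU at 100 computers")
```

On this hardware the average works out to roughly 0.11 percentage points of CPU per additional managed computer, which should be treated as a description of these test results rather than a prediction for other systems.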

Figure 1: Adding managed computers increases CPU utilization

New for MOM SP1 – Increased Managed Computer Capacity for DDCAMs

Notice in Figure 1 that the CPU utilization on this DDCAM, which is a 4-processor 550 MHz system, ranges from 22 percent for 20 managed computers to 42 percent for 200 managed computers. With the new multi-gigahertz processors, you can easily manage 200 computers with a 2-processor system.

Figure 2: Increasing I/O has a negative effect on disk performance. In this case, increasing disk queues cause increased latency (see Figure 3)

For MOM SP1, there is a marked increase in read and write activity over MOM RTM. This increase is due to the additional activity caused by integrated grooming. By comparison, in the original MOM RTM testing, the total peak I/O rate for 200 managed computers was 116.60 per sec per computer. For more information about integrated grooming, see the “Integrated Grooming” section earlier in this paper.

Figure 3: Disk queues remain approximately ten for MOM SP1

Figure 3 displays queue lengths of approximately ten. In MOM RTM testing the queue lengths were less than two. This is the result of the increased I/O activity caused by integrated grooming in MOM SP1. These disk queues could be decreased considerably by adding more disk spindles to the RAID array.

Recommendation: Use the RAID Selector Section of the MOM SP1 Management Server Sizer to determine the adequate spindle counts based on the various workloads and RAID configurations that you might want to use. The RAID Selector Section of the MOM SP1 Management Server Sizer takes into account that disk queue lengths should be less than two. For more information about the Sizer, see "Appendix B: MOM SP1 Management Sizer" later in this paper.
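To get a rough feel for why disk queues form on this array, the measured read and write rates in Table 11 can be compared with the rated I/O capacities in Table 6. The sketch below adds the read and write fractions together, which is a simplifying assumption of this example and not the calculation used by the Sizer's RAID Selector.

```python
READ_CAPACITY = 750.0   # read operations per second, from Table 6
WRITE_CAPACITY = 375.0  # write operations per second, from Table 6

def combined_disk_load(reads_per_sec, writes_per_sec):
    """Fraction of rated array capacity consumed by reads plus writes.

    Values near or above 1.0 suggest the array is saturated, so requests
    wait in the disk queue (additive model is an assumption).
    """
    return reads_per_sec / READ_CAPACITY + writes_per_sec / WRITE_CAPACITY

if __name__ == "__main__":
    # (managed computers, disk reads/sec, disk writes/sec) from Table 11.
    for count, reads, writes in ((20, 288.92, 268.84), (200, 253.74, 242.19)):
        print(f"{count:>3} computers: {combined_disk_load(reads, writes):.2f}")
```

Under this model the six-disk array is at or near its rated capacity at every managed computer level, which is consistent with the suggestion that adding spindles would shorten the queues.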

Figure 4: Free memory space is adequate at all managed computer levels

Free memory space on the MOM DDCAM, which includes DCAM and database activity, ranged from approximately 55 percent to 57 percent with 768 MB of memory. In all of these tests, MOM SP1 used approximately the same amount of memory.

Figure 5: Network utilization remains very low at all managed computer levels

Network utilization rose predictably from the 20-managed computer level to the 200-managed computer level, with a high point of 0.40 percent utilization. This is consistent with what has been seen throughout the series of testing and consistent with customer reports about network usage. This utilization factor reflects only steady-state usage and does not reflect Management Pack or MOM agent pushdowns.

Best Practice: Capacity/Performance Recommendation

Use the MOM SP1 Management Server Sizer to determine the appropriate system size and configurations based on the various workloads that you might want to use. The MOM SP1 Management Server Sizer automatically calculates the RAID selection and spindle count; expected network usage for various network bandwidths; DCAM server and MOM database server hardware sizes; and the database and log sizes. For more information about the MOM SP1 Management Server Sizer, see "Appendix B: MOM SP1 Management Sizer" later in this paper.

MOM SP1 Test Results II: Large Configuration - Separate Database Server and Single DCAM

The second series of tests was performed on a larger management server (DCAM), with the MOM database installed on a separate computer. The test results show the maximum number of managed computers that can be adequately controlled by this server configuration.

Hardware Test Environment - Large Configuration, Separate Database, Single DCAM

Table 12 Database System Configuration for the Large Configuration

System component	Description
Processor count	4
Processor type	550 MHz Pentium 3
Memory	768 MB
Disk count OS	1
Disk count DB	6 (RAID 10)
Disk count log file	1
Disk designation OS	C drive (8.46 GB)
Disk designation DB	D drive (101.6 GB, 37.1 GB free space)
Disk designation log file	E drive (26 GB)
Disk I/O capacity - Reads	750 read operations per second
Disk I/O capacity - Writes	375 write operations per second
Network capacity	100 Mbps (12.5 MB/sec)
MOM Build	MOM SP1 Build 1300 (RTM)

Table 13 DCAM System Configuration for the Large Configuration

System component	Description
Processor count	2
Processor type	800 MHz Pentium 3
Memory	512 MB
Disk count	1
Disk designation	C drive (14.6 GB, 12.5 GB free space)
Network capacity	100 Mbps (12.5 MB/sec)
MOM Build	MOM SP1 Build 1300 (RTM)

The MOM/SQL Server Disk Requirements

Table 14 shows the resources needed to install the DCAM and the SQL Server database. SQL Server 2000 Standard was used for these tests.

Table 14 Database Disk Requirements for Large Configuration

Database server:
MOM disk space requirement total
SQL Server working set memory (average)
SQL Server database size (disk space)
Database log disk space
MS DTC log size disk space
DCAM:
MOM disk space requirement total
MOM OnePoint working set memory (average)
OnePoint threads (average)

Database and Data Workload Sizing

Table 15 shows the number of rows in the MOM database tables prior to running each test for a specific number of managed computers (from 250 to 1,000). Tables 16, 17 and 18 show the data workloads used in the tests for this configuration.

Table 15 Pre-test Database Table Sizes for the Large Configuration

Rows in Alert table	97,833
Rows in Event table	3,303,168
Rows in SampledNumericData table	2,947,496

Table 16 Data Workload Levels per Day - Large Configuration (for 250, 500, and 700 managed computers)

Alerts per day	9,000
Events per day	400,000
Performance counters per day	400,000

Note: The data workload shown in Table 16 was held constant for the 250, 500, and 700 levels of managed computers. This workload represents higher workloads than the levels reported by any of the enterprise customers during their testing of MOM SP1 Build 1300 (RTM). The MOM SP1 testing workload values for alerts, events, and performance counters were based on the results of surveys taken by the largest enterprise customers for workload traffic, and then inflated to represent peak load situations.

Table 17 Data Workload Levels per Day - Large Configuration (for 1,000 managed computers)

Alerts per day	12,000
Events per day	600,000
Performance counters per day	600,000

Table 18 Data Workload Rates per Minute per Managed Computer for the Large Configuration

Managed computer count	Alerts per minute per managed computer	Events per minute per managed computer	Performance counters per minute per managed computer
250	.02500	1.111	1.111
500	.01250	0.555	0.555
700	.00893	0.397	0.397
1000	.00833	0.417	0.417

Note: The values in Table 18 far exceed any numbers reported by the largest enterprise customers. The values for events and performance counters used for MOM RTM testing were higher than those used in the MOM SP1 test scenarios. The lower MOM SP1 values are a result of fine-tuning the Management Packs to reduce the volume of events and performance counter traffic for MOM SP1.

In the original MOM RTM testing, the average alerts delivered to the database were equal to 0.00445 per minute per computer. For the MOM SP1 workload used in testing this configuration, the average alerts delivered to the database were 0.00833 per minute per computer at the 1,000-managed computer level. This is approximately twice as high as the MOM RTM test workload levels. At the 250-managed computer level for MOM SP1, the average alerts delivered to the database were 0.025 alerts per minute per computer. This represents a workload nearly 6 times higher than the MOM RTM testing workload. This means that the alert workloads used for the MOM SP1 performance testing of this configuration range from 2 times to 6 times as high as the MOM RTM testing levels.

Microsoft Operations Manager/SQL Server Test Results

These tests were performed to find what size the DCAM and the database server should be, in terms of hardware, to perform a set level of work. Testing was started from 250 managed computers to find the upper limit.

In this test series, the DCAM and the database server were monitored while managing 250 computers at the low end. Tables 19 and 20 depict the effect on the DCAM and the database server, respectively, as more managed computers are added.

Table 19 Effect on DCAM of Additional Managed Computers - Large Configuration

Managed computer count	% CPU utilization	OnePoint service utilization	OnePoint working set average	Memory free space	Network busy
250	36.99%	38.91%	164,009,537	81.12%	0.07%
500	49.30%	48.51%	192,642,270	77.52%	0.16%
700	50.70%	48.57%	192,856,224	76.18%	0.19%
1,000	62.23%	54.65%	194,204,590	70.43%	0.23%

Table 20 Effect on Database Server of Additional Managed Computers - Large Configuration

Managed computer count	% CPU utilization	SQL Server service utilization	SQL Server working set peak	Disk reads/sec	Disk writes/sec	Memory free space	Network busy
250	17.96%	65.38%	703,313,169	56.42	166.86	56.59%	0.47%
500	22.88%	76.22%	709,683,340	154.39	264.32	55.02%	0.43%
700	22.28%	72.80%	714,549,239	133.46	265.90	56.03%	0.45%
1,000	24.37%	80.05%	720,452,301	187.55	292.35	54.38%	0.46%

Figures 6 through 13 graphically present information about the MOM database server and DCAM from Table 19 and Table 20.

Figure 6: Adding managed computers increases CPU utilization on the database server

Even with the 550 MHz, 4-processor system that was used in these tests, the CPU utilization for 1,000 managed computers is approximately 25 percent. With the new, more powerful multi-gigahertz processors, it is expected that this utilization would be drastically reduced.

Figure 7: Adding managed computers increases CPU utilization on the DCAM

Even with the 800 MHz, 2-processor system that was used in these tests, the CPU utilization for 1,000 managed computers is approximately 62 percent. With the new, more powerful multi-gigahertz processors, it is expected that this utilization would be drastically reduced.

Figure 8: Increasing I/O has a negative effect on disk performance. In this case, increasing disk queues cause increased latency (see Figure 9)

Note: Information about I/O activity on the DCAM was not included because it was inconsequential and only reflects the operating system and MOM activity.

For MOM SP1, there is a marked increase in read and write activity over MOM RTM. This increase is due to the additional activity caused by integrated grooming. By comparison, in the original MOM RTM testing, the total I/O peak rate for 1,000 managed computers was 59.57/sec/computer. For more information about integrated grooming, see the “Integrated Grooming” section earlier in this paper.

Figure 9: Disk queues increase as managed computers are added

Figure 9 displays queue lengths of up to 16. In MOM RTM testing, the queue lengths were less than two. The longer queues are the result of the increased I/O activity caused by integrated grooming in MOM SP1. These disk queues could be decreased considerably by adding more disk spindles to the RAID array.

Recommendation: Use the RAID Selector section of the MOM SP1 Management Server Sizer to determine adequate spindle counts for the workloads and RAID configurations that you might want to use. The RAID Selector section takes into account that disk queue lengths should be less than two. For more information about the Sizer, see "Appendix B: MOM SP1 Management Sizer" later in this paper.

Figure 10: Free memory space is adequate at all managed computer levels

Memory usage for the MOM database server was 55 percent free memory space with 768 MB of memory, which is consistent with the enterprise configuration test results (see the MOM SP1 Test Results III: Enterprise Configuration - Separate Database Server and Two DCAMs section later in this paper). Best practices would recommend that on a MOM deployment this large, memory would be set at a minimum of 2 GB. It is projected that at the recommended 2 GB memory size, SQL Server would run more efficiently.

Figure 11: Free memory space is adequate at all managed computer levels

At the 250-managed-computer level, the DCAM had approximately 80 percent free memory space with 512 MB of memory. As expected, and consistent with the overall findings, free space at the 1,000-managed-computer level was about 10 percentage points lower, at approximately 70 percent. Best practices would recommend that on a MOM deployment this large, memory would be set at a minimum of 2 GB. It is projected that at the recommended 2 GB memory size, the DCAM would run more efficiently.

Figure 12: Network utilization remains very low at all managed computer levels

As expected, and consistent with MOM RTM testing, network utilization is minimal at all levels. As Figure 12 demonstrates, MOM SP1 does not overburden the network. In further tests, the highest utilization seen was 9 percent, during an agent pushdown, which shows that agent pushdowns result in much higher network utilization than normal monitoring traffic.

Figure 13: Network utilization remains very low at all managed computer levels

As in the case of the database server, network utilization between the managed computers and the DCAM remained very low. It rose steadily from 0.07 percent at the 250-managed-computer level to 0.23 percent at the 1,000-managed-computer level (see Table 19), as Figure 13 demonstrates. As noted throughout this paper, the workloads were consistently higher than any reported by customer surveys.

Best Practice: Capacity/Performance Recommendation

MOM SP1 Test Results III: Enterprise Configuration - Separate Database Server and Two DCAMs

The third series of tests was performed using two large management servers (DCAMs), with the MOM database installed on a separate computer. The test results show the maximum number of managed computers that can be adequately controlled by this server configuration.

Hardware Test Environment - Enterprise Configuration, Separate Database, Two DCAMs

Table 21 Database System Configuration for the Enterprise Configuration

System component	Description
Processor count	4
Processor type	550 MHz Pentium 3
Memory	768 MB
Disk count OS	1
Disk count DB	6 (RAID 10)
Disk count log file	1
Disk designation OS	C drive (8.46 GB)
Disk designation DB	D drive (101.6 GB, 37.1 GB free space)
Disk designation log file	E drive (26 GB)
Disk I/O capacity - Reads	750 read operations per second
Disk I/O capacity - Writes	375 write operations per second
Network capacity	100 Mbps (12.5 MB/sec)
MOM Build	MOM SP1 Build 1300 (RTM)

Table 22 DCAM System Configuration for the Enterprise Configuration

System component	Description
Processor count	2
Processor type	800 MHz Pentium 3
Memory	512 MB
Disk count	1
Disk designation	C drive (14.6 GB, 12.5 GB free space)
Network capacity	100 Mbps (12.5 MB/sec)
MOM Build	MOM SP1 Build 1300 (RTM)

The MOM/SQL Server Disk Requirements

Table 23 shows the resources needed to install the SQL Server database. SQL Server 2000 Standard was used for these tests.

Table 23 Disk Requirements for Enterprise Configuration - Separate Database, Two DCAMs

Database server:
MOM disk space requirement total
SQL Server working set memory (average)
SQL Server database size (disk space)
Database log disk space
MS DTC log size disk space
Each DCAM:
MOM disk space requirement total
MOM OnePoint working set memory (average)
OnePoint threads (average)

Database and Data Workload Sizing

Table 24 shows the number of rows in the MOM database tables prior to running each test for a specific number of managed computers (from 700 to 1,000). Table 25 and Table 26 show the data workloads used in the tests for this configuration.

Table 24 Pre-test Database Table Sizes for the Enterprise Configuration

Rows in Alert table	97,833
Rows in Event table	3,303,168
Rows in SampledNumericData table	2,947,496

Table 25 Data Workload Levels per Day for the Enterprise Configuration

Workload item	For 700 managed computers	For 1,000 managed computers
Alerts per day	9,000	12,000
Events per day	400,000	600,000
Performance counters per day	400,000	600,000

Note: The data workload shown in Table 25 represents higher workloads than the levels reported by any enterprise customers during their testing of MOM SP1 Build 1300 (RTM). The MOM SP1 testing workload values for alerts, events, and performance counters were based on the results of surveys taken from the largest enterprise customers for workload traffic, and then inflated to represent peak load situations.

Table 26 Data Workload Rates per Minute per Managed Computer for the Enterprise Configuration

Managed computer count	Alerts per minute per managed computer	Events per minute per managed computer	Performance counters per minute per managed computer
700	0.00893	0.397	0.397
1,000	0.00833	0.417	0.417

Note: The values in Table 26 far exceed any numbers reported by the largest enterprise customers. The values for events and performance counters used for MOM RTM testing were higher than the MOM SP1 test scenarios. This is a result of fine-tuning the Management Packs to reduce the volume of events and performance counter traffic for MOM SP1.

In the original MOM RTM testing, the average rate of alerts delivered to the database was 0.00445 per minute per computer. For the MOM SP1 workload used in testing this configuration, the average rate was 0.00833 per minute per computer at the 1,000-managed-computer level and 0.00893 per minute per computer at the 700-managed-computer level. The alert workloads used for the MOM SP1 performance testing of this configuration are therefore approximately twice as high as the MOM RTM testing levels.
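The per-minute rates in Table 26 follow from the daily totals in Table 25 by dividing by the number of minutes in a day and by the managed computer count. The short Python sketch below reproduces that arithmetic and the comparison with the MOM RTM alert rate. The function and variable names are illustrative only and do not come from the MOM SP1 Management Server Sizer.

```python
MINUTES_PER_DAY = 24 * 60  # 1,440 minutes


def per_minute_per_computer(items_per_day: float, managed_computers: int) -> float:
    """Convert a daily workload total into a per-minute, per-computer rate."""
    return items_per_day / (MINUTES_PER_DAY * managed_computers)


# Daily workload levels taken from Table 25.
workloads = {
    700: {"alerts": 9_000, "events": 400_000, "performance counters": 400_000},
    1000: {"alerts": 12_000, "events": 600_000, "performance counters": 600_000},
}

# Alert rate reported for the original MOM RTM testing.
RTM_ALERT_RATE = 0.00445

for computers, items in workloads.items():
    for name, per_day in items.items():
        rate = per_minute_per_computer(per_day, computers)
        print(f"{computers} computers, {name}: {rate:.5f} per minute per computer")
    ratio = per_minute_per_computer(items["alerts"], computers) / RTM_ALERT_RATE
    print(f"{computers} computers: alert rate is {ratio:.2f}x the MOM RTM rate")
```

The computed ratios are close to 2 at the 700-computer level and slightly below 2 at the 1,000-computer level, which is what "approximately twice as high" refers to.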

Microsoft Operations Manager/SQL Server Test Results

These tests were performed to determine how large the DCAM and the database server should be, in terms of hardware, to perform a set level of work. Testing started at 700 managed computers and increased from there to find the upper limit.

In this series of tests, two DCAMs and the database server were monitored while managing 700 computers at the low end and 1,000 computers at the high end. The first test was conducted with 200 managed computers on one DCAM and 500 managed computers on the other, for a total of 700. The second test was conducted with 500 managed computers on each DCAM, for a total of 1,000. Table 27 depicts the effects on the two DCAMs for these test scenarios. Table 28 depicts the effect on the database server for the two test scenarios.

Table 27 Effect on the DCAMs of Additional Managed Computers - Enterprise Configuration

DCAM	Managed computer count	% CPU utilization	OnePoint service utilization	OnePoint working set average	Memory free space	Network busy
A	200/700	34.76%	38.22%	141,628,304	80.70%	0.26%
B	500/700	49.32%	53.20%	189,064,637	77.22%	0.26%
C	500/1,000	49.45%	49.98%	193,576,489	76.61%	0.25%
D	500/1,000	50.57%	51.86%	194,378,502	76.57%	0.25%

Note: Table 27 reflects the usage of four different DCAMs. In one test case, DCAM A managed 200 out of the 700 computers and DCAM B managed 500 of the 700 computers. In the second test case, DCAM C managed 500 of the 1,000 computers, and DCAM D managed 500 of 1,000 computers.

Table 28 Effect on Database Server of Additional Managed Computers - Enterprise Configuration

Managed computer count	% CPU utilization	SQL Server service utilization	SQL Server working set peak	Disk reads/sec	Disk writes/sec	Memory free space	Network busy
700	19.76%	69.67%	711,511,735	94.91	204.91	56.19%	0.46%
1,000	24.50%	76.10%	743,612,153	190.44	218.82	54.35%	0.45%

Figures 14 through 21 graphically present information about the DCAMs and the MOM database server from Table 27 and Table 28.

Figure 14: As expected, adding managed computers increases CPU utilization on the database server

Even with the 550 MHz processor that was used in these tests, the CPU utilization for 1,000 managed computers is approximately 25 percent. It is projected that the utilization for 2,000 managed computers would be less than 50 percent on a 768 MHz processor. With the new, more powerful multi-gigahertz processors, it is expected that this utilization would be drastically reduced.
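The paper does not state how the 2,000-computer projection was derived. As a rough plausibility check, the sketch below assumes that CPU utilization grows linearly with the managed computer count and extrapolates from the two database server measurements in Table 28. The linear model and the function name are assumptions made for illustration, and the result applies to the tested hardware only.

```python
def extrapolate_linear(p1, p2, x):
    """Extend the straight line through points p1 and p2 to position x."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    return y2 + slope * (x - x2)


# (managed computer count, % CPU utilization) from Table 28.
measured_700 = (700, 19.76)
measured_1000 = (1000, 24.50)

projected_2000 = extrapolate_linear(measured_700, measured_1000, 2000)
print(f"Projected database server CPU at 2,000 computers: {projected_2000:.1f}%")
```

Under this assumption the projected value stays below 50 percent. Real utilization may not scale linearly, so a figure like this should be treated as a starting point for testing rather than a sizing guarantee.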

Figure 15: Adding managed computers increases CPU utilization on the DCAMs

Figure 15 reflects the usage of four different DCAMs. For more information, see Table 27. Notice that DCAM B, DCAM C, and DCAM D, which were all managing 500 computers, had almost identical CPU utilization. These results reflect the consistency of MOM DCAMs. Also, note that the utilization for DCAM B, DCAM C, and DCAM D was approximately 50 percent on an 800 MHz computer, which is well below the 75 percent level and leaves 25 percent reserve capacity. It is expected that on the new, more powerful multi-gigahertz processors, the CPU utilization would be drastically reduced.
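The reserve capacity mentioned above is the gap between the 75 percent level and the measured CPU utilization. The following Python sketch applies that subtraction to the DCAM values in Table 27. The names used are illustrative, and the result is expressed in percentage points of total CPU.

```python
CPU_THRESHOLD_PERCENT = 75.0  # utilization level cited in the text above

# % CPU utilization per DCAM, taken from Table 27.
dcam_cpu_percent = {"A": 34.76, "B": 49.32, "C": 49.45, "D": 50.57}


def reserve_capacity(cpu_percent: float, threshold: float = CPU_THRESHOLD_PERCENT) -> float:
    """Return the remaining headroom, in percentage points, below the threshold."""
    return threshold - cpu_percent


for dcam, cpu in dcam_cpu_percent.items():
    print(f"DCAM {dcam}: {cpu:.2f}% used, {reserve_capacity(cpu):.2f} points of reserve")
```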

Figure 16: Increasing I/O affects disk performance

Note: Information about I/O activity on the DCAM was not included because it was inconsequential and only reflects the operating system and MOM activity.

For MOM SP1, there is a marked increase in read and write activity over MOM RTM. This increase is due to the additional activity caused by integrated grooming. By comparison, in the original MOM RTM testing, the total peak I/O rate for 1,000 managed computers was 59.57/sec/computer. For more information about integrated grooming, see the “Integrated Grooming” section earlier in this paper.

Figure 17: Disk queues increase as managed computers are added

Figure 17 displays queue lengths of up to 15 on the database server. The queue length on the DCAMs was zero for all test cases, so no figure is shown. In MOM RTM testing, the queue lengths were less than two. The longer queues are the result of the increased I/O activity caused by integrated grooming in MOM SP1. These disk queues could be decreased considerably by adding more disk spindles to the RAID array.

Recommendation: Use the RAID Selector section of the MOM SP1 Management Server Sizer to determine adequate spindle counts for the workloads and RAID configurations that you might want to use. The RAID Selector section takes into account that disk queue lengths should be less than two. For more information about the Sizer, see "Appendix B: MOM SP1 Management Sizer" later in this paper.
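The Sizer's internal calculation is not described in this paper. As a hedged illustration of how spindle counts relate to workload, the sketch below uses a common rule of thumb: each logical write costs a RAID-dependent number of physical I/Os (2 for RAID 10), and the total is divided by the I/O rate that one disk can sustain. The per-disk figure of 125 I/Os per second is inferred from Table 21 (750 reads per second across 6 disks) and is an assumption, as are the function and constant names.

```python
import math

# Assumption: per-disk capacity inferred from Table 21 (750 reads/sec over 6 disks).
IOPS_PER_DISK = 750 / 6
# Rule-of-thumb write penalty for RAID 10 (each write goes to both mirrors).
RAID10_WRITE_PENALTY = 2


def spindles_needed(reads_per_sec, writes_per_sec, iops_per_disk, write_penalty):
    """Estimate the minimum disk count needed to sustain the given I/O rates."""
    backend_iops = reads_per_sec + write_penalty * writes_per_sec
    return math.ceil(backend_iops / iops_per_disk)


# Database server I/O at 1,000 managed computers, from Table 28.
spindles = spindles_needed(190.44, 218.82, IOPS_PER_DISK, RAID10_WRITE_PENALTY)
if spindles % 2:
    spindles += 1  # RAID 10 uses mirrored pairs, so round up to an even count

print(f"Estimated minimum RAID 10 spindles: {spindles}")
```

This estimate covers average throughput only. It does not model queue lengths or peak bursts, which the Sizer is stated to take into account, so it should be read as a lower bound rather than a substitute for the Sizer.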

Figure 18: Free memory space is adequate at all managed computer levels

The MOM database server had approximately 55 percent free memory space with 768 MB of memory (see Table 28). Best practices would recommend that on a MOM deployment this large, memory would be set at a minimum of 2 GB. It is projected that at the recommended 2 GB memory size, SQL Server would run more efficiently.

Figure 19: Free memory space is adequate at all managed computer levels

The MOM DCAMs had between approximately 77 and 81 percent free memory space with 512 MB of memory across all tests (see Table 27). This demonstrates efficient use of memory by MOM SP1. Best practices would recommend that on a MOM deployment this large, memory would be set at a minimum of 1 GB. It is projected that at the recommended 1 GB memory size, the DCAM would run more efficiently.

Figure 20: Network utilization remains very low at all managed computer levels

As expected, and consistent with MOM RTM testing, network utilization is minimal at all managed-computer levels. As Figure 20 demonstrates, MOM SP1 does not overburden the network. In further tests, the highest utilization seen was 9 percent, during an agent pushdown, which shows that agent pushdowns result in much higher network utilization than normal monitoring traffic.

Figure 21: Network utilization remains very low at all managed computer levels

As in the case of the database server, the network utilization from the managed computers to the DCAM was consistently around 0.25 percent (see Table 27). As Figure 21 demonstrates, the usage for 200 managed computers up to 500 managed computers was about the same. This is because the workload was the same at all managed-computer levels. As noted throughout this paper, the workloads were consistently higher than any reported by customer surveys.

Best Practice: Capacity/Performance Recommendation

MOM SP1 Management Packs

These tests are designed to measure the memory usage (footprint) of the MOM SP1 Management Packs both individually and cumulatively (build-up) as they are added to a managed computer.

Test Parameters

Hardware Test Environment

Table 29 describes the system configuration for the computers used in this test. The same configuration was used for the MOM SP1 server and the three managed computers.

Table 29 Computer Configuration (MOM SP1 Server and Managed Computers)

System component	Description
Processor count	1
Processor type	600 MHz Pentium 3
Memory	256 MB
Disk count	1
Disk designation OS	Drive C
Disk size	12.76 GB
Operating system	Windows 2000 Server SP3

Software Test Environment

The software and the versions used for these tests are listed in Table 30:

Table 30 Products and Versions Used for Tests

Product name	Version or build tested
Windows 2000 Server	Service Pack 3
SQL Server 2000	RTM + Service Pack 3
MOM 2000	Service Pack 1
MOM 2000 Application Management Pack	Service Pack 1

MOM SP1 Configuration

MOM SP1 was configured as a single configuration group, with three managed computers and with all MOM components installed on a single server.

The software that the test team installed on the MOM SP1 server is as follows:

MOM SP1
MOM SP1 Application Management Pack
SQL Server 2000 SP3
Internet Information Server
Terminal Services
Anti-virus software (eTrust)

After installing MOM SP1, the test team created three performance processing rules to capture performance data from the managed computers. Details of these custom performance processing rules are listed in Table 31. After creating the performance processing rules, the test team created three Public views to chart this information.

Table 31 Custom Performance Processing Rules

Rule name	Provider
Performance-Private Bytes-OnePointService Agent	Process-Private Bytes-OnePointService-10-minutes
Performance-% Processor Time-OnePointService Agent	Process-% Processor Time-OnePointService-10-minutes
Performance-Working Set-OnePointService Agent	Process-Working Set-OnePointService-10-minutes

Managed Computers Configuration

The managed computers were a basic Windows 2000 Server configuration. The only additional service or product installed on the managed computers was the eTrust anti-virus software.

Agent Installation and Configuration Process

The test team installed each agent by adding the computer name to the Agent Manager, and then approving the installation of the agent to the managed computers. The installation of the agent was verified by viewing the All Agents view on the MOM SP1 server and by checking for the OnePointService process on each managed computer.

After installing each agent, the custom performance processing rules were enabled for graphing by selecting each computer in the Recent Performance view and enabling the counters for graphing.

For each test case, the managed computers were placed in several default computer groups. The computer groups that were common to each test variation are listed in Table 32.

Table 32 Common Computer Groups

Hardware Attributes – Number of Processors
Hardware Attributes – CPU Vendor
Hardware Attributes – CPU speed
Hardware Attributes – CPU Identifier
Hardware Attributes – BIOS Version
Hardware Attributes – BIOS Date
Microsoft Operations Manager Agents

When adding Management Packs to the agents, the managed computers were explicitly added to the computer groups for each Management Pack. This ensured that the Management Pack was deployed to the managed computer.

Test Cases

Management Pack Memory Build-up Tests

This series of tests is designed to measure the cumulative agent memory footprint as Management Packs are added to a managed computer. For each test case, the Management Packs were added to the agent computers in the order listed. Performance metrics were collected on the OnePointService process by using the custom performance processing rules listed in Table 31 earlier in this paper.

Test Case 1: Windows Management Pack