Chapter 2 - Capacity and Availability Management

Article
02/20/2014

Archived content. No warranty is made as to technical accuracy. Content may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist.

Updated : October 11, 2001

Microsoft® Exchange 2000 Server

This chapter is part of the Exchange 2000 Server Operations Guide.

To continue to meet service level agreements (SLAs), Microsoft® Exchange 2000 Server must be appropriately configured as the load on the system increases. In particular, you need to ensure that servers running Exchange are sized correctly as the load on the system increases, and that unplanned downtime is kept below the levels defined in the SLA. In addition, when necessary, you will need to upgrade hardware to continue to meet the requirements that have been defined. This chapter looks at the capacity and availability management tasks that you should consider performing as Exchange usage increases.

Introduction

In the vast majority of cases, the load on your Microsoft® Exchange 2000 Server computers will increase over time. Companies increase in size, and as they do, the number of Exchange users increases. Existing users tend to use the messaging environment more over time, not only for traditional e-mail, but also for other collaborative purposes (for example, voicemail, fax, instant messaging, video conferencing). The load on the messaging environment will also vary over the course of the day (for example, there may be a morning peak) and could vary seasonally in response to increased business activity.

The aim of your operations team should be to minimize the effect of the increased load on your users, at all times keeping within the requirements set by your service level agreement (SLA). You will need to ensure that existing servers running Exchange are able to cope with the load placed upon them (and upgrade hardware if appropriate).

Another important requirement of the operations team is to minimize system downtime at all times. The level of downtime your organization is prepared to tolerate needs to be clearly set out in the SLA, separated into scheduled and unscheduled downtime. Many organizations can cope perfectly well with scheduled downtime, but unscheduled downtime almost always needs to be kept to a minimum.

Exchange 2000 is predominantly self-tuning, but there are areas where tuning your servers running Exchange will result in an improvement in performance. It is important to identify these areas and tune where appropriate.

Inevitably there will come a point where the load on your servers running Exchange is such that hardware upgrades are required. If you manage this process effectively, you can significantly reduce the costs associated with upgrading.

Chapter Sections

This chapter covers the following procedures:

Capacity management
Availability management
Performance tuning
Hardware upgrades

After reading this chapter, you will be familiar with the requirements for capacity and availability management in an Exchange 2000 environment and the steps necessary to ensure that the requirements of your SLA are met.

Capacity Management

Capacity management is the planning, sizing, and controlling of service capacity to ensure that the minimum performance levels specified in your SLA are exceeded. Good capacity management will ensure that you can provide IT services at a reasonable cost and still meet the levels of performance you have agreed with the client.

For more complete information on the capacity management process, visit:

https://www.microsoft.com/technet/itsolutions/cits/mo/mof/default.mspx

This section will help you meet your capacity management targets for an Exchange 2000 environment.

Of course, whether an individual server reaches its SLA targets will depend greatly upon the functions of that server. In Exchange 2000, servers can have a number of different functions, so you will need to ensure that you categorize servers according to the functions they perform and treat each category of server as an individual case. In particular, do not consider servers purely in terms of the number of mailboxes they hold.

When you are looking at the capacity of a server running Exchange, consider the following:

How many mailboxes are on the server?
What is the profile of the users? (Light, medium, or heavy use of e-mail; do they use other services, such as video-conferencing?)
How much space do users require for mailboxes?
How many public folders are on the server?
How many connectors on the server are on the server?
How many distribution lists are configured to be expanded by the server??
Is the server a front-end server?
Is the server a domain controller/Global Catalog server? (generally not recommended)

Generally, the more functions a server has, the fewer users that server will be able to support on the same hardware. To gain the maximum capacity from your servers, consider having servers dedicated to a specialized function. In many cases your planning will have resulted in specialized hardware for specialized functions, for example, in the case of front-end servers.

To ensure that you manage capacity appropriately for your Exchange 2000 server, you need a great deal of information about current and projected usage of your server running Exchange. Much of this information will come from monitoring. You will need information about patterns of usage and peak load characteristics. This information will need to be collected on a server-by-server basis, because a problem with a single server in an Exchange 2000 environment can result in a loss of performance for thousands of users. The performance of your network is also critical in ensuring delivery times and timely updating of Exchange directory information.

The main areas you should monitor to ensure that your servers running Exchange exceed your SLAs' capacity requirements include the following:

CPU utilization
Memory utilization
Hard-disk space used
Paging levels
Network utilization
Delivery time within and between routing groups
Delivery time to and from foreign e-mail systems within your organization
Delivery time to and from the Internet (although this depends greatly on minute-by-minute performance of your connection to the Internet and the availability of bandwidth to other messaging environments)
Time for directory updates to complete

You will find more information on monitoring in Chapter 4.

It is fairly common to choose the size the disks of a server running Exchange based on how many mailboxes you plan to have on the server multiplied by the maximum allowable size of each mailbox. Using this approach, however, will generally not help you to meet your SLAs. You should strongly consider approaching this problem from a different perspective. When determining the capacity of your server running Exchange, consider basing it on the time it takes to recover a server from your backup media. Recovery time is generally very important in organizations because downtime can be extremely costly. If you are using a single store on your server running Exchange, use the following procedure to help you size it.

Divide the recovery time of a database (defined in your SLA) by half. Around half the recovery time will generally be spent on data recovery, the rest on running diagnostic tools on the recovered files, database startup (which includes replaying all later message logs) and making configuration changes. Of course, this is only a general figure—the longer you leave for recovery time, the smaller the proportion of that time is required for configuration changes.
Determine in a test environment how much data can be restored in this time.
Divide this figure by the maximum mailbox size you have determined for the server (again listed in your SLA). This will give you the number of mailboxes you can put on the database.

As an example, assume that your SLA defines a recovery time of four hours for a database. In testing, your recovery solution can restore 2 gigabytes (GB) of Exchange data per hour and your maximum mailbox size is 75Mb.

Using the preceding procedure, the following calculations can be done.

The recovery time divided by 2 is 2 hours.
4Gb of Exchange data can be restored in this time
4Gb divided by 75Mb is 54 Mailboxes

In this example, if you wanted to provide more mailboxes per server, you would either have to a) alter your SLA to increase recovery time or b) find a faster restore solution.

Each server running Exchange can support up to 20 stores, spread across 4 storage groups (in Enterprise edition). If your server running Exchange is configured with multiple stores, recovery times can be more difficult to calculate. Stores in the same storage group are always recovered in series, whereas ones in separate storage groups can be recovered in parallel. You may also have created multiple stores to allow you to offer different SLAs to different categories of user (for example, you might isolate managers on one store so that you can offer them faster recovery times than the rest of the organization.) If you do have multiple stores, you will need to consider the SLA on each store and the order in which stores will be recovered to accurately determine recovery time.

As a result of the first two steps in the preceding calculations, you will have a figure for the maximum amount of data located in your information stores. Generally, you should at least double this to determine the appropriate disk capacity for the disks containing your stores. This will allow you to perform offline maintenance much more quickly as files can be quickly copied to a location on the same logical disk.

By using a key SLA to define your capacity, you are creating an environment in which you are far more likely to meet the targets you set.

Sizing servers to meet SLAs is crucial, but servers must also meet user performance expectations. Using Microsoft and third-party tools, ensure that your predicted user usage will be accommodated on the servers.

If you have sized your database according to the techniques mentioned here, you should be able to ensure that your database is kept to a manageable size. However, keeping an eye on the size of your Exchange 2000 databases is still important. In a large enterprise, it is typical for users to be moved from one server to another quite often, and for users to be deleted. This can result in significant fragmentation of databases, which results in large database sizes, even if you do keep the number of mailboxes below the levels you have determined.

To deal with this problem, you should continually monitor available disk space on your servers running Exchange. If the RAID array containing the stores gets close to half full, an alert should be sent indicating the problem, and that the Exchange Database might need to be defragmented offline. To do this, perform an alternate server restore (see Chapter 5 for details) and then defragment the database on this alternate server. If this is successful and results in a significant reduction in database size, you can perform the defragmentation at the next scheduled maintenance time.

Performing the alternate server restore also has the advantage of ensuring that your backup and restore procedures are working effectively. You should check this regularly in any case. This is also covered in more detail in Chapter 5.

Probably the most important thing to remember when performing capacity planning is to size conservatively. Doing so will minimize availability problems, and the cost reduction will generally more than compensate for any excess capacity costs.

As well as looking at technical issues, you will need to examine staffing levels when you are capacity planning. As your Exchange 2000 environment grows, you might need more people to support the increased load. In particular, if there are more users requiring increased services, there is likely to be a greater need for help desk support.

Availability Management

Availability management is the process of ensuring that any given IT service consistently and cost-effectively delivers the level of availability required by the customer. It is not just concerned with minimizing loss of service, but also with ensuring that appropriate action is taken if service is lost.

For more complete information on the availability management process, visit:

https://www.microsoft.com/technet/itsolutions/cits/mo/mof/default.mspx

One of the main aims of Exchange 2000 operations is to ensure that Exchange is available as much as possible and that both planned and unplanned interruptions to service are minimized. Availability in an organization is typically defined by your SLAs in two ways—service hours and service availability.

Service Hours

These are the hours when the Exchange services should be available. Typically, for a large organization this will be all but a very few hours a month. Defining your service hours allows you to create defined windows when offline maintenance of your servers running Exchange can be performed without breaching the terms of your SLA.

You might choose to define in the SLA the exact times when Exchange services might be unavailable. For example, you might state that Exchange services might be unavailable for four hours every first Saturday of the month. However, in large organizations it is often more practical to commit to, for example, no more than four hours of scheduled downtime per month, with a week's notice of any scheduled change. This allows changes to be made much more easily across the organization, at times when the right staff can be devoted to the tasks.

Of course, just because you have allowed for a certain amount of downtime per server per month, this does not mean that you have to use it, and in most cases you will not. On the other hand, just because you haven't performed offline maintenance one month does not mean that the hours can be carried over to the following month. Your user community will be very unhappy if you take a system down for 2 days, even if it has been up solidly for 2 years!

You might wish to define different service hours for the different services available in Exchange (mail, public folders, etc). This would depend on the amount of offline maintenance that is typically required for each service. For example, you might determine that your SMTP bridgehead servers and firewall servers never require offline maintenance and so might set the level of service hours for mail delivery significantly higher than for mailbox access. If you are prepared to spend the appropriate money on resources, it is very possible to achieve extremely low levels of scheduled downtime, and this can be reflected in your SLA.

Service Availability

Service availability is a measure of how available your Exchange services are during the service hours you have defined. In other words, it defines the levels of unscheduled downtime you can tolerate within your organization. Typically levels of availability in an SLA of an enterprise are between 99.9 and 99.999 percent. This corresponds to a downtime of as much as 525 and as few as 5 minutes per service per year.

Of course, ANY unscheduled downtime is inconvenient at best, and very costly at worst, so you need to do your best to minimize it.

To ensure high levels of availability, you need to consider two key questions:

How often, on average, is there downtime for a service?
How long does it take to recover the service if there is downtime?

Once you have considered these questions, you can set about minimizing the number of times a service fails and the time taken to recover that service.

Availability management is intrinsically linked with capacity management. If capacity is not managed properly, then overloaded servers running Exchange might fail, causing availability problems. A classic example of this would be running out of disk space on a server running Exchange, which would result in the databases shutting down and in users losing a number of services.

Minimizing System Failures

To minimize the frequency of failure in Exchange 2000, you need do the following:

Decrease single points of failure
Increase the reliability of Exchange 2000 itself.

Decreasing Single Points of Failure

You can maintain availability in Exchange 2000, even in the event of a failure, provided you ensure that it is not a single point of failure. In some areas, such as database corruption, it is not possible to eliminate single points of failure, but in many cases you can guard against individual failures and still maintain reliability. An obvious example is the directory. By having multiple domain controllers and Global Catalog servers available in any part of your network, you maintain availability of Exchange even in the event of failure of a particular domain controller or Global Catalog server. Having local domain controllers or Global Catalog servers keeps Exchange available in the event of a non-local network failure.

Using front-end servers is another way to avoid single points of failure. The failure of a single front-end server will have no effect on the availability of Exchange to non-MAPI clients. The clients will simply be rerouted to another front-end server, with no loss of service.

Exchange 2000 routing can be modified to minimize single points of failure. In particular, you can modify Routing Group connectors to ensure that there are multiple bridgeheads available, and thus maintain delivery from one part of the organization to another. You can also set up Routing Group meshes, which consist of a series of fully interconnected Routing Groups with multiple possible routes between them.

Multiple messaging routes between servers are useless if they all rely on the same network connections and the network goes down. You should therefore ensure that there are multiple network paths (using differing technologies) that Exchange and Windows 2000 can use.

One of the most significant single points of failure is a mailbox server. This can affect very large numbers of users, depending on the server. Mailbox servers can be clustered to ensure their continued high availability. If you are running Exchange 2000 on Windows 2000 Advanced Server, you can cluster over two nodes and you have two possible ways to cluster the servers—active/passive and active/active. Active/passive clustering is the current recommended clustering implementation for Exchange. If you choose to implement active/active clustering, you should realize that it requires careful planning to ensure that Exchange can fail over correctly to the other node. With Service Pack 1 of Exchange 2000 and Windows 2000 Datacenter server, you can have four nodes in your cluster. In this implementation consider active/active/active/passive clustering.

In a standard clustered environment, however, the disk array is still the single point of failure, so you should think seriously about using a storage area network (SAN) to maximize the availability of all your servers running Exchange.

If you are creating truly redundant Exchange 2000 servers, you shouldn't stop at the disk subsystem. Your servers should be equipped with redundant RAID controllers, network interface cards (NICs), and power supplies. In fact, you should aim to have redundancy everywhere.

Single points of failure can also be created by improper maintenance of systems. For example, if you are using a RAID 5 array on a server running Exchange with a hot spare, the disk subsystem becomes a single point of failure once that hot spare is invoked. If you have robust systems in place, you must ensure that any failures are resolved promptly. Make sure that you have notification and monitoring procedures in place and a system for resolving problems.

Remember that Exchange relies on Active Directory and Global Catalog servers to function. If no domain controllers are available to a server running Exchange, stores will dismount. If Global Catalog servers are unavailable, Exchange clients will not function (as they require a Global Catalog server to access the Global Address List). You should minimize single points of failure on these servers as much as possible, or at least ensure that you have redundant servers in every location.

Finally, do not forget non-computing issues. You can have the most robust e-mail system in the world and then find that it falls apart due to a fire in a building, a power failure, or theft of server hardware or data. You should take precautions against all these possibilities. This would include ensuring that you have taken the following into account:

Good physical security
Protection from fire
Protection from flooding
Concealed power switches
Air conditioning
UPS systems
Alternate power generation

You will need to make sure that all of these services are in place and that you have defined a drill to deal with their failure. For example, you should ensure that there are personnel on call for all emergency systems.

You might also wish to house your servers running Exchange in separate locations from one another to help reduce the impact of such events.

For more information about protecting yourself from these problems, check out the MOF Web site:

https://www.microsoft.com/technet/itsolutions/cits/mo/mof/default.mspx

Good availability management is intrinsically linked with good change and configuration management. If you manage change and configuration well, you are well positioned to have good availability of your servers running Exchange. You will learn more about change and configuration management with Exchange 2000 in Chapter 3.

Increasing the Reliability of Exchange 2000

While Exchange 2000 is a very robust messaging system, like any product, there are configurations that in particular cases could result in a loss of reliability. In an Enterprise environment, it is important to guard against these difficulties by continually monitoring Exchange. For more detail on monitoring, see Chapter 4.

One area where you can guard against problems is database errors. Database errors can be caused by a number of factors, but they are typically hardware related. You will be able to minimize these by doing the following:

Ensure that your hardware is on the Hardware Compatibility List
Checking Event Viewer for database-related errors
Periodically running the Information Store Integrity Checker (isinteg.exe) on the database to check for errors

Part of your maintenance program should also include routinely searching the Microsoft Web site (https://www.microsoft.com/exchange) for any issues that need to be resolved by patches and/or service packs. The patches and service packs will be tested and recorded as part of your change-management program, which is covered in more detail in Chapter 3.

Minimizing System Recovery Time

To recover from failure in an Exchange 2000 environment as quickly as possible, you need to be thoroughly prepared. You will need the following:

Available hardware
Complete configuration information
A recent, working backup
An effective disaster-recovery procedure
Fast access to support resources
Staff availability to perform the restore

System recovery is covered in more detail in Chapter 5.

Performance Tuning

When you tune for performance, you are aiming to reduce your system's transaction response time. Performance tuning can take a number of forms, including the following:

Balancing workloads between servers
Balancing disk traffic on individual servers
Using memory efficiently on servers
Upgrading hardware

The most effective factor in improving performance comes from upgrading the hardware on your servers that are running Exchange. However, regardless of the hardware, there are a number of software changes that you can make to maximize the efficiency of Exchange 2000.

While obtaining the best performance from your Exchange 2000 computers is always an important goal, it is crucial to be cautious in your tuning changes. You should track all alterations in case you make a change that inadvertently reduces performance. Making one change at a time makes it easy to identify which change needs to be reversed.

Customer variation is probably the greatest variable in tuning Exchange for optimum performance. Even customers with similar needs often choose solutions that differ significantly. Hardware varies in the number and speed of the processors, the available RAM, the number of disks, and the disk configuration chosen (RAID level).

Exchange 2000 can be configured with different numbers of storage groups and databases, and can be clustered with other servers accessing a central storage subsystem. The server load will vary based on the total number of users with mailboxes on the server, the number of users logged on at a given time, the actions they are performing, and any additional load imposed by the routing of outside messages through the server.

Whenever you are doing performance tuning, you should consider the cost of extensive analysis versus the benefits you expect to get from the tuning. Put simply, if you need to analyze an individual server extensively to gain a 5 percent performance gain, it is probably not worth it, since you could easily spend a fraction of the money on buying better hardware. Not only that, but in some cases performance tuning might become ineffective as the load on the server increases, meaning further analysis might be required after a change has been made. For this reason, this guide does not cover extensive performance analysis; instead it concentrates on the performance tuning changes that are easy to identify. This usually involves modifying settings in the registry.

Making Changes to the Registry

Before you edit the registry, make sure that you understand how to restore it if a problem occurs. For information on how to do this, view the "Restoring the Registry" Help topic in Registry Editor (Regedit.exe) or the "Restoring a Registry Key" Help topic in Regedt32.exe.

Warning: Using Registry Editor incorrectly can cause serious problems that might require you to reinstall your operating system. Microsoft cannot guarantee that problems resulting from the incorrect use of Registry Editor can be solved. Use Registry Editor at your own risk.

If you are unfamiliar with the registry and how to change it, consult an expert. Even if you are very familiar with the registry, you should always carefully document the changes that you make, and monitor your system after each change.

For information about how to edit the registry, view the "Change Keys and Values" Help topic in Registry Editor (Regedit.exe) or the "Add and Delete Information in the Registry" and "Edit Registry Information" Help topics in Regedt32.exe. Note that you should back up the registry before you edit it. If you are running the Microsoft Windows NT® or Microsoft Windows® 2000 operating system, you should also update your emergency repair disk (ERD).

No Performance Optimizer

The Performance Optimizer (also known as PerfWiz) is an Exchange 5.5 tool that enables you to specify how an Exchange 5.5 computer is to be configured—for example as a private store or a public store server. In addition, you can limit an Exchange 5.5 server's memory usage and specify how many users it would be expected to handle. Based on your choices, files written by the various Exchange components (information store, message transfer agent [MTA], and so forth) can be assigned to specific fixed disks, depending upon available storage.

Exchange 2000 does not use a tool like Performance Optimizer. One reason for this is that the new release of Exchange is better capable of performing certain tasks as they are required, such as dynamically changing certain parameters, spinning up more threads, and so on. To go beyond these dynamic changes, the administrator must manually optimize disk utilization and manually modify registry keys, but fewer such registry changes are necessary.

Optimizable Features

Features in Exchange 2000 that can be optimized include:

Disks
Message transfer agent (MTA)
Simple Mail Transfer Protocol (SMTP)
Information-store database
Extensible storage engine (ESE) cache and log buffers
Active Directory™ service connector
Active Directory integration
Installable file system (IFS) handle cache, credentials cache, and mailbox cache, DSAccess cache and DSProxy
Post Office Protocol v3 (POP3) and Internet Mail Access Protocol (IMAP) settings
Outlook Web Access (OWA)

Tuning Considerations

The efficiency and capacity of Microsoft Exchange 2000 depends on the administrator's choices of server and storage hardware, and on the installation's topology. These should be chosen based on expected types and levels of usage. Exchange can be made more efficient through changes to various registry settings on the Exchange computer.

There are three main types of tuning parameters:

Those with fixed optimal values (or values that can be treated as such)
Those that can be dynamically tuned by the software
Those that must be manually tuned (using setup, the exchange system manager, the registry, and the Active Directory Services Interface Edit tool ADSIEdit)

Some parameters may need to be manually tuned for the following reasons:

Hardware or Exchange configuration information may be needed and this information cannot or will not be obtained dynamically.
Server load information may also be required; this cannot be obtained dynamically either.

Upgrading from Exchange 5.5 to Exchange 2000

When Exchange 5.5 is upgraded to Exchange 2000, some registry keys altered by PerfWiz retain their PerfWiz values, some do not, and some keys no longer appear in the registry or they appear in a different location. This means that there might be significant differences between an Exchange 2000 Server that has been upgraded from Exchange 5.5 and one that is a new installation, installed on new hardware. Because there are significant performance improvements with Exchange 2000, and because optimizations do not necessarily transfer from Exchange 5.5, it is best to start from scratch in evaluating the optimization of Exchange 2000. It is useful in this process to know your Exchange 5.5 settings before upgrading to Exchange 2000. The text file WINNT\System32\perfopt.log provides a record of those registry keys and disk assignments changed by PerfWiz.

Tuning the Message Transfer Agent (MTA)

As mentioned earlier, Exchange 2000 does not include a Performance Optimization wizard, mainly because the majority of Exchange 2000 components are self-tuning. However, when the MTA is installed, its tuning state reflects that of an Exchange 5.5 computer that has never been performance optimized.

In scenarios where an organization only has servers running Exchange 2000, the MTA does not perform any processing, and so does not need to be performance tuned. However, when your servers co-exist with X.400-based messaging systems and other foreign systems (such as Lotus cc:Mail, Lotus Notes, Novell GroupWise, and Microsoft Mail) the MTA might be used heavily and you should consider tuning the MTA registry parameters. You will also need to tune the MTA if there is substantial co-existence with Exchange 5.5 servers. These areas are beyond the scope of the Exchange 2000 Operations Guide. If you need to tune your Exchange 2000 MTA, consult the deployment section of the Exchange 2000 Server Upgrade Series, available on the following Web site:

https://www.microsoft.com/technet/prodtechnol/exchange/2000/deploy/upgrademigrate/series/default.mspx

Tuning SMTP Transport

When messages arrive into Exchange 2000 through the SMTP protocol, the data is written to disk in the form of an NTFS file (.EML). By default, these files are written to a directory (drive:\Program Files\Exchsrvr\mailroot) on the same disk partition as the Exchange 2000 binaries.

Mailroot Directory Location

Under certain scenarios, such as configuring a bridgehead server, a positive performance effect can result if the SMTP mailroot directory is located on the fastest disk partition on the computer. If you determine that the mailroot directory is not on the most optimal disk partition, you can relocate the folder by following these steps:

Note: If you are performing this procedure on an Exchange 2000 Server active/passive cluster, perform the following steps on the node that has the Exchange 2000 group online.

From your computer, log on to the domain using an account with enterprise admin permissions.
Install the support tools from the \Support\Tools folder on the Windows 2000 CD-ROM onto your computer (this does not need to be installed on the computer running Exchange 2000).
Shut down the Microsoft Exchange System Attendant and World Wide Web publishing service on the Exchange 2000 computer that you want to change.

Important: Explore the installation drive for the data store and make a backup copy of the Exchsrvr\mailroot directory (the default location for this directory is \Program Files\Exchsrvr\mailroot)

Note: If you perform the following step on a clustered Exchange 2000 server, you will need to first start Cluster Administrator and set the Exchange group to offline.
Move the VSI 1 directory (and all subfolders and content) under Exchsrvr\mailroot to the desired location.

Note: Do not move the actual mailroot directory itself.
Click Start, point to Programs, point to Windows 2000 Support Tools, point to Tools, and then select ADSI Edit.
Expand the Configuration Container Naming Context of Active Directory.
Navigate to the following path: Configuration Container\ CN=Configuration, CN=Services, CN=Microsoft Exchange, CN*=<organization*>,

CN=Administrative Groups, CN*=<admin group*>, CN=Servers, CN*=<server*>, CN=Protocols, CN=SMTP, CN=1.
Right-click the CN=1 object, and then choose Properties.
Select Both from the Select which properties to view drop-down list.

Adjust the paths of the following attributes to the appropriate subdirectories under the VSI 1 directory.
- msExchSmtpBadMailDirectory
- msExchSmtpPickupDirectory
- msExchSmtpQueueDirectory
  
  After editing each attribute, click Set.
Click OK.
Wait for Active Directory replication to replicate these changes to the rest of your forest (or at least the domain controller or Global Catalog servers that your Exchange 2000 computer is referencing).
Start the Microsoft Exchange System Attendant service. This will copy changed paths from the Active Directory into the metabase. In less than one minute after initialization, you should notice three 1005 application events (Source: MSExchangeMU, Category: General) indicating that the paths in the metabase were updated successfully.
Restart the Exchange 2000 computer.

SMTP File Handles

When the Exchange 2000 SMTP stack receives a new message, it writes the contents to a file on an NTFS partition. While the message is being processed (that is, waiting for the next hop or delivery point) a file handle is held open by the operating system. By default, SMTP is constrained to a maximum of 1,000 open file handles. This restriction is put in place to prevent out-of-memory problems in kernel memory and to ensure that the SMTP service shuts down in a relatively short period of time (upon shutdown, all buffers have to be flushed and all file handles released).

On servers with large amounts of memory (over 1 GB), you can raise the SMTP handle threshold. Each message that is open (being processed) holds a handle and uses 5 kilobytes (KB) of kernel memory and 10 KB of memory inside the INETINFO process. When you raise the threshold, more messages can be open, which enables SMTP to process a large queue at a faster rate. However, if the total number of messages in the SMTP queues is less than 1,000, this adjustment will not improve performance. Therefore, raise the value only if your server is heavily loaded and you consistently see large queues.

If you increase this value, you should decrease the maximum installable file system (IFS) handles value to avoid running out of kernel memory when there is a large queue. When your server becomes low on kernel memory, your system becomes unresponsive. To regain control of your server, you must restart it to free up the kernel memory.

Table 2.1 shows the registry parameters you might need to alter if you are to make performance gains on servers with more than 1 GB RAM.

Table 2.1 Registry Parameters to Alter for Large Servers

Location	Parameter	Default Setting	When to Change	Recommended Setting
HKEY_LOCAL _MACHINE\System\Current ControlSet\Services\SMTP SVC\Queuing	MsgHandleThreshold (REG_DWORD)	Not present, but defaults to 0x3e8	To gain additional performance when message queues are consistently greater than 1,000.	Enough to accommodate the total number of messages in the queues at any one time. You should not raise the value to greater than 15,000 decimal.
HKEY_LOCAL _MACHINE\System\Current ControlSet\Services\SMTP SVC\Queuing	MsgHandleAsyncThreshold (REG_DWORD)	Not present, but defaults to 0x3e8	To gain additional performance when message queues are consistently greater than 1,000.	Set to the same value as "MsgHandleThreshold."
HKEY_LOCAL _MACHINE\System\Current ControlSet\Services\Inetinfo\Parameters	FileCacheMaxHandles (REG_DWORD)	Not present, but defaults to 0x320	If the "MsgHandleThreshold" registry parameter value is increased from defaults.	0x258 (600)

Note: After adjusting the registry parameters shown here, you must restart the Exchange 2000 computer.

SMTP Max Objects

MaxMessageObjects is a registry parameter that correlates to the number of messages that can be queued up at a given time by SMTP. Each e-mail message resident in the SMTP queue uses at least 4 KB of memory; therefore, it is possible to run into a low memory situation with a very large queue. Setting MaxMessageObjects lower reduces the maximum number of messages that can reside in the queue, thus decreasing the maximum memory footprint for SMTP. After this limit is reached, each SMTP connection made to the server will return with an out-of-memory error. For example, if MaxMessageObjects is reduced to 10,000, SMTP will refuse any inbound mail after the queue reaches 10,000 messages.

You may need to alter the following registry entry if the Exchange 2000 computer is running out of memory because the number of incoming messages is too great for the server to process:

Location. HKEY_LOCAL_MACHINE \Software \Microsoft \Exchange \Mailmsg
Parameter. MaxMessageObjects (REG_DWORD)
Default setting. Not present but defaults to 0x186a0

Store and ESE Tuning

Due to advances in the store process, the Exchange information store requires very little manual tuning. However, better performance can be achieved with good implementation and configuration.

Online Store Maintenance

The store requires periodic online maintenance to be run against each database. By default, each database is set to run online maintenance between the times of 1:00 A.M. and 5:00 A.M. Online maintenance performs a variety of tasks necessary to keep the store in good health. These include, but are not limited to, the following:

Checking Active Directory to determine if there are any deleted mailboxes.
Removing any messages and mailboxes that are older than the configured retention policy.
Performing online defragmentation of the data within the database.

All of the operations performed by online maintenance have specific performance costs and should be understood in detail before implementing an online maintenance strategy.

Active Directory Checking

This consists of a Windows 2000 Active Directory service lookup for each user in the database. The more users you have in each database, the more Active Directory searches (using Lightweight Directory Access Protocol [LDAP]) will be made. These searches are used to keep the store in sync with any Active Directory changes (specifically to check for deleted mailboxes). The performance cost of this task is negligible on the server running Exchange but can be significant for domain controllers, depending on the number of users, number of databases, and the online maintenance times of each database. In a corporate scenario, the online maintenance occurs during the night (by default) when very few users are logged on, so the load on the Active Directory servers should be very low. The extra domain controller load created by online maintenance should not be a problem in this scenario.

If Exchange 2000 is installed in a global data center, serving customers from multiple time zones, the default online maintenance time could become an issue. The effect that online maintenance has on Active Directory is proportional to the number of users in each of the server's databases. A check for a deleted mailbox is performed against each user in a database. Thus, if you have 10,000 users in a database, it will perform 10,000 LDAP searches against Active Directory at the beginning of that database's online maintenance. If Active Directory servers are under moderate load at all times, it is necessary to stagger the online maintenance (set each database to start maintenance at a different time on the server-configuration object). This is especially critical if you have hundreds of thousands of users spread across dozens of servers and hundreds of databases.

Message Deletion and Online Defragmentation

These are very disk-intensive tasks and only affect the server on which the maintenance is being run. During this portion of online maintenance, the server might be perceived by users as sluggish if many databases are set to perform online maintenance at the same time. Again, in corporate scenarios this would occur at night where the server can easily handle the extra load. In a global data center, it might be a good idea to stagger the databases (in respect to each other on a single server) to spread the disk-intensive tasks over a greater amount of time.

Defragmenting the database consists of 18 separate tasks. After a task has started, it must complete fully before the process exits. Therefore, online maintenance can run over the time window. The next task will execute during the next online maintenance window. Depending upon the run window and the backup schedule, it might take a number of days before a full defragmentation completes.

Online Backups

Online backups complicate online maintenance even further. Backing up an Exchange 2000 database halts the maintenance of any database within that storage group, although it will restart if the backup is finished before the maintenance interval has been passed. If you have two databases in a single storage group and one is running online maintenance, the online defragmentation on the database that is running online maintenance will stop when a backup is started against either database.

It is critical that the backup time for any database within a storage group does not conflict with the maintenance times of any database within the same storage group. If it does, backup will terminate the online defragmentation portion of the store online maintenance and the database might never finish defragmenting.

Choosing the Correct Maintenance Strategy

The correct online maintenance strategy can be devised by examining the typical behavior of the user community; by knowing how many users, databases, and servers are in the site; and by coordinating this information with the online backup strategy.

Here is an example of an online store maintenance schedule for a corporate Exchange 2000 mailbox server that hosts users who are in a single time zone:

First Storage Group

Database One

Online maintenance runs between 9:00 P.M. and 1:00 A.M.

Database Two

Online maintenance runs between 9:30 P.M. and 1:30 A.M.

Database Three

Online maintenance runs between 10:00 P.M. and 2:00 A.M.

Online Backup begins at 2:00 A.M.—it backs up all databases in the first storage group when all of them have finished online maintenance.

Second Storage Group

Database Four

Online maintenance runs between 10:30 P.M. and 2:30 A.M.

Database Five

Online maintenance runs between 11:00 P.M. and 3:00 A.M.

Database Six

Online maintenance runs between 11:30 P.M. and 3:30 A.M.

Online Backup begins at 3:30 A.M and backs up all databases in the second storage group when all databases have finished online maintenance. This configuration staggers the Active Directory LDAP queries generated by online maintenance, which are performed in the first minutes of the procedure, and ensures that online backup does not interfere with online defragmentation.

Extensible Storage Engine (ESE) Heaps

When Exchange 2000 is installed on servers with more than four processors, you might notice high virtual memory usage by the Extensible Storage Engine (ESE) multi-heap. This can lead to performance problems, especially when the server has more than one GB of memory, and many databases and storage groups have been configured. It is recommended that you add the following registry parameter to all servers with more than four processors:

Location:	HKEY_LOCAL_MACHINE \SOFTWARE \Microsoft \ESE98 \Global \OS \Memory
Parameter:	MPHeap parallelism (REG_SZ)
Default setting:	(Doesn't exist) = Parallelism set to four times the number of processors installed.
Recommended setting:	For up to four processors, take no action. If the server has four or more processors and has the maximum number of storage groups and MDBs, set the value to "0." When set to zero, the parallelism is set to three plus the number of processors on the computer. For example, on eight-processor computers, it is recommended that this registry key be set to "11."

Note: You must restart the Exchange information store process after the preceding registry parameter has been changed.

Store-Database Cache Size

Exchange 2000 is configured with a hard-coded maximum store-database cache size. This default value is 900 MB. On servers with more than 2 GB of memory, it can be beneficial to increase the size of this cache. Due to virtual address space limitations, this value should never be set higher than 1200 MB.

Note: The 900-MB limit is in place to ensure that the store process always has ample virtual address (memory) from which to allocate. Increasing this value too much can lead to system instability. For more information regarding virtual address space, see Knowledge Base article 266096 available at:

https://support.microsoft.com/default.aspx?scid=kb;en-us;266096&sd=tech

Factors which affect the virtual address-space size in the Store.exe process include the following:

Initial allocation on start-up
Number of storage groups and databases on the server
Number of threads running
Size of the store-database cache

Prior to increasing the maximum cache size, it is recommended that you use the Windows 2000 performance monitor to monitor the memory of the server under normal load. You should monitor the following:

Performance Object: Process
Counter: Virtual Bytes
Instance: STORE

This will give you an accurate value for the virtual address space that the store has allocated. On a server with the /3GB setting in the boot.ini, this should be below 2.8 GB. On a server without the /3GB setting in the boot.ini, this should be below 1.8 GB (it is recommended that servers with 1 GB or more of memory have the /3GB switch added to boot.ini).

If you see values that are higher than this for either configuration, do not increase the size of your max cache size. If you see values that are lower than this for either configuration, you can safely increase the size of your database max cache size. That is, if you have a /3 GB configured server and the performance monitor shows the virtual bytes count at 2.5 GB under heavy load, then you know you are safe to increase your max cache size by 300 MB from the default 900 MB or to 1,200 MB total.

To modify the store-database cache size, you will need to use the ADSI Edit tool, which is included with Windows 2000 Support Tools.

To start ADSI Edit, click Start, point to Programs, point to Windows 2000 Support Tools, point to Tools, and then select ADSI Edit.
Expand the Configuration Container Naming Context of your Active Directory.
Navigate to the following path: Configuration Container | CN=Configuration, CN=Services, CN=Microsoft Exchange, CN=<organization>, CN=Administrative Groups, CN=<Admin Group>, CN=Servers, CN=<server>, CN=InformationStore
Right-click the Information Store object and then select Properties.
Select Both from the Select which properties to view drop-down list.
Select the msExchESEParamCacheSizeMax attribute and adjust the value. Although no value will be present, the default is 230400 (which is 900 MB). The recommended maximum for this value is 307200 (which is 1,200 MB).

Note: Be careful when setting this value, because it is very easy to make a mistake and set the msExchESEParamCacheSizeMin attribute instead.
Click Set after changing the Edit Attribute field for the attribute and then click OK.
Close the ADSI Edit tool by closing the MMC console application.
Wait for Active Directory replication to replicate this new value throughout the forest (this might take some time—using ADSIEdit elsewhere in the organization will show you how replication is proceeding).
Restart the Microsoft Exchange information store service on the Exchange 2000 computer.

Log Buffers

ESE uses a set of log buffers to hold information in memory before writing to the transaction logs. For back-end servers, the default value is too low. This can cause excessive disk I/Os to the transaction log drive. A significant performance improvement will be seen when the server is under load or when users are sending large messages. The default value is 84; this should be increased to 9,000 on all back-end servers.

The process for setting log buffers is very similar to increasing the store-database cache size (as detailed earlier). You will need to use ADSI Edit to navigate to the storage group object and then find the msExchESEParamLogBuffers attribute. The value is an integer and it should be set manually to 9,000.

Tuning Active Directory Integration

When there are numerous computers running Exchange 2000 in a Windows 2000 site, a very large LDAP load can be put on the Active Directory servers. An Active Directory server, by default, is configured to support a maximum of 20 active LDAP queries. If this limit is reached, Active Directory will return the error LDAP_ADMIN_LIMIT_EXCEEDED and will refuse to process further LDAP queries until the active number drops below 20. Twenty is generally sufficient for most Active Directory servers, but it is necessary to increase this value when you are running Exchange 2000 on six- or eight-processor servers, or if the preceding error message is logged.

The maximum LDAP queries can be configured through the MaxActiveQueries attribute. This can be adjusted using the NTDSUTIL.EXE tool. Increasing this setting will use a little more memory in the LSASS.EXE process on the Active Directory server, so do not increase this value any more than is necessary. The following steps show how you would increase MaxActiveQueries to 40:

After opening the command prompt window, type NTDSUTIL.
Type LDAP POLICIES and press Enter.
Type CONNECTIONS and press Enter.
Type CONNECT TO SERVER <domain controller/Global Catalog server name> and press Enter.
Type Q and press Enter.
Type SHOW VALUES and press Enter.
Type SET MAXACTIVEQUERIES TO 40 and press Enter.
Type COMMIT CHANGES and press Enter.
Type SHOW VALUES and press Enter.
Verify that the new setting is shown.
Type Q and press Enter.
Type Q and press Enter.

Note: This setting will be replicated to all Active Directory servers within the forest. You do not have to restart domain controllers or Global Catalog servers for this to take effect.

Under normal circumstances, Exchange 2000 will access Global Catalog servers and the user partition of domain controllers by consulting a dynamically created list of available servers. While this is fine in the majority of circumstances, in some environments you might want to change the behavior to maintain performance. For example, you might have some underspecified domain controllers on your network, and you might wish to prevent these from being used. Or you might have very slow links in your environment and you might want to prevent servers running Exchange elsewhere in the domain from using these Global Catalog servers or domain controllers.

To ensure that particular domain controllers or Global Catalog servers are used to service requests, you can statically configure DSAccess to channel directory service loads to a specified set of directory-service servers.

You should think carefully before deciding to statically define these entries, as you generally run a greater risk of losing directory access entirely. DSAccess does not check to see if the names you specify are valid, so if you spell server names wrong in the registry, you can end up with a loss of service. Also, if static entries have been defined, then Exchange will not check dynamic entries, even if the static ones are invalid.

The following entries can be added to statically define domain controllers and Global Catalog servers:

HKEY_LOCAL_MACHINE \System \CurrentControlSet \Services \MSExchangeDSAccess \Profiles \Default \UserDC1 (UserDC2, and so on)

IsGC = REG_DWORD 0x0

HostName = REG_SZ DC_DomainName.CompanyName.com
PortNumber = REG_DWORD (0x185 by default or 0x27C for SSL)

HKEY_LOCAL_MACHINE \System \CurrentControlSet \Services \MSExchangeDSAccess \Profiles \Default \UserGC1 (UserGC2 and so on)

IsGC = REG_DWORD 0x1

HostName = REG_SZ GC_DomainName.CompanyName.com
PortNumber = REG_DWORD (0xCC4 by default or 0xCC5 for SSL)

Note: domain controller entries are defined independently of Global Catalog server entries, so it is conceivable that a static list would be used to locate Global Catalog servers, whereas a dynamic list would be used to find domain controllers.

Exchange 2000 stores and reads some information in the configuration partition of Active Directory. You might want to define which domain controllers Exchange 2000 should use when accessing the configuration partition. This is not a particularly dangerous setting, because it is a preference, not a requirement. If your favored domain controller is not available, it will switch to another on the list of available domain controllers (chosen by the method indicated previously).

To set your preferred configuration partition domain controller in the registry, alter the following:

HKEY_LOCAL_MACHINE \System \CurrentControlSet \Services \MSExchangeDSAccess \Instance0

ConfigDCHostName = REG_SZ configDC_DomainName.CompanyName.com

ConfigDCPortNumber = REG_DWORD (0x185 by default or 0x27C for SSL)

For more information on how Exchange 2000 selects domain controllers and Global Catalog servers, see the Knowledge Base article 250570 available at:

https://support.microsoft.com/default.aspx?scid=kb;en-us;250570&sd=tech

Tuning Outlook Web Access (OWA)

In environments where many users will use Outlook Web Access as their client, traffic and resources on the Exchange 2000 server—as well as latencies apparent to the user—can all be reduced by increasing the time until expiration of static files from the \Program Files\Exchsrvr\Exchweb\Controls and \Img directories. These static files include all the .gif icons, navigation or tool bars, and windows for display of messages, calendar items, and so forth.

By default, these files are marked as expiring in one day when they are received from the server by the browser; thus, once each day, each user will request and receive these files from the server. Except in cases where static content is modified to include system messages or advertisements, these files will not change except possibly in the event of a future Exchange Service Pack installation. Increasing the time before expiration to a month or a year (or setting the content to not expire) might make more sense. In such a case, these objects might only be expired out of the client's browser cache if the cache became full and older items were deleted. In addition, each time the user accesses Outlook Web Access, the browser will send an "is modified" request for each static file cached; therefore, even if a future Service Pack does modify \Exchweb content, the browser will immediately pick up the changes.

Changing this setting does not involve the registry. Instead, you will need to perform the following steps:

Start the Internet Services Manager. Click Start, point to Programs, point to Administrative Tools, and select Internet Services Manager.
Expand the server icon.

Note: If you are performing this procedure on an Exchange 2000 Server cluster, expand the Exchange Virtual Server instead of the Default Web Site in the following step.
Expand the Default Web Site.
Expand Exchweb, and view Properties on the \Controls and \img directories.
Under the HTTP Headers tab, you can either clear the Enable Content Expiration checkbox or select the Expire after property to a value greater than the one day default.

Note: If you ever need to force Internet Explorer to refresh the entire content of a page, thus ignoring data in the cache, open the URL, then hold down CTRL and press F5. In Netscape Navigator, hold down SHIFT and click Reload.

Hardware Upgrades

Exactly when hardware upgrades are required depends on the results of your capacity planning. If you plan your capacity well, you will be able to predict when hardware upgrades are required, which is particularly important when there are long lead times on hardware. Failing to predict when new hardware will be required can lead to severe availability management problems, for example if a server running Exchange were to fail because you ran out of disk space.

It can be very beneficial to perform hardware upgrades on a regular basis, standardizing on new hardware each time. This allows you to keep your hardware consistent across each Exchange 2000 role and therefore reduce support costs for the new environment. Always ensure that you test Exchange and Windows 2000 thoroughly on the new hardware to ensure that there are no unforeseen anomalies.

In many cases hardware upgrades coincide with software changes or with consolidation of servers. If this is the case, you must ensure that the new hardware is able to cope adequately with the new environment and with any intermediary changes you need to make to get to your final environment. As with all areas, ensure that you plan and document thoroughly any changes that you make to the hardware environment. If you are rolling out hardware changes across the entire organization over a short period, this can be a very labor intensive period for your operations department, so ensure that you plan carefully for the appropriate skills to be assigned.

Having a clustered environment is the best way to perform an upgrade while simultaneously minimizing downtime. Under these circumstances, you can perform a rolling upgrade—manually failing over the system to node A, performing an upgrade on Node B, failing it back over again to Node B, performing an upgrade on Node A, and then returning the system to normal.

You will also need to ensure that there is provision in your SLAs for hardware upgrades. Hardware upgrades often involve periods of scheduled downtime, especially if you do not have clusters everywhere, and you need to allow for this when you define your SLAs. With proper planning, both scheduled and unscheduled downtime can be kept to a bare minimum.

Summary

This chapter has shown you how to manage capacity and availability in your Exchange 2000 environment. You have seen performance-tuning changes that are easy to identify and implement to obtain better performance and you have learned how to examine what areas to consider when implementing hardware changed in your organization.

The Microsoft Operations Framework provides technical guidance and industry best practices that encompasses the complete IT Service Management environment, including capacity management, availability management, configuration management, service monitoring and control, service level management, and their inter-relationships.

For more information on the Microsoft Operations Framework, go to:

https://www.microsoft.com/technet/itsolutions/cits/mo/mof/default.mspx

For prescriptive MOF information on capacity management, availability management, configuration management, service monitoring and control, and service level management, please review the detailed operations guides that can be found at:

https://www.microsoft.com/technet/prodtechnol/windows2000serv/default.mspx

Chapter 2 - Capacity and Availability Management

On This Page

Introduction

Chapter Sections

Capacity Management

Availability Management

Service Hours

Service Availability

Minimizing System Failures

Minimizing System Recovery Time

Performance Tuning

Making Changes to the Registry

No Performance Optimizer

Optimizable Features

Tuning Considerations

Upgrading from Exchange 5.5 to Exchange 2000

Tuning the Message Transfer Agent (MTA)

Tuning SMTP Transport

Store and ESE Tuning

Tuning Active Directory Integration

Tuning Outlook Web Access (OWA)

Hardware Upgrades

Summary

Additional resources

Chapter 2 - Capacity and Availability Management

On This Page

Introduction

Chapter Sections

Capacity Management

Availability Management

Service Hours

Service Availability

Minimizing System Failures

Minimizing System Recovery Time

Performance Tuning

Making Changes to the Registry

No Performance Optimizer

Optimizable Features

Tuning Considerations

Upgrading from Exchange 5.5 to Exchange 2000

Tuning the Message Transfer Agent (MTA)

Tuning SMTP Transport

Store and ESE Tuning

Tuning Active Directory Integration

Tuning Outlook Web Access (OWA)

Hardware Upgrades

Summary

Related Topics

Additional resources