White Paper: Determining the Scalability of Combined Client Access and Hub Transport Server Roles in Exchange 2007

 

Bill Thompson - Technology Architect, IT Operations Excellence Team (ITOE)

June 2009

Summary

This white paper provides information about the scalability of computers running Microsoft Exchange Server 2007 with the Client Access server role and the Hub Transport server role installed on the same server. It also provides planning guidance to help you select the appropriate hardware platform(s) for an Exchange 2007 deployment where you are deploying the Client Access and Hub Transport server roles on the same computer.

Applies To

Microsoft Exchange Server 2007 Service Pack 1 (SP1)

Microsoft Exchange Server 2007 Service Pack 2 (SP2)

Table of Contents

  • Introduction

  • Business Case for Combined Server Roles

  • Project Overview

    • Testing Topology

    • Protocol Configuration

  • Recommendations for Hardware Configuration

    • Test 1 Results and Observations

    • Test 2 Results and Observations

  • Conclusion

  • Additional Information

    • Conducting Your Own Tests

Introduction

One of the most important considerations when you're planning an Exchange 2007 deployment is to determine the configuration and number of Exchange 2007 server roles to deploy. Most Exchange 2007 server roles can coexist, and a common practice in many organizations, both large and small, is to deploy the Client Access and Hub Transport roles on the same server.

To give customers a better understanding of how these two server roles function when they are running on the same server, Microsoft Services and the Exchange Team conducted a series of tests. These tests measured server performance and scalability when these two Exchange 2007 roles were installed on the same physical server. This white paper presents the results of these tests and, based on those results, recommendations for optimal hardware configurations.

Business Case for Combined Server Roles

Exchange 2007 allows for the separation of server roles onto individual computers. However, there are valid reasons why organizations may want to combine roles on a single server. Some of the primary reasons to deploy multiple roles on a single server include:

  • Organizations are under increasing pressure to reduce the number of servers dedicated to Exchange. This pressure is caused by requirements to reduce:

    • The total cost of ownership (TCO)

    • The data center footprint

    • The power consumption in the data center

    • The operational complexity and costs of administering additional servers

    • Per server licensing costs

  • Organizations want to gain redundancy. Many of the fault-tolerant features of the Client Access and Hub Transport roles require multiple computers. Combining server roles lets an organization use fault tolerance without obtaining additional hardware.

  • In branch offices or small office environments, a low user count may not justify the expense of additional hardware.

    Although you can run the Mailbox server role on the same computer that is running the Client Access and Hub Transport roles, we recommend this configuration only for very small deployments. Because this is not a recommended configuration for most deployments, this role combination was not included as part of the scalability testing project.

Project Overview

The major goal of the Exchange role tests was to measure the scalability of a server that is running both the Client Access and Hub Transport roles. The test results can provide you with guidance about:

  • The scalability of servers that are running the Client Access and Hub Transport roles on various hardware configurations.

  • Performance counter recommendations for testing and monitoring servers that are running the Client Access and Hub Transport roles.

    Note

    Although you can run the Mailbox server role on the same computer that is running the Client Access and Hub Transport roles, we don't recommend this configuration unless the deployment is small. Because this is not a recommended configuration for most deployments, this role combination was not included as part of the scalability testing project.

The Exchange role scalability tests included two series of tests:

  • Test 1: Designed to provide basic guidance for the maximum acceptable load for a combined-role server. This is the maximum load that the combined-role server can handle while continuing to deliver acceptable response times for users.

  • Test 2: Analyzed the TCO and hardware footprint of a combined-role server and compared the results to the TCO and hardware footprint of two dedicated, single-role servers. This series of tests was performed by load testing single-role and combined-role servers that used the same user profile mix.

Note

Baseline data for the tests described in this document was based on loads generated by the internal deployment of Exchange Server 2007 at Microsoft. Because the POP3 and IMAP4 protocols don't comprise a significant client base in these baseline loads, these protocols were not included as part of the testing criteria.

Testing Topology

The tests were performed on computers running Windows Server 2003 Enterprise Edition with SP2 and Exchange Server 2007 Enterprise Edition with SP1. Additionally, two 64-bit Windows Server 2003 SP2 global catalog servers were used. Each global catalog server had 2 processor cores and 4 gigabytes (GB) of installed RAM. The test configuration also included four Mailbox role servers. Each of these Mailbox servers had 2 processor cores and 16 GB of RAM.

To closely mimic a real-world environment, Microsoft Forefront Antivirus was installed on each Hub Transport server. Forefront Antivirus was configured to use five antivirus engines, each set to maximum certainty. All Hub Transport roles were configured to have the transport dumpster feature enabled and to use default size and retention times. Approximately 24,000 50-MB mailboxes were initialized, and each mailbox had an average message size of 50 kilobytes (KB). The mailboxes were spread evenly across 32 storage groups.

To simulate the client load, the Exchange Load Generator tool (LoadGen) was used. For more information about LoadGen, see Microsoft Exchange Load Generator.

LoadGen is the preferred tool to use in scalability testing that simulates multiple protocols and that connects to different Exchange 2007 server roles. For the Exchange role tests described in this document, the following protocols were tested:

  • Office Outlook Web Access

  • Outlook Anywhere (also known as RPC over HTTPS)

  • Exchange ActiveSync (Currently, this LoadGen module is not publicly available.)

  • Outlook 2007 in Online Mode and in Cached Exchange Mode

  • SMTP

The tests were run with load generation configured to use the Heavy profile together with SSL connections for each profile.

Each Hub Transport server used a dual-channel Serial Attached SCSI (SAS) RAID controller that was connected to a Direct Attached Storage (DAS) array. The array contained 300-GB 10,000-RPM SAS drives. Storage was configured as follows:

  • For the operating system, two local hard disks were configured in a RAID 1 array (mirrored). Each hard disk drive was a 146-GB small form factor (SFF) SAS drive.

  • For transaction log storage, two DAS hard disks were configured in a RAID 1 array.

  • For database storage, six DAS hard disks were configured in a RAID 0+1 array.

The following features and functionalities were not implemented during the tests:

  • Exchange Web Services load

  • IPsec policies

  • Transport Rules or Transport Journaling rules

  • Microsoft Operations Manager (MOM) or other monitoring agents

Protocol Configuration

All tests used Heavy client profiles for all client protocols. The initial protocol message mixture was based on the Exchange 2007 deployment at Microsoft. This mixture was measured by monitoring loads during peak operational hours to establish a baseline. The messaging mix was as follows:

  • 150 Outlook Web Access users, generating 4.5 requests per second

  • 600 Exchange ActiveSync users, generating 8.5 requests per second

  • 15,000 Web connections from 1,300 Outlook Anywhere users

  • Approximately 12 messages per second for each Hub Transport server
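The baseline mix above implies an average request rate per user for each protocol. The following Python sketch derives those rates from the figures in the list. The helper names (`requests_per_user`, `scale_request_rate`) are hypothetical, and the linear scaling is an illustrative assumption, not something the tests established.

```python
# Baseline protocol mix quoted in the list above.
BASELINE = {
    "Outlook Web Access": {"users": 150, "requests_per_sec": 4.5},
    "Exchange ActiveSync": {"users": 600, "requests_per_sec": 8.5},
}
OUTLOOK_ANYWHERE_USERS = 1300
OUTLOOK_ANYWHERE_WEB_CONNECTIONS = 15000


def requests_per_user(users: int, requests_per_sec: float) -> float:
    """Average requests per second generated by a single user."""
    return requests_per_sec / users


def scale_request_rate(protocol: str, target_users: int) -> float:
    """Scale the baseline request rate to another user count.

    Assumes the load grows linearly with the number of users,
    which is a simplification made for illustration only.
    """
    base = BASELINE[protocol]
    per_user = requests_per_user(base["users"], base["requests_per_sec"])
    return per_user * target_users


if __name__ == "__main__":
    for name, mix in BASELINE.items():
        rate = requests_per_user(mix["users"], mix["requests_per_sec"])
        print(f"{name}: {rate:.4f} requests/sec per user")
    per_user = OUTLOOK_ANYWHERE_WEB_CONNECTIONS / OUTLOOK_ANYWHERE_USERS
    print(f"Outlook Anywhere: {per_user:.1f} Web connections per user")
```

A ratio like this is only a starting point for building a LoadGen profile. The observed request rates in the test results vary with the protocol mix.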

Recommendations for Hardware Configuration

During the testing process, it was observed that the resources that were used by each role (Client Access and Hub Transport) complemented one another well. Typically, the load on a Client Access server is primarily memory intensive, while the load on a Hub Transport server is primarily processor and disk intensive. Therefore, these roles can coexist in many situations.

The best performance on a combined-role server occurred when the server had sufficient memory to service the Client Access components.

Note

Processor and disk resources were mostly used by the Hub Transport processes, such as the Edge Transport component and Forefront Antivirus.

When the client load was increased, the combined-role Client Access and Hub Transport servers were mainly limited by processor resources. It was easy to stress the processor resources to the acceptable limit of 75 percent. However, it was very difficult to increase the client load in such a manner as to consume all available memory resources.

Test 1 Results and Observations

Test 1 examined how well the Client Access and Hub Transport roles coexisted on a single server. These tests demonstrated that the two roles complement one another, especially in smaller environments. Test 1 used the following hardware configurations to test server responsiveness for combined-role Client Access and Hub Transport servers:

  • 2 processor cores and 4 GB of RAM

  • 4 processor cores and 8 GB of RAM

  • 4 processor cores and 16 GB of RAM

  • 8 processor cores and 16 GB of RAM

Test 1 placed the combined-role servers under various client loads to determine how much load each hardware combination could handle while still responding to client requests. The following table summarizes the results of these tests. Observations about each test configuration follow the table.

Test 1 results

Test characteristics                                  2 cores, 4 GB RAM   4 cores, 8 GB RAM   4 cores, 16 GB RAM  8 cores, 16 GB RAM
Total users                                           3,060               6,630               7,750               9,800
Outlook 2007 users (Online and Cached Exchange Mode)  1,500               3,000               3,000               3,000
Outlook Web Access users                              300                 330                 450                 1,500
Outlook Web Access requests per second                8                   9.5                 11.8                39
Exchange ActiveSync users                             560                 1,300               1,300               2,300
Exchange ActiveSync requests per second               1.6                 6                   5.2                 9.5
Outlook Anywhere users                                700                 2,000               3,000               3,000
Processor usage (percent)                             77                  66                  73                  54
RAM usage (MB)                                        3,696               6,792               8,384               8,384
Available RAM (MB)                                    400                 1,400               8,000               8,000
Messages per second                                   14                  24                  24                  28.5
Average message size (KB)                             45                  45                  60                  112
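The Test 1 results can be read as a lookup from a target user count to the smallest tested configuration that carried at least that many users while staying at or below the 75 percent processor limit used in this paper. The following Python sketch encodes the table for that purpose. The names `HardwareResult` and `smallest_tested_config` are hypothetical, and the result is only as general as the tested protocol mix, so treat it as a reading aid rather than a sizing tool.

```python
from typing import NamedTuple, Optional


class HardwareResult(NamedTuple):
    cores: int
    ram_gb: int
    total_users: int
    processor_pct: int


# Values copied from the Test 1 results table.
TEST1_RESULTS = [
    HardwareResult(2, 4, 3060, 77),
    HardwareResult(4, 8, 6630, 66),
    HardwareResult(4, 16, 7750, 73),
    HardwareResult(8, 16, 9800, 54),
]


def smallest_tested_config(
    target_users: int, max_processor_pct: int = 75
) -> Optional[HardwareResult]:
    """Return the smallest tested configuration that carried the target load.

    A configuration qualifies if its tested user count is at least
    target_users and its processor usage did not exceed max_processor_pct.
    Returns None when no tested configuration qualifies.
    """
    candidates = [
        r for r in TEST1_RESULTS
        if r.total_users >= target_users and r.processor_pct <= max_processor_pct
    ]
    return min(candidates, key=lambda r: (r.cores, r.ram_gb), default=None)


if __name__ == "__main__":
    for users in (3000, 7000, 9000, 20000):
        print(users, smallest_tested_config(users))
```

Note that the 2-core configuration never qualifies under the default limit, because its measured processor usage (77 percent) is above 75 percent.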

Test Configuration: 2 processor cores with 4 GB of RAM

A hardware configuration that consists of 2 processor cores together with 4 GB of RAM does not scale well above minimum levels. Although the tests show that this configuration could support approximately 3,000 users, processor and memory resources are quickly taxed.

Note

This configuration was tested under a generated load of 14 messages per second (a combination of 6 SMTP messages per second and 8 MAPI messages per second). We don't recommend this hardware configuration.

The following observations were made for this configuration:

  • Processor fluctuations were very noticeable. Increases in e-mail traffic caused the server processor to spike momentarily, at times to nearly 100 percent. This behavior does not provide any headroom for unexpected events, such as weather conditions that require employees to work remotely. Events such as these could cause unexpectedly high Outlook Web Access loads.

    Note

    The optimal target load for a processor is between 60 percent and 75 percent average processor usage.

  • This is an unlikely hardware configuration for most Exchange 2007 deployments.

  • The lack of available memory did not allow sufficient room to increase a transport mail queue.

    Note

    Each e-mail object that persists in the queue uses a small amount of memory.

Test Configuration: 4 processor cores with 8 GB of RAM

This hardware configuration supports more than twice the user load of a 2-core server without exhausting server resources. It matches the hardware profile of the Client Access and Hub Transport server deployments at Microsoft IT.

The following observations were made for this configuration:

  • Initial tests caused many back pressure events. After this behavior was observed, cache configuration changes were considered. For more information, see the Exchange Team Blog article New maximum database cache size guidance for Exchange 2007 Hub Transport Server role.

  • This configuration was tested under a generated load of 24 messages per second that were processed by the Transport service.

  • This hardware configuration allowed for sufficient room to increase the transport mail queue.

Test Configuration: 4 processor cores with 16 GB of RAM

Although this configuration provides ample memory to increase a transport queue, we don't recommend this hardware configuration. This is because the Client Access and Hub Transport components are unlikely to use available memory before the roles hit processor limitations.

The following observations were made for this configuration:

  • There was ample memory to increase transport queues.

  • The server became processor-bound as the load increased. The server could not use the extra memory before a processor bottleneck occurred.

Test Configuration: 8 processor cores with 16 GB of RAM

The Client Access and Hub Transport server scaled up very well with this hardware configuration. Although the tests did not include MOM agents, EWS, or many other potential consumers of memory, the tests indicated that the server could sustain the same load and still provide adequate queue growth with only 12 GB of RAM.

The data for the 8-core, 16-GB RAM server tests are presented as three separate test results:

  • Test 1a tried to reach a protocol mix that matched the ratios of the initial Microsoft IT client load.

  • Test 1b and Test 1c used a different protocol mix, with more Outlook Web Access and Exchange ActiveSync clients and fewer Outlook clients.

  • Test 1c had the same client load as Test 1b. However, Test 1c had an increased average message size, up from 46 KB to 108 KB. This increase in average message size raised processor use from 70 percent to 90 percent.

The following observations were made for this configuration:

  • There was ample available memory to increase transport queue length.

  • The processor eventually became the performance bottleneck. However, this bottleneck occurred under a much greater concurrent client load.

The following table shows the results of the three tests for this particular hardware configuration:

8-core, 16-GB RAM results

Test characteristics                      Test 1a   Test 1b   Test 1c
Total users                               9,800     7,800     7,800
Outlook 2007 users*                       3,000     600       600
Outlook Web Access users                  1,500     2,100     2,100
Outlook Web Access requests per second    39        60        57
Exchange ActiveSync users                 2,300     3,100     3,100
Exchange ActiveSync requests per second   9.5       13.2      11.2
Outlook Anywhere users                    3,000     2,000     2,000
Processor load (percent)                  54        70        90
Used memory (MB)                          8,384     8,384     8,384
Available memory (MB)                     8,000     8,000     8,000
Messages per second                       28.5      42        38
Average message size (KB)                 112       46        108

* A mix of Online Mode users and Cached Exchange Mode users.
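Tests 1b and 1c used the same client load, so the effect of message size can be seen by comparing the payload that the Transport service handled per second. The following Python sketch multiplies messages per second by average message size for both tests. The function name `payload_kb_per_sec` is hypothetical, and the product is a rough indicator only, because it ignores per-message overhead.

```python
# Figures copied from the 8-core, 16-GB RAM results table.
RESULTS = {
    "Test 1b": {"messages_per_sec": 42, "avg_message_kb": 46, "processor_pct": 70},
    "Test 1c": {"messages_per_sec": 38, "avg_message_kb": 108, "processor_pct": 90},
}


def payload_kb_per_sec(messages_per_sec: float, avg_message_kb: float) -> float:
    """Approximate message payload processed per second, in KB."""
    return messages_per_sec * avg_message_kb


if __name__ == "__main__":
    for name, r in RESULTS.items():
        payload = payload_kb_per_sec(r["messages_per_sec"], r["avg_message_kb"])
        print(f"{name}: {payload:,.0f} KB/sec at {r['processor_pct']} percent processor load")
```

With these inputs, Test 1c moves roughly twice the payload of Test 1b even though it processes fewer messages per second, which is consistent with the higher processor load reported for Test 1c.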

The following figures illustrate the user count and load on the various hardware configurations.

Figure: User count on various hardware configurations

Figure: Load on various hardware configurations

Test 2 Results and Observations

Test 2 tried to reduce total cost of ownership (TCO) by loading exactly the same user profile mix from one Client Access server and one Hub Transport server onto a single combined-role server.

The first part of the testing process was designed to determine an acceptable maximum load that the servers could handle adequately. To do this, LoadGen was used to apply varying loads on the two single-role servers. After the acceptable maximum load was determined, the server hardware characteristics were doubled and configured as a single combined-role server. The same LoadGen load simulation was used to test the combined-role server.

Note

As mentioned earlier, the maximum load for the Client Access server role is constrained primarily by RAM. The Hub Transport role is constrained primarily by processor resources and hard disk drive resources. Because the load on the hard disk drives would not be affected by combining the Client Access and Hub Transport roles, testing was focused on memory and processor usage.

The following three tests were performed:

  • Test 2a - Load test the following configurations under maximum load:

    • Two 2-core servers with 4 GB of installed RAM and a single Exchange 2007 server role installed on each server.

    • One 4-core server with 8 GB of installed RAM and the Exchange 2007 Client Access and Hub Transport server roles installed.

  • Test 2b - Load test the following configurations under maximum load:

    • Two 4-core servers with 8 GB of installed RAM and a single Exchange 2007 server role installed on each server.

    • One 8-core server with 16 GB of installed RAM and the Exchange 2007 Client Access and Hub Transport server roles installed.

  • Test 2c - Load test the following configurations under maximum load:

    • One 8-core server with 12 GB of installed RAM and the Exchange 2007 Client Access and Hub Transport server roles installed.

      Note

      The goal of this test was to obtain an even greater reduction in the TCO by reducing the amount of installed RAM on the combined-role server.

Test 2a

Two single-role servers with 2 cores and 4 GB of RAM were placed under the maximum acceptable load. Then, that same load was applied to a single combined-role server with 4 cores and 8 GB of RAM.

With the simulated user load applied, the single-role Client Access server used 3,859 MB of RAM. The Hub Transport server used 2,294 MB of RAM. By adding these RAM usage numbers, it was assumed that the total amount of RAM that would be used by both roles on the combined-role server would be approximately 6,153 MB. However, the actual RAM usage observed during the test was only 5,069 MB. After the load was applied to the server with 8 GB of RAM installed, 3,123 MB of RAM was still available. This difference occurs because of how the operating system manages memory and related processes.
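The memory arithmetic for Test 2a can be reproduced with a few lines of Python. This is a minimal sketch that uses only the figures quoted above. The function names are hypothetical, and the conversion of 1 GB to 1,024 MB is an assumption that happens to match the reported available RAM.

```python
MB_PER_GB = 1024  # assumption: installed RAM is counted in binary units


def predicted_combined_ram_mb(cas_used_mb: int, hub_used_mb: int) -> int:
    """Naive prediction: sum of the RAM used by each single-role server."""
    return cas_used_mb + hub_used_mb


def available_ram_mb(installed_gb: int, used_mb: int) -> int:
    """RAM left on a server after the observed usage is subtracted."""
    return installed_gb * MB_PER_GB - used_mb


if __name__ == "__main__":
    predicted = predicted_combined_ram_mb(3859, 2294)
    observed = 5069
    print(f"Predicted combined usage: {predicted:,} MB")
    print(f"Observed combined usage:  {observed:,} MB")
    print(f"Difference:               {predicted - observed:,} MB")
    print(f"Available on 8-GB server: {available_ram_mb(8, observed):,} MB")
```

The difference between the predicted and observed values is the amount by which the simple sum overestimated the memory needed on the combined-role server in this test.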

The following graph shows the memory usage during Test 2a.

Figure: Test 2a memory usage graph

The stand-alone test servers were two 2-core servers. The combined-role server used a single 4-core processor. During the test, the Client Access server reached 36 percent processor usage while the Hub Transport server reached 71 percent processor usage. Adding the results gives 107 percent. Dividing this value by two, because the combined-role server has twice as many cores, gives an assumed combined processor usage of approximately 53 percent. The actual observed processor usage was 53 percent. The following chart shows the processor usage during Test 2a.

Figure: Test 2a processor usage graph

The following table contains detailed information about the resources that were used during the test and about the protocol ratio. In this table, the Client Access server column does not contain Hub Transport-related values, and the Hub Transport column does not contain Client Access-related values.

Two 2-core, 4-GB server results versus one 4-core, 8-GB server results (Test 2a)

Test characteristics                      Single-role Client Access server   Single-role Hub Transport server   Total for single-role servers   Total for combined-role server
Total users                               11,569                             Not applicable                     11,569                          11,696
Outlook 2007 Cached Exchange Mode users   4,200                              Not applicable                     4,200                           4,200
Outlook 2007 Online Mode users            630                                Not applicable                     630                             630
Total Outlook users                       4,830                              Not applicable                     4,830                           4,830
Outlook Web Access users                  1,619                              Not applicable                     1,619                           1,833
Outlook Web Access requests per second    42.05                              Not applicable                     42.05                           46.085
Exchange ActiveSync users                 3,230                              Not applicable                     3,230                           3,143
Exchange ActiveSync requests per second   10.9                               Not applicable                     10.9                            12.02
Outlook Anywhere users                    1,890                              Not applicable                     1,890                           1,890
SMTP messages per second                  Not applicable                     10.69                              10.69                           9.8
Total messages per second                 Not applicable                     23.735                             23.735                          24
Average message size (bytes)              Not applicable                     82,059                             82,059                          82,059
Disk I/O per message                      Not applicable                     14.94                              10.69                           14.10
Processor usage (percent)                 36                                 71                                 106                             52
Unused RAM (MB)                           237                                1,802                              2,039                           3,123
Used RAM (MB)                             3,859                              2,294                              6,153                           5,069

Test 2b

Two single-role servers with 4 cores and 8 GB of RAM were placed under the maximum acceptable load. The same load was then applied to a single combined-role server with 8 cores and 16 GB of RAM.

The single-role Client Access server used 7,738 MB of RAM. The Hub Transport server used 2,718 MB of RAM. Increased message traffic rates did not affect the RAM that was used by the Hub Transport server, even though the server had more installed RAM. By adding the RAM usage numbers together, it was assumed that the total amount of RAM that was used by both roles would be approximately 10,456 MB. As with Test 2a, actual memory usage was less. Only approximately 8.9 GB (8,939 MB) of RAM was used, which left 7,445 MB of RAM available. As stated earlier, the difference between the expected RAM usage and the actual RAM usage occurred because of the operating system and related processes.

The following graph illustrates the memory usage during Test 2b.

Figure: Test 2b memory usage graph

For Test 2b, the stand-alone servers were two 4-core servers. The Client Access server was measured at 44 percent processor usage and the Hub Transport server was measured at 76 percent processor usage. By combining these totals (120 percent) and then dividing by two, it was assumed that a combined-role processor usage of 60 percent would be observed. The actual observed processor usage was 62 percent. The following charts show the processor usage of the single-role and combined-role servers.

Figure: Test 2b processor usage graph
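The processor estimates for Test 2a and Test 2b follow the same rule: add the processor usage of the two single-role servers and divide by the ratio of combined-server cores to single-server cores, which is two in both tests. The following Python sketch applies that rule to the percentages quoted in the text. The function name and the `core_ratio` parameter are hypothetical, and the rule assumes that load spreads evenly across cores.

```python
def predicted_combined_cpu_pct(
    cas_pct: float, hub_pct: float, core_ratio: float = 2.0
) -> float:
    """Estimate combined-role processor usage from two single-role servers.

    core_ratio is the number of cores in the combined-role server divided
    by the number of cores in each single-role server.
    """
    return (cas_pct + hub_pct) / core_ratio


# (Client Access percent, Hub Transport percent, observed combined percent)
TESTS = {
    "Test 2a": (36, 71, 53),
    "Test 2b": (44, 76, 62),
}

if __name__ == "__main__":
    for name, (cas, hub, observed) in TESTS.items():
        predicted = predicted_combined_cpu_pct(cas, hub)
        print(f"{name}: predicted {predicted:.1f} percent, observed {observed} percent")
```

For Test 2a the unrounded estimate is 53.5 percent, which the text reports as 53 percent. For Test 2b the estimate is 60 percent, two points below the observed value.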

The following table contains detailed information about the resources that were used during the test and about the protocol ratio. In this table, the Client Access server column does not contain Hub Transport-related values, and the Hub Transport column does not contain Client Access-related values.

Two 4-core, 8-GB servers versus one 8-core, 16-GB server results (Test 2b)

Test characteristics                      Single-role Client Access server   Single-role Hub Transport server   Total for single-role servers   Total for combined-role server
Total users                               16,389                             Not applicable                     16,389                          17,167
Outlook 2007 Cached Exchange Mode users   4,200                              Not applicable                     4,200                           4,200
Outlook 2007 Online Mode users            630                                Not applicable                     630                             630
Total Outlook users                       4,830                              Not applicable                     4,830                           4,830
Outlook Web Access users                  2,606.6                            Not applicable                     2,606.6                         3,401
Outlook Web Access requests per second    68.652                             Not applicable                     68.652                          88.919
Exchange ActiveSync users                 3,542.6                            Not applicable                     3,542.6                         3,399
Exchange ActiveSync requests per second   22.96                              Not applicable                     22.96                           20.815
Outlook Anywhere users                    5,537                              Not applicable                     5,537                           5,537
Web connections                           48,512                             Not applicable                     48,512                          42,410
SMTP messages per second                  Not applicable                     27.68                              27.68                           25.264
Total messages per second                 Not applicable                     46.824                             46.824                          41
Average message size (bytes)              Not applicable                     82,059                             82,059                          82,060
Processor usage (percent)                 44                                 76                                 120                             62
Unused RAM (MB)                           454                                5,474                              5,928                           7,445
Used RAM (MB)                             7,738                              2,718                              10,456                          8,939

Test 2c

Test 2c tried to determine whether a server with 8 processor cores and 12 GB of RAM could handle the load from the combined-role server in Test 2b (8 cores, 16 GB of RAM). If this worked, the test group could further reduce the TCO. The results from Test 2b showed that there is a large amount of unused RAM. Installing only 12 GB of RAM reduces hardware costs, while still allowing a high user count.

Test 2c used the same user and protocol load from the Test 2b combined-role server. By having the same processor capabilities, the 8-core, 12-GB server successfully serviced the same user load as the 8-core, 16-GB combined-role server. Tests indicated that the amount of available RAM still allowed for additional clients, as well as room for the growth of a transport queue. To further stress the 8-core, 12-GB RAM server and to determine whether it could support an even greater load, additional tests were performed. These tests increased the user count and generated a higher number of requests per second. The following table shows the server load data using the same load characteristics as Test 2b and using an increased load.

8-core, 12-GB load results (Test 2c)

Test characteristics                      Test 2b client and protocol load   Increased client and protocol load
Total users                               16,848                             18,269
Outlook 2007 Cached Exchange Mode users   4,200                              2,000
Outlook 2007 Online Mode users            630                                700
Total Outlook users                       4,830                              2,700
Outlook Web Access users                  3,049                              4,005
Outlook Web Access requests per second    77.7                               101.6
Exchange ActiveSync users                 3,432                              4,246
Exchange ActiveSync requests per second   20.08                              24.809
Outlook Anywhere users                    5,537                              7,317
Processor usage (percent)                 58                                 70
Unused RAM (MB)                           2,367                              1,715
Used RAM (MB)                             9,921                              10,573
SMTP messages per second                  24.885                             26.693
Total messages per second                 41                                 42
Average message size (bytes)              82,060                             82,060
Web connections                           53,672                             58,126
Disk I/O usage per message                16.00                              24.82
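The Test 2c figures show how much memory headroom remained on the 12-GB server under each load. The following Python sketch expresses the unused RAM as a share of installed RAM. The function name `ram_headroom` is hypothetical, and the conversion of 1 GB to 1,024 MB is an assumption that is consistent with the used and unused values in the table.

```python
from typing import Tuple

MB_PER_GB = 1024  # assumption: installed RAM is counted in binary units


def ram_headroom(installed_gb: int, used_mb: int) -> Tuple[int, float]:
    """Return unused RAM in MB and as a percentage of installed RAM."""
    installed_mb = installed_gb * MB_PER_GB
    unused_mb = installed_mb - used_mb
    return unused_mb, round(unused_mb / installed_mb * 100, 1)


if __name__ == "__main__":
    # Used RAM values copied from the Test 2c table.
    loads = {"Test 2b load": 9921, "Increased load": 10573}
    for name, used in loads.items():
        unused_mb, unused_pct = ram_headroom(12, used)
        print(f"{name}: {unused_mb:,} MB unused ({unused_pct} percent of installed RAM)")
```

Whether that share leaves enough room for transport queue growth depends on the deployment; the tests in this paper did not include monitoring agents or Exchange Web Services load.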

Conclusion

We found that the Client Access server and Hub Transport server roles can be combined and supported in most environments. Based on the test results, we recommend the following hardware configuration for servers that are running both the Client Access and Hub Transport roles:

  • 8 processor cores

  • 12 GB of RAM

The tests showed that a server with 8 processor cores and 12 GB of RAM provides an optimal balance of server costs and scalability. This configuration allows a combined-role server to handle significantly more load than a 4-core server if the number of clients increases and if the clients access processor resources heavily. Clients such as Outlook Web Access and Exchange ActiveSync consume more of the processor resources of a Client Access server than do clients such as Outlook. Increasing processor resources and memory on a Hub Transport server allowed the server to handle an increased average message size. There was a direct correlation between an increase in average message size and greater processor use. Having more available memory also allowed the server to better handle transient events, such as transport queue growth.

Key considerations for supporting the two roles on a single Exchange server are:

  • All deployments should take advantage of the transport database maximum cache size recommendations. This helps avoid back pressure events. For more information, see the Exchange Team Blog article New maximum database cache size guidance for Exchange 2007 Hub Transport Server role.

  • Consider carefully both RAM and processor requirements when you're planning server hardware. Be sure you understand the following areas of resource constraint:

    • The Hub Transport role is primarily processor and disk intensive.

    • The Client Access server role is primarily memory intensive.

    • The combined Client Access and Hub Transport server is primarily processor constrained.

  • A single 4-core, 8-GB RAM, combined-role server can service the same load as two 2-core, 4-GB RAM, single-role servers.

  • A single 8-core, 16-GB RAM, combined-role server can service the same load as two 4-core, 8-GB RAM, single-role servers.

  • Based on these tests, we don't recommend deploying servers with 2 processor cores or 4 GB or less of RAM for use as combined-role servers.

Additional Information

Conducting Your Own Tests

If you would like to perform your own tests, we recommend that you prepare a test plan and use the performance counters listed below to measure various loads.

To develop a test plan that is appropriate for your Microsoft Exchange Server 2007 lab environment, you can start with this Exchange Server 2007 Test Plan Template.

Recommended Counters for Performance Testing

  • ASP.NET

    • Requests Queued

  • Memory

    • Available MBytes

    • Pages/sec

  • MSExchange ActiveSync

    • Average Ping Time

    • Average Request Time

    • Ping Commands Pending

    • Ping Commands/sec

    • Requests Total

    • Requests/sec

  • MSExchange Availability Service

    • Availability Requests (sec)

  • MSExchange Database(EdgeTransport)

    • Database Cache % Available

    • Database Cache Size (MB)

    • Database Cache Size Max

    • Database Cache Size Min

  • MSExchange OWA

    • Average Response Time

    • Current Unique Users

    • Current Users

    • Failed Requests/sec

    • Logons/sec

    • Requests Failed

    • Requests/sec

  • MSExchangeTransport Queues

    • Items Completed Delivery Per Second

    • Items Completed Delivery Total

  • MSExchangeTransport SmtpReceive

    • Average bytes/message

    • Messages Received Total

    • Messages Received/sec

  • Processor

    • % Processor Time

  • Web Service

    • Bytes Total/sec

    • Current Connections
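One way to capture the counters listed above is to write them to a counter file and pass that file to the Windows `typeperf` command-line tool. The following Python sketch builds the counter paths and the command line but does not run anything. The object and counter names come from the list above. The `(_Total)` instance suffixes, the file names, and the default sampling values are assumptions for illustration, and some objects may need an instance name on your servers. You can list the counters that are actually installed by running `typeperf -q`.

```python
from typing import Dict, List

# Counter objects and counter names as listed in this section.
# The "(_Total)" instance suffixes are assumptions added for illustration.
RECOMMENDED_COUNTERS: Dict[str, List[str]] = {
    "ASP.NET": ["Requests Queued"],
    "Memory": ["Available MBytes", "Pages/sec"],
    "MSExchange ActiveSync": [
        "Average Ping Time", "Average Request Time", "Ping Commands Pending",
        "Ping Commands/sec", "Requests Total", "Requests/sec",
    ],
    "MSExchange Availability Service": ["Availability Requests (sec)"],
    "MSExchange Database(EdgeTransport)": [
        "Database Cache % Available", "Database Cache Size (MB)",
        "Database Cache Size Max", "Database Cache Size Min",
    ],
    "MSExchange OWA": [
        "Average Response Time", "Current Unique Users", "Current Users",
        "Failed Requests/sec", "Logons/sec", "Requests Failed", "Requests/sec",
    ],
    "MSExchangeTransport Queues": [
        "Items Completed Delivery Per Second", "Items Completed Delivery Total",
    ],
    "MSExchangeTransport SmtpReceive": [
        "Average bytes/message", "Messages Received Total", "Messages Received/sec",
    ],
    "Processor(_Total)": ["% Processor Time"],
    "Web Service(_Total)": ["Bytes Total/sec", "Current Connections"],
}


def counter_paths(counters: Dict[str, List[str]]) -> List[str]:
    """Build counter paths in the form \\Object\\Counter, one per counter."""
    return [f"\\{obj}\\{name}" for obj, names in counters.items() for name in names]


def typeperf_command(
    counter_file: str, output_csv: str, interval_sec: int = 15, samples: int = 240
) -> List[str]:
    """Build a typeperf command line that samples the counters in counter_file."""
    return [
        "typeperf", "-cf", counter_file,
        "-si", str(interval_sec), "-sc", str(samples),
        "-f", "CSV", "-o", output_csv,
    ]


if __name__ == "__main__":
    # Print the counter file contents and the command instead of running it.
    print("\n".join(counter_paths(RECOMMENDED_COUNTERS)))
    print(" ".join(typeperf_command("counters.txt", "cas_hub_baseline.csv")))
```

With the default values, the command collects one sample every 15 seconds for one hour. Adjust the interval and sample count to cover the peak period that you want to measure.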