ISA Server supports a range of deployment and application scenarios. The following sections describe the major scenarios and their performance characteristics.
Deployment scenarios refer to the location of an ISA Server computer within a corporate intranet. Due to security and performance considerations, several popular scenarios have evolved over the years, and the following sections describe each from a performance and capacity perspective.
Internet Edge Firewall
Organizations with enterprise-scale capacity requirements may consider deploying an ISA Server computer as a dedicated Internet edge firewall acting as the secure gateway to the Internet for all corporate clients. To maintain high throughput levels of hundreds of Mbps between the internal networks and the Internet, ISA Server can be configured to provide packet level and stateful transport layer filtering only.
The more advanced application level filtering that ISA Server provides is then enabled on the second layer of defense, which is composed of back-end firewall ISA Server computers.
Departmental or Back-End Firewall
The next line of defense for enterprise-scale organizations includes several ISA Server computers that are deployed as departmental or back-end network firewalls that provide secure inbound and outbound access control into and out of protected LANs. Organizations with existing firewall infrastructures may keep their current high-performance firewalls at the Internet edge and offload sophisticated application layer filtering to ISA Server computers at the LAN edges. This would allow an organization to utilize current high-speed Internet connections while benefiting from the unique level of protection provided by ISA Server 2006 application layer filtering capabilities.
From a performance perspective, a departmental firewall is required to sustain only a portion of the total traffic going through the edge firewall, allowing for more resource-consuming security features to be running, such as application filters.
Branch Office Firewall
ISA Server can be used to securely connect branch office networks to a main office using site-to-site virtual private network (VPN) connections. In this deployment, ISA Server is placed at a branch office where it acts both as a firewall protecting the branch office network and as a VPN gateway connecting the branch office network to the main office network.
In general, a transport level filtered site-to-site VPN consumes only 25 percent of the processing power per unit of traffic that is required for application level filtered Internet access.
Note: In a transport level filtered site-to-site VPN, the traffic going through the tunnel is not inspected by application level filters. Application level filtering for site-to-site VPN traffic, like any other traffic, is enabled on a per-protocol basis.
Most traffic on the Internet and inside today’s corporate networks uses HTTP. An analysis of traffic patterns of many protocols indicates that HTTP is demanding in terms of network performance. Therefore, typical Web traffic workload simulations are realistic for measuring any firewall’s capacity and performance characteristics.
Note: One typical metric to validate network performance is the number of transactions that are exchanged per TCP connection. Typical values for HTTP (3 to 5 on average) are low compared to other protocols.
The following table summarizes the hardware recommendations for supporting HTTP traffic on three typical single-computer deployments according to Internet link bandwidth.
| Internet link bandwidth | Up to 5 T1 (7.5 Mbps) | Up to 25 Mbps | Up to T3 (45 Mbps) | Up to 90 Mbps |
|---|---|---|---|---|
| Processors/Cores | 1 | 1 | 2 | 2/2 |
| Processor type | Pentium III 550 MHz (or higher) | Pentium 4 2.0–3.0 GHz | Xeon 2.0–3.0 GHz | Xeon Dual Core / AMD Dual Core, 2.0–3.0 GHz |
| Memory | 256 MB | 512 MB | 1 GB | 2 GB |
| Disk space | 150 MB | 2.5 GB | 5 GB | 10 GB |
| Network interface | 10/100 Mbps | 10/100 Mbps | 100/1000 Mbps | 100/1000 Mbps |
The requirements in the preceding table are for default ISA Server 2006 installation settings, and a policy configuration containing hundreds of rules. This includes all default application and Web filtering as well as MSDE logging. The following applies to the preceding table:
-
Internet link bandwidth. The bandwidth figures apply to a demanding workload where ISA Server 2006 is utilized as a transparent Web proxy with full HTTP application layer filtering. Serving as a forward or reverse Web proxy, ISA Server may double the throughput, meaning that the minimum recommended computer for T3 bandwidth is a single Pentium 4 processor, and a dual processor computer for two T3 connections. For details about performance differences between various Web proxy scenarios, see Proxy Scenarios in this document.
In deployments requiring only stateful filtering (no need for higher application level filtering), the recommended hardware reaches LAN wire speeds. For details, see Stateful Filtering in this document.
With Web caching enabled, it is possible to lower the Internet link bandwidth by 20 to 30 percent depending on the byte hit ratio. For details, see Web Caching in this document.
-
Processors. The figures were obtained by simulating HTTP traffic on thousands of IP addresses, loading an ISA Server processor to 70 to 80 percent utilization.
-
Processor type. Other processors emulating the IA-32 instruction set that have comparable power may also be considered.
-
Memory. The memory requirements do not take into account memory space for Web caching. For information about additional memory for Web caching, see Web Caching in this document.
-
Disk space. The disk space requirements indicate the amount of free disk space that is recommended for ISA Server logs. For planning disk space requirements for Web caching, see Web Caching in this document.
-
Network interface. The network interface requirements are for the internal networks (those not connected to the Internet).
ISA Server secures HTTP traffic using its built-in Web Proxy Filter. This application filter supports three different scenarios: forward proxy and transparent proxy for protecting outbound access to the Internet for corporate users, and reverse proxy for protecting inbound access of Internet users to internal Web sites. The next sections describe each of these scenarios from a performance perspective and explain how caching can be used to improve performance.
Proxy Scenarios
This section provides scenarios for forward proxy, transparent proxy, and reverse proxy.
Forward Proxy
In forward proxy, client Web browsers are aware of the presence of the proxy. In Microsoft Internet Explorer®, for example, this is done by setting Use a proxy server or Automatically detect settings in Internet Options. When Web clients are aware of the proxy, they open connections directly to the proxy, and send the proxy requests for locations on the Internet. (For example, Internet Explorer will open two connections to the proxy when sending HTTP 1.1 requests.) When ISA Server receives a request for a server, it opens a connection to this server, and reuses it for other requests coming from other clients to the same server. This leads to a star connection topology.
The performance advantage of this scenario is that it allows for high reuse of connections, which minimizes the number of open connections as well as the connection rate.
Transparent Proxy
In transparent proxy, client Web browsers are unaware of the proxy’s presence. They sense that they are routed directly to servers on the Internet with no agent in between. Specifically, Web clients access Internet servers directly by opening connections with the target Web sites. This leads to a considerable increase in connection rate, because after a user asks for a page on a new server, the Web browser shuts down its connections with the current Web server and opens new connections with the new Web server. This is typical of transparent proxy and has an effect on ISA Server performance. Typically, the client-side connection rate in transparent proxy is approximately three times higher than in forward proxy, which consumes approximately twice as many processor cycles per request.
Transparent proxy is a popular scenario because it is easy to deploy, especially for Internet service providers (ISPs) that have a heterogeneous client base. For this reason, there are considerable performance improvements in this scenario.
In general, ISA Server requires twice the amount of CPU resources for transparent proxy as compared to forward proxy.
Reverse Proxy
Reverse proxy or Web publishing works in the same manner as forward proxy, but the direction is inbound instead of outbound. In this scenario, ISA Server acts as a Web site accessed by clients on the Internet. The clients do not know that the Web site they are accessing is actually a proxy. As with forward proxy, the number of connections and connection rate are minimal, due to efficient connection reuse. Reverse proxy is used for secure publishing of Web servers, such as Microsoft Internet Information Services (IIS), Microsoft Office Outlook® Web Access 2003, Microsoft Office SharePoint® Portal Server, and many more.
From a performance perspective, reverse proxy has characteristics similar to forward proxy. The main difference is that the major amount of traffic flows from ISA Server to Internet users, requiring a large Internet connection. As explained in the next section, forward proxy and reverse proxy have different performance impacts when Web caching is enabled.
Web Caching
Web caching is a feature for improving the performance of ISA Server in all Web proxy scenarios. But the performance improvement impact is different when enabling the cache for the outbound scenarios (forward and transparent proxy) and the inbound reverse proxy scenario.
The main difference between forward (transparent) and reverse caching is the purpose of the cache. Forward (and transparent) caching is intended to save Internet bandwidth costs and to reduce response time by placing popular cacheable content near users. Reverse caching is used for offloading the back-end Web servers. Reverse caching has no effect on response time, and will even increase latency for objects that are not cached.
In terms of savings, forward caching saves access attempts to Web servers on the Internet by serving those attempts from the cache, thus saving on required Internet link bandwidth. For example, if the cache byte hit ratio is 20 percent and peak throughput on the internal links is 10 Mbps, the peak throughput on the Internet link would be only 8 Mbps.
Note: Cache object hit ratio is the proportion of objects that are served from the cache out of the total objects that are served by the proxy. Likewise, cache byte hit ratio is the proportion of bytes that are served from the cache out of the total bytes that the proxy serves. Common average values are approximately 35 percent object hit ratio and approximately 20 percent byte hit ratio.
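The bandwidth saving in the preceding example can be sketched in a few lines of Python. This is an illustration only: the function name and its parameters are hypothetical and are not part of ISA Server. It assumes that every byte served from the cache is a byte that does not cross the Internet link.

```python
def internet_link_peak_mbps(internal_peak_mbps: float, byte_hit_ratio: float) -> float:
    """Estimate peak Internet link throughput when cache hits are served locally.

    byte_hit_ratio is a fraction between 0 and 1 (for example, 0.20 for 20 percent).
    """
    if not 0.0 <= byte_hit_ratio <= 1.0:
        raise ValueError("byte_hit_ratio must be between 0 and 1")
    # Only the bytes that miss the cache have to be fetched over the Internet link.
    return internal_peak_mbps * (1.0 - byte_hit_ratio)


# Example from the text: 10 Mbps on the internal links, 20 percent byte hit ratio.
print(f"{internet_link_peak_mbps(10.0, 0.20):.1f} Mbps")
```

Note that the byte hit ratio, not the object hit ratio, drives the saving, because link bandwidth is consumed per byte rather than per object.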
Reverse caching helps in consolidation of Web servers, reducing both hardware and management costs. For example, if 80 percent of a Web site’s data is static and cacheable, and a dynamic object requires four times more CPU cycles as compared to a static object, utilizing a reverse proxy will reduce the number of Web servers by 50 percent.
Note: Suppose a static object requires X CPU cycles, and a dynamic object requires 4X cycles. If 80 out of 100 requests are static, the total number of cycles required for 100 requests is 80X + (100 − 80) × 4X = 160X. The 80X cycles used for static content, 50 percent of the total, are served by the ISA Server cache instead of the Web servers.
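The same arithmetic can be expressed as a small Python sketch. The function name and parameters are hypothetical, and the sketch assumes, as the example does, that all static content is cacheable and that the cost of a static object is the unit of measure.

```python
def backend_cycles_offloaded(static_share: float, dynamic_cost_factor: float) -> float:
    """Fraction of back-end CPU cycles saved when all static requests hit the cache.

    static_share: fraction of requests that are static and cacheable (0 to 1).
    dynamic_cost_factor: cost of a dynamic object relative to a static one.
    """
    static_cycles = static_share * 1.0                      # static object costs X
    dynamic_cycles = (1.0 - static_share) * dynamic_cost_factor  # dynamic costs factor * X
    return static_cycles / (static_cycles + dynamic_cycles)


# Example from the text: 80 percent static, dynamic objects cost four times more.
print(f"{backend_cycles_offloaded(0.80, 4.0):.0%} of back-end cycles offloaded")
```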
Another difference between forward cache and reverse cache is the magnitude of the cached working set. In reverse cache, the size of the client working set is unlimited, but the server working set contains only several Web sites and a relatively small number of objects. In most cases, ISA Server can be designed with reasonable memory and disk space to store all the hosted cacheable content in its cache, so that only dynamic uncacheable content is directed to the hosted Web servers. Preferably, all cache can be kept and served in memory.
In forward cache, the server space contains a limitless number of Web sites and Web objects, so the cache working set is limitless. To hold such a large working set, you must define large disk caches. The next sections describe how to plan and tune Web cache capacity for forward and reverse caching.
Tuning Forward Cache Memory and Disks
In forward caching, object hit ratio and peak HTTP request rate are used to determine the number of necessary disks according to the following formula:
Number_of_Disks = (Peak_request_rate × Object_hit_ratio) / 100
For example, if peak request rate is 900 requests per second and object hit ratio is 35 percent, the formula gives 900 × 0.35 / 100 = 3.15, which rounds up to four disks.
Note: The number 100 in the preceding formula is empirical and means that the average performing physical disk (spinning at up to 10,000 revolutions per minute) can serve 100 I/O operations per second. A faster disk spinning at 15,000 revolutions per minute can do 130 to 140 I/O operations per second.
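As a minimal sketch, the disk formula can be written in Python as follows. The function name and parameter names are hypothetical; the default of 100 I/O operations per second is the empirical figure from the preceding note, and rounding up to a whole disk is assumed.

```python
import math


def forward_cache_disks(peak_requests_per_sec: float,
                        object_hit_ratio_percent: float,
                        disk_iops: float = 100) -> int:
    """Number of cache disks needed so that cache hits do not saturate disk I/O."""
    # Requests per second that are expected to be served from the disk cache.
    cache_hits_per_sec = peak_requests_per_sec * object_hit_ratio_percent / 100
    # Each disk is assumed to sustain disk_iops operations per second; round up.
    return math.ceil(cache_hits_per_sec / disk_iops)


print(forward_cache_disks(900, 35))                 # 10,000 rpm disks
print(forward_cache_disks(900, 35, disk_iops=130))  # 15,000 rpm disks
```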
We recommend using dedicated disks of the same type and of equal capacity. If a RAID storage subsystem is used, it should be configured as RAID 0 (no fault tolerance). Small disks, preferably no more than 40 GB, are recommended.
Tuning cache memory is more complicated. In cache scenarios, memory is used for:
-
Pending request objects. The number of pending request objects is proportional to the number of client connections to the ISA Server computer. In most cases, it will be less than 50 percent of client connections. Each pending request requires approximately 15 KB. For 10,000 simultaneous connections, the Web proxy memory working set has no more than 50% × 10,000 × 15 KB = 75 MB allocated for pending request objects. However, in an RPC over HTTP or HTTPS publishing scenario, all connections have a pending request object. Following the previous example, a total of 100% × 10,000 × 15 KB = 150 MB is allocated for pending request objects.
-
Cache directory. The directory contains a 48-byte entry for each cached object. The size of the cache directory is directly determined by the size of the cache and the average response size. For example, a 50-GB cache holding 7,000,000 objects (approximately 7 KB each on average) requires 48 × 7,000,000 bytes = 336 MB.
-
Memory caching. The purpose of memory caching is to serve requests for popular cached objects directly from memory, reducing disk cache fetches. But because cacheable content is unlimited in forward caching, the memory cache size has a limited effect on performance.
By default, the memory cache is 10 percent of total physical memory, and is configurable. In general, we recommend using the default setting unless hard page faults occur. Hard page faults cause severe performance degradation. The easiest way to fix this situation when using caching is to lower the size of the memory cache.
Considering this information, use the following process for tuning cache memory size:
-
Tune disk cache size, as explained in the preceding section.
-
Estimate required memory as the total of:
-
Pending request objects (50% × 15 KB × peak-established-connections).
-
Cache directory size (48 × URLs-in-cache).
-
Memory cache size (by default, 10 percent of total memory).
-
System memory requires approximately 50 MB plus 2 KB per connection (50 MB + 2 KB × peak-established-connections).
-
At least 100 MB for other processes running in the system.
-
Monitor memory usage and change memory cache size accordingly. The informative performance counters are:
\ISA Server Cache\Memory Cache Allocated Space (KB)
\ISA Server Cache\Memory URL Retrieve Rate (URL/sec)
\ISA Server Cache\Memory Usage Ratio Percent (%)
\ISA Server Cache\URLs in Cache
\Memory\Pages/sec
\Memory\Pool Nonpaged Bytes
\Memory\Pool Paged Bytes
\Process(WSPSRV)\Working Set
\TCP\Established Connections
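The memory estimate in the preceding process can be sketched in Python as shown below. The function and parameter names are hypothetical. The sketch uses decimal units (1 MB = 1,000 KB), as the worked examples in this section do, and it makes one assumption that the text does not state explicitly: because the memory cache is a fraction of total physical memory, total memory is solved as the sum of the other items divided by (1 minus that fraction).

```python
def forward_cache_memory_mb(peak_connections: int,
                            urls_in_cache: int,
                            memory_cache_fraction: float = 0.10,
                            pending_share: float = 0.50) -> dict:
    """Estimate memory (in MB) for a forward caching deployment."""
    pending_mb = pending_share * 15 * peak_connections / 1000  # 15 KB per pending request
    directory_mb = 48 * urls_in_cache / 1_000_000              # 48 bytes per cached URL
    system_mb = 50 + 2 * peak_connections / 1000               # 50 MB + 2 KB per connection
    other_mb = 100                                             # other processes
    fixed_mb = pending_mb + directory_mb + system_mb + other_mb
    # Assumption: the memory cache takes a fixed fraction of total memory,
    # so total = fixed / (1 - fraction).
    total_mb = fixed_mb / (1.0 - memory_cache_fraction)
    return {
        "pending_mb": pending_mb,
        "directory_mb": directory_mb,
        "system_mb": system_mb,
        "other_mb": other_mb,
        "memory_cache_mb": total_mb * memory_cache_fraction,
        "total_mb": total_mb,
    }


estimate = forward_cache_memory_mb(peak_connections=10_000, urls_in_cache=7_000_000)
for name, value in estimate.items():
    print(f"{name}: {value:.0f}")
```

The result is a starting point only. The counters listed above remain the way to verify the estimate and to detect hard page faults in production.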
Tuning Reverse Cache Memory and Disks
In reverse caching, the working set is so much smaller than in forward caching that it is practical to keep all of it in memory. The size of the working set is the total amount of cacheable objects in the Web site that the cache hosts. The size of the disk and memory cache is recommended to be approximately twice the size of the working set to hold all cacheable objects, and to account for fragmentation in disk allocation and cache refresh policy. For example, a working set of 500 MB requires a 1,000-MB disk cache and 1,500 MB of memory with memory cache size set to 66 percent.
Because most cache fetches are served from the memory cache, the I/O rate on the disk is low. In most cases, a single physical disk is sufficient, without being a bottleneck.
Using the /3GB Boot.ini Switch
For large systems with over 2 GB of memory, Windows Server 2003 and Windows 2000 Advanced Server offer the 4GT RAM tuning feature. This feature divides a process memory space into 3 GB for application memory and 1 GB for system memory. This feature enables processes to benefit from more than 2-GB RAM in user space, and is enabled by adding the switch /3GB to the Boot.ini file. (For details, see article Q171793, "Information on Application Use of 4GT RAM Tuning," in the Microsoft Knowledge Base.)
This feature may be beneficial for ISA Server, especially for reverse caching hosting a large Web site. However, using this feature reduces the maximum size of the nonpaged pool (to 128 MB instead of 256 MB), and therefore the maximum number of concurrent TCP connections.
BITS Caching
Caching for the Background Intelligent Transfer Service (BITS), introduced in ISA Server 2004 Service Pack 2 (SP2) and included in ISA Server 2006, enables the caching of HTTP range requests for Windows Update. BITS caching provides significant savings in bandwidth consumption and shortens the delay for downloading updates in low-bandwidth deployments, which is important due to the growing demand for updates through the Web.
Measuring BITS caching performance showed the following:
-
BITS caching doubles the overall hit ratio during the monthly update process from 10 percent to 20 percent. During the monthly update, there was a savings of 18 percent of total traffic.
-
Traffic handled with BITS caching performs much better than other Web traffic, due to the extremely high average throughput per connection and hit ratio. For example, on the same hardware, BITS caching can serve three or four times more bits at the same processor utilization as compared to Web traffic handled without BITS. Enabling BITS caching for Windows Update has no negative impact on the performance characteristics of traffic that is not associated with Windows Update.
HTTP Compression
ISA Server 2006 provides a Hypertext Transfer Protocol (HTTP) compression feature. When HTTP compression is configured, ISA Server can compress content to preserve limited bandwidth. This is useful, for example, in scenarios where a main office proxy routes Internet requests directly to the Internet, and branch offices route their requests through the main office, over a bandwidth-limited network. The feature uses a well-known compression algorithm (GZip) to compress HTTP data, and the compression ratio varies according to the target data type. (By default, only text-based data is compressed.)
Note: When ISA Server Web filters inspect the incoming compressed content, the compressed content will be decompressed. When decompressed, the content is stored in the cache as decompressed text. If ISA Server receives a request for the cached content, it recompresses it before sending, which increases response time.
Measuring HTTP compression over a 56-Kbps line with 50-millisecond (msec) latency, connecting an ISA Server computer at headquarters to multiple ISA Server computers at branch offices, and passing a total of 96 Mbps (uncompressed) data showed:
-
HTTP compression has the most impact on a slow network. It improves the network utilization by 28 percent, and thus improves the total system throughput.
-
Latency between the headquarters and branches improved by a factor of 15.
-
CPU impact added by enabling HTTP compression on the tested system (dual Xeon 2.4 GHz) was 15 percent.
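To illustrate why the compression ratio depends on the data type, the following Python sketch compresses two payloads with the standard library `gzip` module. It does not reproduce ISA Server's implementation or the measurements above. The sample payloads are invented for illustration: repetitive markup stands in for text-based content, and pseudo-random bytes stand in for content that is already compressed, such as images or archives.

```python
import gzip
import random


def gzip_ratio(payload: bytes) -> float:
    """Compressed size divided by original size (smaller is better)."""
    return len(gzip.compress(payload)) / len(payload)


# Hypothetical text-like payload: repetitive HTML markup.
html_like = b"<tr><td class='item'>Quarterly report</td><td>42</td></tr>\n" * 500

# Hypothetical binary-like payload: pseudo-random bytes of the same length.
rng = random.Random(0)
binary_like = bytes(rng.getrandbits(8) for _ in range(len(html_like)))

print(f"text-like payload ratio:   {gzip_ratio(html_like):.3f}")
print(f"binary-like payload ratio: {gzip_ratio(binary_like):.3f}")
```

Content with little redundancy gains almost nothing from GZip while still costing CPU cycles, which is consistent with compressing only text-based data by default.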
Web Authentication
There are many methods for performing Web authentication, and each has its own performance impact. The following table summarizes the advantages and disadvantages of each method.
| Authentication scheme | Strength | When authentication is performed | Overhead per request | Overhead per batch |
|---|---|---|---|---|
| Basic | Low | Per time | Low | None |
| Digest | Medium | Per time/count | None | High |
| NTLM | Medium | Per connection | None | High |
| NTLMv2 | High | Per connection | None | High |
| Kerberos | High | Per connection | None | Medium |
| SecurID | High | Per browser session | None | Medium |
| RADIUS per request | High | Per request | High | None |
| RADIUS per time out (default) | Medium | Per time | Low | None |
|
From a performance perspective, an authentication scheme performs best when it has no per-request overhead and a low per-batch overhead. Deciding which authentication scheme to use depends on the required strength and the available infrastructure.
Also, Web proxy authentication can be configured on the Web proxy listener level or on a rule level. Choose the listener level only if authentication is required for all Web access. Otherwise, choose the rule level, which means that authentication will be performed only when necessary according to rules.
Web Filters
Like application filters, Web filters may also have an impact on performance, depending on what they do. ISA Server incorporates several Web filters that perform specified tasks. Of these, the most CPU consuming are the HTTP filter and the Link Translation filter.
An HTTP filter inspects every Web request and response, checking that they comply with normal HTTP protocol usage. It is enabled by default, and its default configuration provides size limits to HTTP headers and the URL. Other available features include blocking by methods, extensions, headers, and HTTP payload signatures. These functions have no performance impact when selected, except for signature blocking, which requires 10 percent more CPU cycles. An HTTP filter is recommended for protecting Web traffic.
Link translation is used specifically in Web publishing scenarios. It looks in HTML response bodies, searching for absolute hyperlinks, and changes them to point to the ISA Server computer instead. By default, link translation scans HTTP headers and response bodies, so there is noticeable performance impact. When body scanning is enabled, it scans by default only HTML content, causing an overall 15 percent increase in CPU utilization.
Using Secure Sockets Layer (SSL), ISA Server enhances secure publication of a variety of Web content. ISA Server, together with SSL, enables private access to published Web sites and, for corporate users, secure access to various Internal network resources, such as e-mail, shared Web sites, Terminal Services, and more.
SSL runs over TCP, and HTTP traffic secured by SSL uses port 443 by default. HTTP over SSL is known as Secure HTTP (HTTPS), because SSL defines secure wrapping, authentication, and encryption for HTTP content.
From a performance perspective, SSL encryption and decryption create an additional processing layer, beyond regular HTTP processing. This layer includes the following two major CPU intensive phases:
-
SSL handshake. After establishing a TCP connection, SSL creates a security context between endpoints using public key infrastructure (PKI). This is known as an SSL handshake. In terms of aggregate network traffic, an SSL handshake consumes processing power that is proportional to connection rate (measured in connections per second).
-
Encryption. After a security context is established, an endpoint uses it to encrypt or decrypt HTTP content, using symmetric encryption. This processing is performed on each byte of HTTP data. Therefore, it consumes processor cycles proportional to aggregate network throughput (measured in megabits per second).
The ratio between aggregate throughput and connection rate determines the average number of bits that are processed for every connection. This ratio is defined as bits per connection, and in practice, every application has a characteristic value for this ratio.
The following are some examples.
Outlook Web Access
When a Web client connects to an Outlook Web Access Exchange Server front-end server, it loads the Outlook Web page that contains the user-interface icons and headers of messages currently in the mailbox. Subsequently, any operation that the user performs (such as Open, Send, or Move to Folder) generates a new HTTP connection that transfers an average of 10 to 20 kilobytes (KB). When accumulating the behavior of Outlook Web Access over many users, the Web client typically creates a relatively low bits per connection value (such as 100 kilobits per connection).
RPC over HTTP with Outlook 2003 Cached Exchange Mode
Remote procedure call (RPC) over HTTP is a feature of Microsoft Exchange Server 2003 that enables Outlook 2003 clients to access an Exchange server in the Internal corporate network from the Internet. When connecting to Exchange Server, an Outlook 2003 client working in Cached Exchange Mode typically starts with a synchronization of mailbox content with a local cache file. After the synchronization is complete, intermittent connections occur, in which new messages are transferred. For a knowledge worker using a heavy usage profile, the synchronization operation transfers many bytes of data over a small number of connections, so the overall characteristic bits per connection value is rather high (such as 500 kilobits per connection).
Note: Each RPC over HTTP client establishes approximately 10 connections, so you should also consider the total number of connections (number of clients × 10) when planning your deployment.
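The connection count in the preceding note can be combined with the pending request figure given earlier in this document for RPC over HTTP publishing (approximately 15 KB per connection, with every connection holding a pending request object). The following Python sketch is an illustration under those assumptions. The function name and parameters are hypothetical, and decimal units are used as in the earlier examples.

```python
def rpc_over_http_footprint(clients: int,
                            connections_per_client: int = 10,
                            pending_kb_per_connection: float = 15) -> tuple:
    """Return (total connections, pending request memory in MB) for RPC over HTTP clients."""
    connections = clients * connections_per_client
    # In RPC over HTTP publishing, every connection has a pending request object.
    pending_mb = connections * pending_kb_per_connection / 1000
    return connections, pending_mb


connections, pending_mb = rpc_over_http_footprint(1_000)
print(f"{connections} connections, about {pending_mb:.0f} MB for pending requests")
```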
Web Site
There are many ways to design and implement a Web site. Therefore, Web sites do not have a typical bits per connection value. However, after a Web site is serving requests, you can measure the aggregate bits per connection. In practice, Web sites have medium value bits per connection (between 100 and 500 kilobits per connection).
SSL Bridging
When you deploy ISA Server with secure Web publishing, secure Web clients on the External network can connect to the SSL port. SSL bridging is a feature of ISA Server, which enables you to specify how ISA Server communicates with the back-end Web server that is published. This feature lets you choose between the following two types of bridging:
-
SSL-to-SSL bridging. In this type of bridging, ISA Server accesses the back-end server with SSL. ISA Server performs separate SSL handshakes with the back-end server and must use encryption for every packet that it receives from or sends to the back-end server.
-
SSL-to-HTTP bridging. In this type of bridging, ISA Server accesses the back-end server in clear, unencrypted HTTP.
SSL-to-SSL bridging strengthens the security on the Internal network, but adds the processing cost of double encryption to every packet that is transferred between ISA Server and the back-end server. SSL-to-SSL bridging costs approximately 10 percent more than SSL-to-HTTP bridging.
Determining SSL Capacity
To determine what size ISA Server computer you must have to support peak network traffic loads, you must first measure the typical kilobits per connection of your network traffic and then measure the total aggregate traffic. Use the following procedure to make these determinations:
-
Use the system performance monitor tool to monitor the network traffic of each application server for the peak two hours of server activity. Collect the following counters:
\Network Interface\Bytes Total/sec. This is the counter of the interface that is published by ISA Server. Use the average value as the average throughput within the duration. This value is also used to calculate the total aggregate traffic.
\TCPv4\Connections Active. The value of this counter is the total number of connections created during the monitoring session. To determine the average connections per second within this duration, divide the difference between the maximal and minimal values by the total duration.
Calculate the number of kilobits per connection as: kilobits per connection = (bytes total per second × 8 / 1000) / (connections per second).
-
Determine the total average kilobits per connection as the weighted average of the kilobits per connection of each application server. The weight for each server is the throughput of that server divided by the total throughput of all servers.
-
Determine the total aggregate traffic by adding the traffic measured on each server.
-
Use the following table to determine the number of megacycles that are required for every megabit of SSL traffic that ISA Server processes, according to the kilobits per connection measured in Step 2.
| Kilobits per connection | 100 (Outlook Web Access) | 200 (Web) | 500 (RPC over HTTP) |
|---|---|---|---|
| 1 processor, SSL to HTTP | 91 | 77 | 69 |
| 1 processor, SSL to SSL | 120 | 96 | 83 |
| 2 processors, SSL to HTTP | 128 | 104 | 91 |
| 2 processors, SSL to SSL | 142 | 120 | 104 |
-
To determine the processor speed that is required to support the total aggregate traffic, multiply the megacycles per megabit, from the table in Step 4, by the total throughput, as measured in Step 3.
Note: Because of the variety of ISA Server configurations, usage scenarios, and hardware platforms, the numbers previously cited are for estimation purposes only. For deployments with Internet link bandwidth larger than 10 megabits per second, we recommend pilot testing to verify these estimates.
For example, suppose that the kilobits per connection calculated in Step 2 is 200, the total aggregate throughput is 15 megabits per second, and you require ISA Server to perform SSL-to-SSL bridging. From the preceding table, a single processor requires 96 megacycles per megabit, or 96 × 15 = 1440 megacycles for 15 megabits per second. A single Intel Pentium 4 processor running at 2.4 GHz is sufficient for this load and is used at 1440 / 2400 = 60% at peak throughput. A dual processor computer with two Intel 2.4-GHz Pentium 4 processors requires 120 megacycles per megabit, or 120 × 15 = 1800 megacycles for 15 megabits per second, and is used at 1800 / (2 × 2400) = 38% at peak throughput.
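The procedure and the worked example can be sketched in Python as follows. All function and variable names are hypothetical. The lookup values are the megacycles-per-megabit figures from the table in Step 4. Picking the nearest table column for a measured kilobits-per-connection value is an assumption of this sketch, because the document does not say how to treat values between columns.

```python
# Megacycles per megabit of SSL traffic, from the table in Step 4.
MEGACYCLES_PER_MEGABIT = {
    (1, "ssl-to-http"): {100: 91, 200: 77, 500: 69},
    (1, "ssl-to-ssl"): {100: 120, 200: 96, 500: 83},
    (2, "ssl-to-http"): {100: 128, 200: 104, 500: 91},
    (2, "ssl-to-ssl"): {100: 142, 200: 120, 500: 104},
}


def kilobits_per_connection(avg_bytes_per_sec: float, connections_per_sec: float) -> float:
    """Step 1: convert the two counter averages into kilobits per connection."""
    return avg_bytes_per_sec * 8 / 1000 / connections_per_sec


def weighted_kilobits_per_connection(servers: list) -> float:
    """Step 2: servers is a list of (throughput, kilobits_per_connection) pairs."""
    total = sum(throughput for throughput, _ in servers)
    return sum(throughput / total * kbpc for throughput, kbpc in servers)


def ssl_load(throughput_mbps: float, kbits_per_conn: float,
             processors: int, bridging: str, clock_mhz: float = 2400) -> tuple:
    """Steps 4 and 5: return (megacycles required, processor utilization)."""
    row = MEGACYCLES_PER_MEGABIT[(processors, bridging)]
    # Assumption: use the table column closest to the measured value.
    column = min(row, key=lambda k: abs(k - kbits_per_conn))
    megacycles = row[column] * throughput_mbps
    return megacycles, megacycles / (processors * clock_mhz)


for cpus in (1, 2):
    megacycles, utilization = ssl_load(15, 200, cpus, "ssl-to-ssl")
    print(f"{cpus} processor(s): {megacycles} megacycles, {utilization:.1%} utilization")
```

As the preceding note says, such figures are estimates, so a result close to the recommended maximum usage is a signal to run a pilot test rather than a guarantee of sufficient capacity.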
The following table shows the amount of traffic in megabits that a 2.4-GHz processor can process at maximum recommended usage (80 percent).
| Kilobits per connection | 100 | 200 | 500 |
|---|---|---|---|
| 1 processor, SSL to HTTP | 21 | 25 | 28 |
| 1 processor, SSL to SSL | 16 | 20 | 23 |
| 2 processors, SSL to HTTP | 30 | 37 | 42 |
| 2 processors, SSL to SSL | 27 | 32 | 37 |
|
This table is specifically for deployments in which ISA Server is used only for SSL traffic. If you plan to deploy ISA Server for both SSL and unencrypted HTTP traffic, you can estimate the processing power you require by calculating a weighted average of megacycles according to the amount of traffic for each scenario multiplied by the megacycles per megabit, shown in the following table.
| Scenario | Transparent proxy | Forward proxy | SSL tunneling |
|---|---|---|---|
| 1 processor | 74 | 37 | 30 |
| 2 processors | 86 | 43 | 35 |
|
For example, suppose that you want to deploy ISA Server in an edge firewall scenario in which 40 percent of the 20 megabit per second peak traffic is transparent proxy, 35 percent is forward proxy, and 25 percent is SSL to SSL with 200 kilobits per connection. The total amount of megacycles required for ISA Server to process this traffic on a single processor computer is:
megacycles = 20 megabits per second × (74 × 40% + 37 × 35% + 96 × 25%) = 1331
A 2.4-GHz Intel Pentium 4 processor is sufficient to process this load and is used at 1331 / 2400 = 55% at peak throughput. A dual processor computer requires 20 × (86 × 40% + 43 × 35% + 120 × 25%) = 1589 megacycles, which uses 1589 / (2400 × 2) = 33% of two 2.4-GHz Intel Pentium 4 processors at peak throughput.
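The weighted calculation for mixed traffic can be sketched in Python as shown below. The function name and parameters are hypothetical. Each entry in the mix pairs a traffic share with the megacycles-per-megabit figure taken from the relevant table in this document for the chosen number of processors.

```python
def mixed_load(peak_mbps: float, mix: list,
               processors: int = 1, clock_mhz: float = 2400) -> tuple:
    """Return (megacycles required, processor utilization) for a traffic mix.

    mix is a list of (traffic_share, megacycles_per_megabit) pairs whose
    shares are fractions that add up to 1.
    """
    if abs(sum(share for share, _ in mix) - 1.0) > 1e-9:
        raise ValueError("traffic shares must add up to 1")
    # Weighted average cost per megabit, scaled by the peak throughput.
    megacycles = peak_mbps * sum(share * cost for share, cost in mix)
    return megacycles, megacycles / (processors * clock_mhz)


# Example from the text: 40% transparent proxy, 35% forward proxy,
# 25% SSL to SSL at 200 kilobits per connection, 20 Mbps peak.
single = mixed_load(20, [(0.40, 74), (0.35, 37), (0.25, 96)], processors=1)
dual = mixed_load(20, [(0.40, 86), (0.35, 43), (0.25, 120)], processors=2)
print(f"single processor: {single[0]:.0f} megacycles, {single[1]:.0%}")
print(f"dual processor:   {dual[0]:.0f} megacycles, {dual[1]:.0%}")
```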
Virtual private network (VPN) deployments fall into two basic scenarios: remote access VPN and site-to-site VPN. Both can use several protocols and work in conjunction with application filtering or stateful filtering. Internet Protocol security (IPsec)-based protocols can also use the hardware offloading capabilities available in many network adapters, which reduces the processor load. Some protocols support compression, which increases throughput or saves bandwidth. All of these features affect performance, as explained in the next sections.
Remote Access VPN
Remote clients dialing in from the Internet use VPN remote access to access their corporate networks. Protocols that are used in remote access are Point-to-Point Tunneling Protocol (PPTP) and Layer Two Tunneling Protocol (L2TP) over Internet Protocol security (IPsec). Both of these protocols support compression, which is recommended because it saves bandwidth and processing power required for encryption.
To determine adequate capacity for an ISA Server VPN server, you first need to evaluate the maximum number of concurrent remote connections that your ISA Server computer needs to support. For example, if you expect to have no more than 5 percent of your organization’s employees establishing remote connections simultaneously, and your organization has 5,000 employees, 250 concurrent VPN remote access connections is the capacity you need.
The following table indicates the maximum number of concurrent VPN remote access connections supported by each hardware platform. These figures assume an out-of-the-box ISA Server setup incorporating Web proxy filtering, MSDE logging, and compression for both the PPTP and L2TP over IPsec protocols.
| Protocol | Connections and bandwidth | Single Pentium 4 3 GHz processor | Dual Xeon 3 GHz processors | Dual Core Dual Processor Xeon 2.8 GHz |
|---|---|---|---|---|
| PPTP | Connections | 600 | 760 | 2,200 |
| | Bandwidth | 9 Mbps | 11.4 Mbps | 33 Mbps |
| L2TP over IPsec | Connections | 700 | 850 | 2,450 |
| | Bandwidth | 10.5 Mbps | 12.75 Mbps | 63.75 Mbps |
The following applies to the preceding table:
- Bandwidth figures are the required Internet link bandwidth. Because of compression, the uncompressed application traffic is twice the amount shown in the preceding table.
- Bandwidth figures assume an average throughput of 30 Kbps per connection, approximately equivalent to a 56-Kbps dial-up connection.
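The sizing steps above can be sketched in Python. The function names are made up for this sketch, and it assumes 1 Mbps = 1,000 Kbps together with the 30 Kbps per connection and 2:1 compression stated in the notes.

```python
def concurrent_connections(employees, simultaneous_share):
    """Peak number of simultaneous remote access connections to plan for."""
    return round(employees * simultaneous_share)


def link_bandwidth_mbps(connections, kbps_per_connection=30, compression_ratio=2):
    """Internet link bandwidth needed after compression, in Mbps."""
    uncompressed_mbps = connections * kbps_per_connection / 1000  # assumes 1 Mbps = 1,000 Kbps
    return uncompressed_mbps / compression_ratio


needed = concurrent_connections(5000, 0.05)
print(f"{needed} connections need about {link_bandwidth_mbps(needed)} Mbps of link bandwidth")

# Cross-check against the single-processor PPTP entry in the table (600 connections).
print(f"600 connections: {link_bandwidth_mbps(600)} Mbps")
```

Comparing the connection count from the first function with the table shows which hardware platform has enough headroom.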
In deployments where VPN clients can be trusted to a higher degree, application level filtering may be disabled, which improves total capacity but lowers the security level. The next table shows the figures when Web Proxy Filter is disabled.
| Protocol | Connections and bandwidth | Pentium 4, 3 GHz, Standard Edition | Dual Pentium 4, 3 GHz, Enterprise Edition | Dual Core Dual Processor Xeon 2.8 GHz |
|---|---|---|---|---|
| PPTP | Connections | 1,000 | 2,500 | 2,000 |
| | Bandwidth | 15 Mbps | 38 Mbps | 30 Mbps |
| L2TP over IPsec | Connections | 1,000 | 2,320 | 2,000 |
| | Bandwidth | 15 Mbps | 35 Mbps | 30 Mbps |
The following applies to the preceding table:
- The single Pentium 4 3-GHz processor is capable of reaching the maximum number of concurrent connections (1,000) in ISA Server 2006 Standard Edition. ISA Server 2006 Enterprise Edition has no such limit. ISA Server Enterprise Edition should run on Windows Server 2003, Enterprise Edition, because Windows Server 2003, Standard Edition has a limit of 1,000 connections.
- IPsec offloading hardware, available in many network interface adapters, may increase throughput values by 20 percent to 25 percent. Note, however, that on Windows Server 2003 it is available for 100-Mbps network adapters only.
- The preceding table shows that for the dual-core architecture, the maximum number of connections and the throughput are lower than for the dual-processor architecture. This is related to processor affinity: when one core is at 85 percent utilization and the other three cores are at 15 percent, a CPU bottleneck occurs. This problem can be resolved by applying interrupt affinity (using the Intfiltr.exe tool available in the Windows Server 2003 Resource Kit Tools).
Site-to-Site VPN
In a site-to-site VPN, there are two main choices from a performance and capacity perspective. The first choice is PPTP or L2TP over IPsec. These protocols compress the application traffic, which doubles the application throughput that the site-to-site link can carry. For example, sending a 2-MB file through a PPTP or L2TP tunnel actually passes only 1 MB over the link. The other choice is IPsec tunneling, which does not incorporate compression. In effect, PPTP and L2TP over IPsec reduce the required site-to-site link bandwidth by 50 percent compared to IPsec tunneling.
With Web Proxy Filter disabled, L2TP over IPsec requires a single Pentium III 750-MHz processor for 15-Mbps application traffic. Passing this traffic in one direction requires only 7.5-Mbps link capacity due to compression. A single Pentium 4 3-GHz processor can handle up to 90-Mbps application traffic requiring T3 link capacity (45 Mbps). When Web Proxy Filter is enabled, a Pentium III 750-MHz processor can sustain 7-Mbps application traffic requiring 3.5-Mbps Internet link bandwidth, while a single Pentium 4 3-GHz processor handles 34-Mbps application traffic corresponding to 17-Mbps Internet bandwidth. Dual Xeon 3-GHz processors can handle 53-Mbps application traffic requiring 26.5-Mbps Internet link bandwidth. PPTP can handle approximately 15 to 20 percent more throughput for the same CPU consumption.
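The relationship between application traffic and link capacity in the preceding paragraph can be sketched as below. The function name `link_capacity_mbps` is hypothetical, and the fixed 2:1 compression ratio is the planning assumption used in the text; real traffic may compress better or worse.

```python
def link_capacity_mbps(application_mbps, compressed=True, compression_ratio=2.0):
    """Link bandwidth needed to carry the given application traffic."""
    if compressed:
        return application_mbps / compression_ratio
    return application_mbps  # IPsec tunneling: no compression


# L2TP over IPsec figures quoted in the text (application traffic in Mbps).
for app_mbps in (15, 90, 7, 34, 53):
    print(f"{app_mbps} Mbps application traffic -> {link_capacity_mbps(app_mbps)} Mbps link")

# The same 15 Mbps over an uncompressed IPsec tunnel needs the full 15 Mbps.
print(link_capacity_mbps(15, compressed=False))
```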
The second choice is using IPsec tunneling, which does not support compression, meaning that Internet link traffic is the same as application traffic. When working in conjunction with stateful filtering (Web Proxy Filter is disabled), IPsec tunneling can handle 10 Mbps on a single Pentium III 550-MHz processor and 52 Mbps on a single Pentium 4 3-GHz processor. With Web Proxy Filter enabled, the throughput figures are 4 Mbps, 18 Mbps, and 30 Mbps for the single Pentium III, single Pentium 4, and dual Xeon platforms respectively.
The following table summarizes these results as the supported Internet link throughput, in megabits per second, at 75 percent CPU utilization. (The numbers in parentheses represent the uncompressed traffic volumes.)
| Site-to-site VPN method | Filtering | Pentium 4, 3 GHz | Dual Pentium 4, 3 GHz | Dual Core Dual Processor Xeon 2.8 GHz |
|---|---|---|---|---|
| L2TP over IPsec (compressed) | Disabled | 45 (90) | 71 (142) | 55 (110) |
| | Enabled | 17 (34) | 27 (53) | 25 (50) |
| PPTP (compressed) | Disabled | 52 (104) | 81 (162) | 88 (176) |
| | Enabled | 20 (39) | 31 (61) | 35 (70) |
| IPsec tunneling | Disabled | 52 | 87 | 94 |
| | Enabled | 18 | 30 | 33 |
IPsec offloading hardware, available in many network interface adapters, may increase throughput values by 20 percent to 25 percent.
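As a rough illustration, the 20 to 25 percent range can be applied to a baseline figure from the table. The function name `offload_range_mbps` is made up for this sketch, and it assumes the gain applies uniformly to the measured throughput, which may not hold for a given adapter.

```python
def offload_range_mbps(baseline_mbps, low_gain=0.20, high_gain=0.25):
    """Return the (low, high) throughput estimate with IPsec offloading hardware."""
    return baseline_mbps * (1 + low_gain), baseline_mbps * (1 + high_gain)


# IPsec tunneling on a single Pentium 4, 3 GHz, from the preceding table.
for filtering, baseline in (("disabled", 52), ("enabled", 18)):
    low, high = offload_range_mbps(baseline)
    print(f"Web Proxy Filter {filtering}: {baseline} Mbps -> {low:.1f} to {high:.1f} Mbps")
```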
Multiple Branches Site-to-Site VPN
In many organizations, it is common to find a scenario that has multiple branch offices and a single headquarters office. In this configuration, multiple branch office computers are connected to the same computer at headquarters. This is known as a star topology.
A measurement of this scenario showed that a single ISA Server computer at headquarters can support up to 60 concurrently connected branch office computers, with a traffic rate of 200 Kbps. The hardware used for the test was a quad-processor 2.4-GHz AMD computer with 2 GB of RAM. The VPN tunnels were created using L2TP over IPsec.
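For a rough sense of the aggregate load at headquarters, the measurement can be turned into a total. This sketch assumes that the 200 Kbps rate applies to each branch connection, which the text does not state explicitly, and that 1 Mbps = 1,000 Kbps; the function name is hypothetical.

```python
def aggregate_mbps(branches, kbps_per_branch):
    """Total traffic at the hub, assuming every branch runs at the same rate."""
    return branches * kbps_per_branch / 1000  # assumes 1 Mbps = 1,000 Kbps


# Assumption: 200 Kbps is a per-branch rate.
print(f"{aggregate_mbps(60, 200)} Mbps across 60 branch tunnels")
```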