Optimizing Office SharePoint Server for WAN environments

Applies To: Office SharePoint Server 2007

This Office product will reach end of support on October 10, 2017. To stay supported, you will need to upgrade. For more information, see Resources to help you upgrade your Office 2007 servers and clients.

Topic Last Modified: 2016-11-14

In this article:

  • Designing topologies for the WAN

  • Optimizing the content crawling process

  • WAN accelerators and other third-party tools

  • Designing pages for quicker downloads

  • Optimizing caching for WAN environments

This article highlights specific ways in which you can optimize your Microsoft Office SharePoint Server 2007 solution for better performance in wide area network (WAN) environments.

Designing topologies for the WAN

You can flexibly configure server roles in an Office SharePoint Server 2007 farm to optimize for specific performance or availability requirements. In a WAN environment, there are several technical characteristics of server roles that are important to understand. In addition to planning for your overall performance and availability requirements, understanding these characteristics can help you optimize your server farm topology for WAN environments.

Optimizing the topology for crawling content

By default, Office SharePoint Server 2007 uses all Web servers to crawl content in the server farm. When your server farm is configured to use all Web servers for crawling, the index server sends requests to each Web server in the farm.

Crawling content in your farm places a heavy load on the Web servers. This tends to cause spikes and surges in network traffic, and is CPU-intensive and memory-intensive. Crawling content in your farm can potentially cause more traffic on the network than user requests cause. This network traffic can negatively affect the performance of all Web servers in your server farm and thereby increase the time to respond to user requests for content from your SharePoint sites.

In WAN environments, we recommend that you configure a dedicated Web server for crawling content, especially if you are crawling a server farm that contains more than 500 gigabytes (GB) of content or if you are crawling content over the WAN. To ensure that user requests are not affected by content crawling, remove the dedicated Web server from the network load balancing rotation. This is especially important in global environments in which the off-peak hours of a regional farm—when you are most likely to schedule crawl jobs—coincide with the peak hours of the central farm.

Configuring a dedicated Web server for crawling local content

For best performance when crawling content that is local to the farm, configure the index server as the dedicated Web server for crawling if your index server has the memory capacity for both roles.

The following figure illustrates a farm that is optimized with a dedicated Web server for indexing content.

SharePoint Server in WAN topology

By using the same server as both the index server and the dedicated Web server, you eliminate the need for the index server to send requests to a different server when crawling content. Consequently, you boost crawl performance and reduce the overall traffic on the network. If this is not possible, you can configure a different server in your server farm as the dedicated Web server for crawling.

You can configure the indexer to use the dedicated Web server by using either of the following approaches:

  • Configure the Office SharePoint Server Search Service in SharePoint Central Administration.

  • Edit the Hosts file directly.

For more information, see Configure a dedicated front-end Web server for crawling by editing the Hosts file (Office SharePoint Server 2007).

Configuring a dedicated Web server for crawling remote farms

For the best performance of regional farms when crawling content that resides on a regional farm, configure the index server on the central farm to use a dedicated Web server on the regional farm.

The following figure illustrates two regional farms that are each optimized with a dedicated Web server for indexing content.

  • On regional farm 1, a server is dedicated to the Web server role. Both the central farm and regional farm 1 use this dedicated Web server for crawling content.

  • Regional farm 2 is configured similarly to the central farm, with the Web server role also deployed to the index server.

Optimize Office SharePoint Server for WAN

Note that the index server on the central farm does not use the Web server role on the central farm to crawl content on a remote farm. The index server directly contacts the Web servers on remote farms.

In the previous figure, if regional farms are crawling content at the same time that the central farm is also crawling, performance will be better with the configuration of regional farm 1 in the following ways:

  • Local crawl performance on regional farm 1 is improved because the dedicated Web server is on a separate server. Consequently, the central farm's crawling does not affect the performance of the index server on regional farm 1.

  • Crawl times over the WAN are improved. Crawling takes less time because the dedicated Web server on the regional farm is more responsive than it would be if it were on a shared server with the index server.

  • The crawl process of the central farm is improved because the index server on the central farm is communicating with a dedicated Web server.

If you implement the topology configuration illustrated by regional farm 2, you can optimize performance by scheduling crawl processes on the two farms so they don't overlap.

To use a dedicated Web server on a remote farm to crawl content, you must edit the Hosts file directly. Be sure to edit the Hosts file on the farm that is crawling the content, not the remote farm.

In a global solution in which a central farm is crawling content on regional farms, edit the Hosts file to include an entry for each Web application on each regional farm that you want to crawl. An entry in the Hosts file includes the IP address of the dedicated Web server followed by the name of the Web application. For example:

  • 10.10.10.4 TeamSites

  • 10.10.10.4 MySites

  • 10.10.10.4 Marketing

  • 10.10.10.4 Sales

For more information, see Configure a dedicated front-end Web server for crawling by editing the Hosts file (Office SharePoint Server 2007).

Optimizing for query performance

Query performance for users is a key consideration when deploying Office SharePoint Server 2007 in a WAN environment. When a user issues a query, the query is sent to a Web server. The Web server communicates with the query server to build a list of results, and then communicates with the computer running Microsoft SQL Server to extend the list of results with summarization text, URLs, and security trimming.

Given the server-to-server communication required to return query results, you can configure the topology to optimize query performance. Small optimizations within the server farm can boost the overall perceived performance between client computers and servers over the WAN.

The most important optimization is to use a dedicated Web server for crawling content, as discussed in the previous section. This ensures that Web servers are available for user requests and not overloaded with indexing jobs.

Next, there are several options for deploying the query role, each providing a different type of optimization. Each option strikes a different balance between optimally serving query requests and reducing networking trips on the local network between farm servers. The following configurations summarize these options, with the tradeoffs noted for each.


Deploy the query role across all Web servers.

You can host the query role on Web servers to reduce round-trip communications between servers in the farm. As a result, query performance is optimized.

However, because two server roles are hosted on the same servers, the overall performance of Web servers can be affected if there is heavy use of these servers. Consequently, ensure that you deploy enough Web servers to process user requests during peak hours.

Although this configuration optimizes query performance for users, the tradeoff is on the back-end of the farm. The index server propagates the index catalog to each query server in a farm. If the query role is deployed across multiple Web servers, this operation requires greater server resources during index propagation.

Dedicate one or more servers to the query role.

Dedicating one or more servers to host the query role optimizes these servers to use all available resources to perform the role efficiently. This configuration also typically results in deploying fewer query servers.

However, this configuration requires a greater number of round-trip communications between servers in the farm to serve query requests from Web servers and to update content indexes during index propagation.

Deploy the query and index roles to the same server.

You can deploy the query role and the index role on the same server. This optimizes farm communication because index propagation is no longer necessary.

However, this configuration limits the number of servers that can host the query role to only one server. This is because when the query role is deployed on an index server, the index server loses the ability to propagate content indexes to other servers in the farm.

In addition to optimizing a single farm for query performance, there are several options for optimizing multi-farm scenarios:

  • In an inter-farm shared services scenario in which a parent farm provides the search service for a child farm, deploying a dedicated query server on the parent farm enables the parent farm to handle search queries from child farms without affecting the performance of other user operations on the parent farm. In this scenario, the Web server on the child farm directly contacts the query server and the database server on the parent farm, rather than communicating through the Web server on the parent farm. Note that inter-farm shared services in which a parent farm provides services to a child farm is not supported over a WAN. However, large organizations might include a central site with a parent farm that provides services and a child farm that provides sites and content. In this configuration, the parent farm can be configured to crawl content on regional farms in a multi-farm geographically distributed environment without affecting the performance of the child farm that provides sites and content to users at the central site.

  • In a multi-farm geographically distributed environment in which a central farm is crawling content at regional farms, you can optimize both content crawling and query performance by configuring servers in the following ways:

    • Deploy the Web server role to the index server at all farms. Remove this server from the network load-balancing rotation, and configure the central farm's crawl process to use this dedicated Web server to crawl content.

    • Deploy the query role across all load-balanced Web servers on each farm.

Separating farm servers geographically

Communication between servers in a server farm requires robust network connections to adequately serve user requests and to avoid performance bottlenecks. The standard networking requirement is that all servers within a server farm reside within the same data center on the same local area network (LAN). Separating farm servers across WAN links is not supported.

In addition to this guidance, there are specific requirements that enable one or more Web servers to reside in a different geographical location than the rest of the farm servers. In this scenario, the Web servers are located in a different data center but are connected to the same LAN as the computer running SQL Server.

Note

The following guidance represents supportability requirements for a previously undocumented deployment scenario.
The guidance for this scenario is preliminary and subject to change after further testing. It provides a set of guidelines that you can use to test and deploy until official tested guidance is available. Use these guidelines as a basis for testing in your own environment and for determining your own threshold of quality and performance.
This scenario is supported with commercially reasonable support. If further testing or results within your own environment indicate that this scenario introduces significant issues, you might be required to move the Web servers closer to the database server if doing so resolves the issues.

This deployment scenario is expected to meet preliminary supportability requirements when the following conditions are met:

  • The network link between a Web server and the database server has less than 1 millisecond (ms) of latency. To achieve this latency, the Web server should ideally be located 10 or fewer miles from the database server. Under the most optimal networking configurations, less than 1 ms of latency can be achieved at distances of up to 100 miles, although this is rare. If the distance is between 10 and 100 miles, consult with your network and hardware providers to see whether they can provide a service level of less than 1 ms of latency. Separating farm servers by distances greater than 100 miles is not supported. When measuring the latency between two data centers that host servers of the same farm, use the Ping tool to measure latency from a Web server in the remote data center to the database server in the primary data center, and divide the round-trip result by two (a code sketch following this list shows one way to automate this measurement).

  • There is sufficient available bandwidth on the link to handle the traffic between the Web server and other servers in the farm.

  • All server roles that contribute to shared services are located in the same data center as the database server. This includes the index, query, and Excel Services roles.

  • All server computers in the farm are on the same network segment, with no switches or routers between them at the data layer. (Literally, the physical layer connects the servers.) Routers and switches increase latency even if the network connection between them is very fast.

  • The load that the Web server serves consists of ordinary user browse requests. We expect Office SharePoint Server 2007 to tolerate some latency between the Web server and the database server for this type of load. On the other hand, pages with many or custom Web Parts, Stsadm commands, and search crawls are likely to fare less well.

  • Servers within the server farm do not cross time zones. All server computers within a server farm must be synchronized to the same time zone.
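
To check the latency requirement, you can measure the round-trip time with the Ping tool and halve it. The following minimal C# sketch automates that arithmetic by using the Ping class in the .NET Framework; the database server name sql01.contoso.com is a placeholder, not a recommendation.

using System;
using System.Net.NetworkInformation;

class LatencyCheck
{
    static void Main()
    {
        // "sql01.contoso.com" is a hypothetical name; substitute your own database server.
        const string databaseServer = "sql01.contoso.com";
        const int samples = 20;
        long totalRoundTripMs = 0;
        int successes = 0;

        using (Ping ping = new Ping())
        {
            for (int i = 0; i < samples; i++)
            {
                PingReply reply = ping.Send(databaseServer);
                if (reply.Status == IPStatus.Success)
                {
                    totalRoundTripMs += reply.RoundtripTime;
                    successes++;
                }
            }
        }

        if (successes > 0)
        {
            // One-way latency is approximately half of the average round-trip time.
            double oneWayMs = (totalRoundTripMs / (double)successes) / 2.0;
            Console.WriteLine("Approximate one-way latency: {0:F2} ms", oneWayMs);
        }
    }
}

A sustained result near or above 1 ms indicates that the remote Web server placement does not meet the supportability requirement described above.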

Optimizing the content crawling process

The way in which you schedule and configure crawling processes can affect performance and reliability. The following processes can be optimized to improve crawling over the WAN:

  • Coordinating crawling for content sources.

  • Configuring crawl frequencies and coordinating full and incremental crawls.

  • Configuring crawl settings.

Coordinating crawling for content sources

In global environments in which a central farm is crawling content on regional farms over WAN links, it is important to plan content sources because these represent the units of content that can be scheduled and managed.

You add a content source for search when you need to:

  • Crawl different types of repositories.

  • Crawl some repositories on different schedules than others.

  • Limit or increase the quantity of content that is crawled.

Each of these reasons can be factored into optimizing content crawling over the WAN. At a minimum, create a different content source for each regional farm. This enables you to schedule crawling on each regional farm based on the farm's local off-peak hours and maintenance schedule.

Additionally, create different content sources for a single regional farm based on the following criteria:

  • Create separate content sources for content that you want to crawl more frequently (such as collaborative content) or less frequently (such as published content).

  • Create a separate content source for content that is published on a regular schedule. For example, if you know that content on a specific set of sites is updated on only Fridays, you can create a separate content source to synchronize crawl schedules with content updates.

  • Create content sources based on how much content can be crawled over a WAN link during a regional farm's off-peak hours. For example, if your goal is to crawl a large repository once a week, you can divide the repository into five to seven content chunks that can be successfully crawled overnight, and then stagger crawl jobs over the course of each week.

  • Create a separate content source for content that is external to Office SharePoint Server 2007 sites. In Office SharePoint Server 2007 and Windows SharePoint Services 3.0, the change log records changed objects, including updates to access control lists (ACLs). The change log provides the ability to incrementally crawl only content that has changed, greatly reducing the amount of time required to re-crawl a content source. Content stored in external sources cannot be incrementally crawled in this way, so the crawl process for these content sources should be managed separately (see the sketch after this list). For more information, see Change Log (https://go.microsoft.com/fwlink/?LinkId=106007&clcid=0x409).
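
To illustrate what the change log records, the following minimal sketch reads recent change log entries through the Windows SharePoint Services 3.0 object model. The site collection URL http://myServer is a placeholder, and the code must run on a server in the farm.

using System;
using Microsoft.SharePoint;

class ChangeLogReader
{
    static void Main()
    {
        // "http://myServer" is a placeholder for a real site collection URL.
        using (SPSite site = new SPSite("http://myServer"))
        {
            // Ask for every object type and every change type recorded in the log.
            SPChangeQuery query = new SPChangeQuery(true, true);
            SPChangeCollection changes = site.ContentDatabase.GetChanges(query);

            foreach (SPChange change in changes)
            {
                Console.WriteLine("{0}  {1}", change.Time, change.ChangeType);
            }
        }
    }
}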

For more information about planning for crawling and content sources, see Plan to crawl content (Office SharePoint Server).

Configuring crawl frequencies and coordinating full and incremental crawls

As discussed in the previous section, a primary reason to create separate content sources is to create the ability to crawl content on different schedules. This is especially important in environments in which content is crawled across WAN links with high latencies. When scheduling crawl jobs for regional farms, take into account the following factors:

  • Scheduled downtimes and periods of peak usage on the regional farm.

  • Frequency with which content is changed or updated.

  • Bandwidth availability and latency of the WAN link. Be sure to account for other processes that use the WAN link.

For each content source in Office SharePoint Server 2007 and Windows SharePoint Services 3.0, you can specify a time to do full crawls and a separate time to do incremental crawls. Note that you must run a full crawl for a particular content source before you can run an incremental crawl. If you choose an incremental crawl for content that has not yet been crawled, the system performs a full crawl.

When you plan crawl schedules for a WAN environment, consider the following best practices:

  • Stagger crawl schedules so that the use of WAN links is distributed over time and so that WAN links are not saturated.

  • Schedule full crawls only when necessary. We recommend that you do full crawls less frequently than incremental crawls.

  • Schedule incremental crawls for each content source during times when the servers that host the content are available and when there is low demand on the resources of the server.

  • Some administrative changes in Office SharePoint Server 2007 and Windows SharePoint Services 3.0—for example, applying a service pack or restoring a database—automatically trigger a full crawl during the next regularly scheduled crawl. Schedule administrative changes that require a full crawl to occur shortly before the planned schedule for full crawls. For example, because a new crawl rule takes effect only after a full crawl, we recommend that you create crawl rules shortly before the next scheduled full crawl so that an additional full crawl is not necessary. For more information about administrative changes that necessitate a full crawl, see "Reasons to do a full crawl" in Plan to crawl content (Office SharePoint Server).

Important

After you apply the Infrastructure Update for Microsoft Office Servers, future updates and database restores will not automatically trigger a full crawl. Therefore, when you apply future updates or restore a database, Search continues crawling based on the regular crawl schedules defined for each content source. For more information, see Plan to crawl content (Office SharePoint Server).

In addition to these best practices that apply to crawling content at individual regional sites, ensure that the overall crawl schedule of a central farm does not overwhelm the index server. Base the number of simultaneous crawls on the capacity of the index server to process them. The performance of the index server and of the servers that host the content determines the extent to which crawls can be overlapped. You can develop a crawl scheduling strategy over time as you become familiar with the typical crawl durations for each content source.

For more information about planning for crawling and content sources, see Plan to crawl content (Office SharePoint Server).

Configuring crawl settings

You can optimize several specific crawl settings for WAN environments to increase the reliability of crawl jobs. The following settings are available:

  • Time-out settings for search

  • Crawler impact rules

Farm administrators can specify the amount of time they want the server to wait while connecting to other services, and how long the server will wait for a request for content to be acknowledged. If you add content sources that are crawled over WAN links, increase the time-out settings as a proactive measure based on the overall latency of the link. You can adjust the time-out settings at any time based on the actual performance of crawling content over a WAN link.

Use the following procedure to specify the time-out settings for the Office SharePoint Server Search service.

Specify time-out settings

  1. In Central Administration, on the Application Management tab, in the Search section, click Manage search service.

  2. On the Manage Search Service page, in the Farm-Level Search Settings section, click Farm-level search settings.

  3. On the Manage Farm-Level Search Settings page, in the Timeout Settings section, do the following:

    • In the Connection time (in seconds) box, type the number of seconds you want the server to wait while connecting to other services.

    • In the Request acknowledgement time (in seconds) box, type the number of seconds you want the server to wait for another service to acknowledge a request to connect to that service.

  4. Click OK.

Crawler impact rules

Crawler impact rules provide a way of controlling the number of documents that are requested and crawled at one time. Crawler impact rules enable administrators to manage the impact crawl jobs have on WAN links.

For each crawler impact rule, you can specify a single URL, or you can use wildcard characters in the URL path to include a block of URLs to which the rule applies. You can then specify how many simultaneous requests for pages are made to the specified URL, or you can choose to request only one document at a time and wait a number of seconds that you choose between requests.

During initial deployment, set the crawler impact rules to use WAN links as efficiently as possible while still crawling enough content frequently enough to ensure the freshness of the crawled content. Later, during the operations phase, you can adjust crawler impact rules based on your experiences and data from crawl logs.

Use the following procedure to add a crawler impact rule.

Add a crawler impact rule

  1. In Central Administration, on the Application Management tab, in the Search section, click Manage search service.

  2. On the Manage Search Service page, in the Farm-Level Search Settings section, click Crawler impact rules.

  3. On the Crawler Impact Rules page, click Add Rule.

  4. On the Add Crawler Impact Rule page, in the Site section, in the Site box, type the site name that will be associated with this crawler impact rule.

    Note

    When typing the URL, you must exclude the protocol. For example, do not include http:// or file://.

  5. In the Request Frequency section, choose one of the following options:

    • Request up to the specified number of documents at a time and do not wait between requests. If you choose this option, use the Simultaneous requests list to specify how many documents you want the crawler to request at one time when crawling this URL. This value is the maximum number of requests that the Office SharePoint Server Search service can make at one time when crawling this URL.

    • Request one document at a time and wait the specified time between requests. When this option is selected, the Office SharePoint Server Search service makes one request at a time and then waits the specified amount of time before making the next request. In the Time to wait (in seconds) box, type the number of seconds to wait between requests. The minimum time to wait between requests is one second, and the maximum is 1,000 seconds.

  6. Click OK.

For more information about crawler impact rules, see Plan to crawl content (Office SharePoint Server).

WAN accelerators and other third-party tools

This section describes options for optimizing WAN environments with third-party solutions in the following categories:

  • WAN accelerators

  • Offloading and cache devices

  • Client solutions

  • Data replication, multi-master synchronization, and configuration management

  • Multi-farm manageability and reporting

  • Byte-level or hardware-based replication

Because each environment is different, we do not recommend specific partner solutions. Moreover, partner solutions address opportunities in different ways. Consequently, each solution has different strengths. It is important to evaluate each solution based on the specific needs of your environment and the relative strengths of the partner solution.

There are many partners who offer solutions to enhance or optimize Office SharePoint Server 2007 solutions. For an updated list of partners, see Microsoft Office System Solutions Directory (https://go.microsoft.com/fwlink/?LinkId=108591&clcid=0x409).

WAN accelerators

WAN acceleration solutions have been around for a long time. Shortest path algorithms and packet compression tools have been offered for decades. The biggest innovations in recent years target optimization of the TCP/IP stack and Server Message Block (SMB).

Most WAN accelerators work in pairs, with a device installed in the data center next to the servers running SharePoint Products and Technologies and another device in the branch office. The two devices optimize WAN traffic by using caching, compression, differencing, and proprietary methods to optimize the packets that are sent between the two devices. Whether these are inline devices or simply network equipment with upgrades for cache, the approach is similar. Different partner solutions focus on optimizing different layers within the network stack.

Two important criteria to consider when choosing a network accelerator include:

  • The security requirements of your organization. Requirements such as IPsec or HTTPS will influence the options.

  • The applications used in your organization. If you want a device that also provides optimization for Microsoft Exchange Server and file share traffic, this will influence your options as well.

Example solutions include: Cisco, Citrix, Packeteer, Riverbed, and F5.

Offloading and cache devices

While caching techniques within SharePoint can reduce unnecessary backend traffic, partners that provide offloading and cache devices can help bridge the gap, including WAN connections, between the client and servers.

If you are hosting your SharePoint site over the Internet and your goal is to optimize network traffic and reduce the number of requests that hit your servers, offloading and cache devices can play a role. There are a variety of partners that target solutions for optimizing the process of hosting content that is exposed to the Internet. Strategies employed in this space include caching and related proprietary techniques, offloaded compression with varying algorithms, warm-ups and pre-fetching, and various shopping cart techniques. Some partners excel at delivering content securely and efficiently to specific types of clients, such as public kiosks, computers in Internet cafes across the globe, or other slim devices that are not well-connected.

Also in the Internet arena are partners that provide global caching and network optimization routing techniques for reducing dropped packets. For example, some solutions optimize network traffic so that only the deltas within client requests are sent to the server. These types of solutions result in less WAN traffic and can also result in quicker page returns, because the number of round-trip communications between the client and server or between other intermediary devices is reduced.

Similar to Microsoft ISA Server, some solutions provide offloaded or delegated authentication as a gateway for accessing information. These solutions add an additional layer of security. To address multiple requirements, look for products or solutions that provide a firewall, load balancing, and some intelligence for offloading and caching. Expect to see even more consolidation of these types of features in the future.

Example solutions include: Cisco, F5, Inktomi, Microsoft ISA Server, and Microsoft Internet Application Gateway.

Client solutions

Some partners focus on optimizing the client experience, rather than on addressing the network and server infrastructure. Techniques such as pre-fetching, background synchronization, compression, ad blockers, and image filters can dramatically increase the performance of retrieving content on the Internet. This is especially true if text is the primary target and you can manage without images.

There are several client applications that allow users to synchronize with SharePoint sites automatically. After the client initially synchronizes with a site, the client application automatically caches the contents of the site on the client computer in the background or when the client is online. For example, when a user clicks on a document, the document is already locally available and the user is not affected by WAN links. Similarly, when a user adds or updates a document, the client application takes care of synchronizing the changes with the online site. These client applications typically manage any conflicts that arise and allow users to decide how to resolve conflicts.

Some clients handle this experience better than others. Some provide support for files only; others provide support for both lists and files. You likely won't find offline wikis, but you can find RSS readers for consuming most lists locally or offline. Office Outlook 2007 is an example of a client that allows you to consume SharePoint blogs or lists offline by using RSS, in addition to synchronizing with document libraries. Office Groove 2007 also provides a good offline experience and adds the capabilities of peer-to-peer collaboration and file compression across a WAN. For more information about Microsoft client solutions, see Extending Office SharePoint Server global solutions with Office Outlook 2007 and Office Groove software.

Partners in this space have focused on optimizing the user experience where WAN links affect performance or where clients are frequently offline. Caching (local copies), compression, and moving the synchronization to the background can make it seem like you're on the LAN with the server. If you choose to offer client applications to your users, be sure to provide adequate training so your users can work as efficiently as possible.

Partner solutions include: Colligo. Microsoft solutions include: Office Outlook 2007 and Office Groove 2007.

Data replication, multi-master synchronization, and configuration management

Whether it is because of slow WAN links between two offices or because of disaster recovery requirements that call for multiple masters, replication is often necessary in deployment plans. SQL Server 2005 provides log shipping and database mirroring for disaster recovery or site failover. However, when you require two separate server farms that both provide read/write access, there are partners who provide solutions.

Some partner solutions include a server cache similar to a WAN accelerator. These solutions continue to provide content from the cache at a remote site if a WAN link fails. Other partners excel at synchronizing data when sites are connected after extended periods of being disconnected. For example, a ship that arrives at a dock after being at sea can synchronize with a central site.

Some partners extend the SharePoint Products and Technologies interface to configure replication for pairs of Web applications, site collections, or lists.

Note

The publishing features of Office SharePoint Server 2007 have not yet been tested by the product team in WAN environments. The publishing features might provide some value in publishing content from a central farm to read-only environments. However, without test results, we cannot provide specific guidance for this scenario.

Partner solutions include: Syntergy, WinApp Technologies, Casahl, and Infonic.

Multi-farm manageability and reporting

In global deployments that include multiple server farms, managing settings across the farms and sites can be a challenge. There are several partners that offer tools designed for streamlining the management of configuration settings, permissions management, effective user rights, and content elements such as master pages and content types. If you decide to deploy multiple server farms into your environment, consider partner tools that can help manage multiple farms and large volumes of sites.

Some partners focus on helping you configure settings across farms and multiple environments. The SharePoint Cross Site Configurator (https://go.microsoft.com/fwlink/?LinkId=108592&clcid=0x409) is an example of a tool designed by Microsoft to configure auditing, expiration, master pages, and content types for consistency across Web applications.

Partner solutions include: Quest Software, echoTechnologies, IDevFactory, AvePoint, CorasWorks, Barracuda Tools, CommVault, and Symantec.

Byte-level or hardware-based replication

Partners that provide hardware-based or byte-level replication make it very easy to fail over and to synchronize environments between data centers. If you implement a shared disk such as a storage area network (SAN), the shared disk can become a point of failure. Hardware vendors use various methods for providing redundant channels, redundant fiber, redundant disks, and various array configurations. Different solutions provide varying levels of fault tolerance.

If you want to eliminate hardware as the potential source of failure, evaluate Microsoft Cluster Services (MSCS), which provides hardware-based fault tolerance. Software-based failover solutions such as SQL Server log shipping and database mirroring also protect against hardware failure, but failover is not automatic when they are used with SharePoint Products and Technologies.

In some cases, implementing a solution that provides replication at a lower level in the stack can address specific business needs. Byte-level replication, which creates a clone or a mirror of the primary environment, can also create a secondary environment to fail over to. Continuous byte-level replication can provide a means for either automatic or manual failover.

An important caution when evaluating these types of replication solutions is to understand that server names, Web application names, and accounts are hard coded in the configuration database. This means any service that is replicated on a different server with a different name does not work. If the server name remains the same in the replicated environment as the primary environment, these types of solutions can work. Regardless of the solution, if a tool provides replication outside of the knowledge of the application, the tool must be tested to ensure it works in a failover environment.

Partner solutions include: Neverfail and Double-take.

Solutions that are built into the hardware, such as SAN-based replication, include: HP, EMC Centera, and Hitachi Data Systems.

Designing pages for quicker downloads

In an environment with limited network capacity, streamline your pages and make them as small and responsive as possible. There are different techniques to do this, most of which are not specific to SharePoint Products and Technologies. The general methods, which can be used on any Web site, are not discussed in great detail in this section. Instead, the section focuses on understanding the features included in SharePoint Products and Technologies, what is included in the page, and ways in which you can speed up the initial visit to a SharePoint site.

Page elements

A page on a SharePoint site consists of several unique elements, as shown in the following figure.

SharePoint site page with control overlay

When the page is rendered, it brings together the master page, the layout page, and content for the page. The page content includes the values for each of the page fields, but also a number of other elements such as the theme, style sheets, images, and navigation. The following table shows an example of the files and streams that are present in a single page from a SharePoint site. This example is a capture of all the HTTP requests that were made on an initial visit to the default home page of a collaboration portal site.

URL  Size (bytes)

http://myServer/_layouts/images/topnavhover.gif  96
http://myServer/Pages/Default.aspx  1656
http://myServer/Pages/Default.aspx  1539
http://myServer/Pages/Default.aspx  66084
http://myServer/_layouts/1033/styles/controls.css?rev=EhwiQKSLiI%2F4dGDs6DyUdQ%3D%3D  1448
http://myServer/_layouts/1033/styles/HtmlEditorCustomStyles.css?rev=8SKxtNx33FmoDhbbfB27UA%3D%3D  642
http://myServer/_layouts/1033/styles/HtmlEditorTableFormats.css?rev=guYGdUBUxQit03E2jhSdvA%3D%3D  1317
http://myServer/_layouts/1033/styles/core.css?rev=5msmprmeONfN6lJ3wtbAlA%3D%3D  13596
http://myServer/_layouts/1033/init.js?rev=VhAxGc3rkK79RM90tibDzw%3D%3D  15732
http://myServer/_layouts/1033/core.js?rev=F8pbQQxa4zefcW%2BW9E5g8w%3D%3D  54367
http://myServer/_layouts/portal.js?rev=INhSs9mWTnUTqdwwIXYMaQ%3D%3D  954
http://myServer/_layouts/1033/ie55up.js?rev=Ni7%2Fj2ZV%2FzCvd09XYSSWvA%3D%3D  20508
http://myServer/_layouts/1033/search.js?rev=yqBjpvg%2Foi3KG5XVf%2FStmA%3D%3D  5092
http://myServer/_layouts/1033/EditingMenu.js?rev=eh0f0CwzvHQ7Ii0JvdsIjQ%3D%3D  2735
http://myServer/WebResource.axd?d=__WrA1TRLicJgwGEmYKqSA2&t=633214754549731034  5383
http://myServer/WebResource.axd?d=h_u9v0Coj_eDqsvEkDrdtw2&t=633214754549731034  8258
http://myServer/_layouts/images/blank.gif  43
http://myServer/_layouts/images/helpicon.gif  1025
http://myServer/_layouts/images/Menu1.gif  68
http://myServer/_layouts/images/titlegraphic.gif  1299
http://myServer/_layouts/images/gosearch.gif  19933
http://myServer/WebResource.axd?d=puevA5kEAx44yxozBd-hspPZ9eA51Rh9u95VwVGLFCc1&t=633214754549731034  224
http://myServer/WebResource.axd?d=wyTuS1folQ6wX2Tc_7NOOaeElHHqL6rtdeAeRRUR36s1&t=633214754549731034  218
http://myServer/_layouts/images/whitearrow.gif  68
http://myServer/_layouts/images/recycbin.gif  1004
http://myServer/PublishingImages/newsarticleimage.jpg  10710
http://myServer/_layouts/images/icongo01.gif  1171
http://myServer/_layouts/images/menudark.gif  68
http://myServer/_layouts/images/topnavhover.gif  96

Note the following:

  • A total of 29 requests were required to download the page.

  • The total page size was 235 kilobytes (KB).

  • This represents the initial page load; almost all of the requested items have a caching directive that instructs the browser not to load them again for one year. The second and subsequent page loads produce only three requests. Of those, two are part of the NTLM negotiation that occurs, so only one item is actually downloaded—the HTML for the page.

  • The default IIS Compression level 0 was used, which is the least amount of compression possible. Additional compression would result in even smaller download sizes.

  • Of the different file types loaded, there were:

    • 4 .axd resource requests

    • 4 .css resource requests

    • 12 image resource requests

    • 6 .js resource requests (several of which were duplicates)

    • 3 page resource requests for default.aspx (two of which are part of the NTLM negotiation)

Most of these file types are fairly obvious, with the possible exception of the .axd resource type. An .axd resource is part of a feature that is new in ASP.NET version 2.0. A developer can add a resource, such as a script file or style sheet, to a control. In the control, the developer calls the GetWebResourceUrl method of the ClientScript class. When the control is rendered at run time, this method dynamically generates a URL for the resource. The resource itself is compiled into the control assembly, so this methodology provides a way to stream that resource out of the assembly and down to the client just as if it were a separate file located on the Web server.
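
As an illustration of this pattern, the following sketch shows a control that embeds a script file and registers it on the page. The control, namespace, and resource names are hypothetical; the embedded file must be added to the project with a build action of Embedded Resource.

using System;
using System.Web.UI;

// Registers the embedded file so that ASP.NET serves it through WebResource.axd.
[assembly: WebResource("MyCompany.Controls.menu.js", "text/javascript")]

namespace MyCompany.Controls
{
    public class MenuControl : Control
    {
        protected override void OnPreRender(EventArgs e)
        {
            base.OnPreRender(e);

            // GetWebResourceUrl generates a WebResource.axd URL at run time that
            // streams menu.js out of this assembly, as if it were a separate file.
            string url = Page.ClientScript.GetWebResourceUrl(
                typeof(MenuControl), "MyCompany.Controls.menu.js");
            Page.ClientScript.RegisterClientScriptInclude("menu", url);
        }
    }
}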

Knowing the resource requests used by the page can help you understand where and how optimizations can be applied. You can measure this kind of information by using a variety of tools and techniques. For this article, a freeware tool called Fiddler was used. Fiddler runs on a client workstation and tracks all of the HTTP requests made for a page. It then displays the results in a grid, as shown in the following figure.

Fiddler results for SharePoint site

As you change your site to optimize it, test it with Fiddler. To get the most accurate idea of what items are being requested, what items are being cached, and the size of each item:

  1. Delete all of your browser's temporary files.

  2. Start Fiddler.

  3. Request your page.

    Note

    Be sure that you request the page by clicking a link. If you just click the Refresh button, the browser requests every item again, so the trace won't accurately reflect any optimization changes that have been made.

Optimizing page downloads

After you understand the composition of the page, you can use different methods to optimize the download experience for that page. In general, the goal is to minimize the number of round trips between client computers and server computers and to reduce the amount of data that goes over the network. The guidance in this article includes recommendations that can be applied broadly to a variety of different implementations of SharePoint Products and Technologies.

There is an important distinction to remember when reviewing these recommendations and any other custom optimizations that you might develop. Page optimization techniques fall into one of two categories: first page request and subsequent page request. First page request optimizations take effect the first time the page is requested, but don't necessarily affect subsequent page requests. Subsequent page request optimizations can improve the user experience whether it is the first time a user requests the page or the fiftieth time. The key is to balance the loss in functionality against the gain achieved. If the gain is realized only the first time a user hits a site, the optimization might not be worth the loss in functionality.

BLOB cache

The binary large object (BLOB) cache is discussed in more detail later in this article. In short, it can be used to apply cache directives to items on a page that are stored in SharePoint Products and Technologies. If those cache directives are included, the browser won't try to download those items again until the cached item expires. As illustrated by the previous home page example, almost all of the items included in the default home page for the Collaboration Portal site template had a caching directive associated with them, which is why they were not requested on subsequent page hits. For more details about setting up and configuring the BLOB cache, see Optimizing caching for WAN environments later in this article.

IIS compression

IIS compression is also discussed in more detail in the Optimizing caching for WAN environments section of this article. As noted previously, however, the default setting is compression level 0, the least compression possible. Experiment with the different compression levels until you find one that maximizes compression while minimizing the impact on the CPUs of the Web servers. In virtually all cases, you can use a compression level greater than 0. Remember, however, that the compression level applies only to dynamic files, for example, .axd and .aspx files.
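
For example, on IIS 6.0 (the version that Office SharePoint Server 2007 runs on), the dynamic compression level is stored in the metabase as the HcDynamicCompressionLevel property of the gzip and deflate compression schemes, and it is commonly changed with the adsutil.vbs script. The following sketch sets the same property through the ADSI provider instead; the level of 9 is illustrative only, and IIS must be restarted afterward.

using System;
using System.DirectoryServices;

class SetCompressionLevel
{
    static void Main()
    {
        // The value 9 is illustrative; test different levels to balance
        // compression against Web server CPU usage.
        foreach (string scheme in new string[] { "gzip", "deflate" })
        {
            string path = "IIS://localhost/W3SVC/Filters/Compression/" + scheme;
            using (DirectoryEntry entry = new DirectoryEntry(path))
            {
                entry.Properties["HcDynamicCompressionLevel"].Value = 9;
                entry.CommitChanges();
            }
        }
        Console.WriteLine("Run iisreset for the new compression level to take effect.");
    }
}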

64-bit hardware

Hardware choices in the farm can affect the latency of requests as well. 32-bit systems have a memory limit of 2 gigabytes (GB) of RAM per application. Although you can extend application support to 3 GB of RAM, SharePoint Products and Technologies do not support using the /3GB switch. Low memory situations can negatively affect the request latency in the following ways:

  • If the amount of memory becomes constrained, it can cause the SharePoint application pool to recycle. That forces the ASP.NET application domain to recycle as well, which can cause a long delay in responding to user requests.

  • Out-of-memory errors can cause the BLOB cache to stop serving content.

By using 64-bit hardware, you can ensure that you can allocate and use enough RAM to prevent these errors.

Web garden settings

Web garden settings can also inadvertently cause the BLOB cache to work inconsistently. Because only one worker process can acquire the lock necessary to manage the cache, successful use of the cache depends on which process services a request. If a Web garden worker process that does not hold the BLOB cache lock services a request, the content it sends in response will not have caching directives associated with it. That increases both the number of requests and the amount of data sent over the network. Therefore, if you intend to use the BLOB cache, you should not use Web gardens.
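
If you need to verify or enforce this on IIS 6.0, the following sketch sets the application pool's MaxProcesses metabase property to 1, which limits the pool to a single worker process and thereby turns off Web garden behavior. The application pool name is a placeholder.

using System;
using System.DirectoryServices;

class DisableWebGarden
{
    static void Main()
    {
        // "SharePointAppPool" is a placeholder; substitute the application pool
        // that hosts your SharePoint Web application.
        string path = "IIS://localhost/W3SVC/AppPools/SharePointAppPool";
        using (DirectoryEntry pool = new DirectoryEntry(path))
        {
            // A value of 1 means a single worker process (no Web garden).
            pool.Properties["MaxProcesses"].Value = 1;
            pool.CommitChanges();
        }
        Console.WriteLine("Recycle the application pool for the change to take effect.");
    }
}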

Minimize secured items on the page

When a user authenticates to SharePoint Products and Technologies, two things happen. First, the system validates credentials to determine who the user is. Second, the role provider enumerates the list of SharePoint groups to which the user belongs. Each time a page is requested, the role provider is called again to enumerate all of the groups to which the user belongs.

However, this call for group membership can happen multiple times on a single page. For example, the default Collaboration Portal site template page requires two calls to the role provider when you go to the home page—one for the page itself and one for the image on the page. Each image that is stored in a SharePoint library and is on the page will force an additional call to the role provider to verify permissions, even if all of the images are stored in the same image library. That verification occurs whether the images are added as fields on the page—that is, part of the page's content—or whether they are added to the master page for the site.

A site developed for a limited bandwidth or high latency environment should be designed to minimize the number of images used on the page. Many sites use several images as part of the master page to create different visual effects. Because the latency will increase with the increased number of security checks, design sites for these environments by using as few images as possible.

Minimizing the number and size of images

As described in the previous section, you should minimize the number of images in your site. To help with that effort, you can embed multiple images in a single file and then reference individual images in your page. Not only does the file download size decrease, but fewer files result in less network traffic. Authoring pages by using this technique is more complicated, but in situations where every round trip and file size counts, it can prove to be a valuable way to help improve performance. The following figure shows an example of a single image file that contains multiple images.

Six icon images in a row

The following figure shows how this image is subsequently changed to display as individual pictures in a table.

Six icon images in a table

Manipulation of the images was done entirely through style sheet classes. There were two primary classes used in div and img elements in each table cell. Those classes are as follows:

.cluster {

   height:50px;

   position:relative;

   width:50px;

}

.cluster img {

   position:absolute;

}

Each image has a class associated with it based on the identifier (ID) for the image. That style clips the picture and defines an offset from the initial picture in the cluster. Those classes are as follows:

#person {

   border:none;

   clip:rect(0, 49, 49, 0);

}

#keys {

   clip:rect(0, 99, 49, 50);

   left:-50px;

}

#people {

   clip:rect(0, 149, 49, 100);

   left:-100px;

}

#lock {

   clip:rect(0, 199, 49, 150);

   left:-150px;

}

#phone {

   clip:rect(0, 249, 49, 200);

   left:-200px;

}

#question {

   clip:rect(0, 299, 49, 250);

   left:-250px;

}

The HTML for the table includes the div and img tags with the appropriate ID values and class names, as follows:

<table border="1">

   <tr>

      <td><div class="cluster"><img id="person" src="Icons50x50.gif" width="300" height="50"/></div></td>

      <td><div class="cluster"><img id="keys" src="Icons50x50.gif" width="300" height="50"/></div></td>

      <td><div class="cluster"><img id="people" src="Icons50x50.gif" width="300" height="50"/></div></td>

   </tr>

   <tr>

      <td><div class="cluster"><img id="lock" src="Icons50x50.gif" width="300" height="50"/></div></td>

      <td><div class="cluster"><img id="phone" src="Icons50x50.gif" width="300" height="50"/></div></td>

      <td><div class="cluster"><img id="question" src="Icons50x50.gif" width="300" height="50"/></div></td>

   </tr>

</table>

Multiple Microsoft Web properties and products now use this technique, including the Microsoft Passport Network and Microsoft Office Outlook Web Access (OWA). The MSN team has conducted some performance tests to analyze the impact of the changes and has seen a load time improvement for the first page of 50 to 75 percent.

There is an important point to consider if you are authoring your pages in Microsoft Office SharePoint Designer 2007. When you create a new page in Office SharePoint Designer 2007, it automatically adds the following XHTML schema markup to the page:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

It then adds this namespace to the HTML element:

<html xmlns="http://www.w3.org/1999/xhtml">

If you use this schema, the images will not line up correctly. You must remove the xmlns attribute from the HTML tag in order to make the images appear as intended.

Delay downloading core.js

The main client script file included with Office SharePoint Server 2007 is called core.js. It is the largest script file, at approximately 257 KB uncompressed and approximately 54 KB compressed. In certain situations, you may be able to delay the downloading of core.js. When you do, the page renders more quickly because the core.js file does not start downloading until the page has been opened in the browser. That enables the user to view the page and start reading the content sooner. This technique is most useful in an Internet scenario with anonymous users. Conversely, authenticated users need core.js to be downloaded at page load time for functionality and UI elements such as the Site Actions menu.

You can use the following steps to implement this technique.

  1. Create a custom layout page that doesn't use core.js. This layout page will be used with the home page so that initial visitors to the site do not have to wait for core.js to download. The page needs to be compatible with the existing home page, so the new layout page should have the same associated content type as the layout page currently in use.

    Note

    By default, in a Collaboration Portal site, the Welcome Page content type is specified.

  2. Add the following tag to the page: <SharePointWebControls:ScriptLink runat="server"/>. This instructs the system to not use core.js unless it is referenced.

  3. Create a custom control to check for authenticated users. The control will be very simple, and essentially contains the following code (shown in C#):

    // Register core.js only when the request comes from an authenticated user.
    if (HttpContext.Current.Request.IsAuthenticated)
    {
        Microsoft.SharePoint.WebControls.ScriptLink.RegisterCore(this.Page, true);
    }
    
  4. On each Web server, put the control in the Global Assembly Cache (GAC), and then add a SafeControl entry for it in the Web.config file for the Web application in which it will be used.

  5. Add the custom control to the custom layout page created in step 1.

  6. Add an IFRAME to the layout page. It should reference a page that has the following content:

    <body>
    <SharePoint:ScriptLink name="core.js" runat="server" />
    <script language="javascript">
     DisableRefreshOnFocus();
    </script>
    </body>
    
  7. Check in the custom layout page and publish it.

  8. Base the home page for the site on the new layout page.

After you perform the preceding steps, test the home page for the site to verify that it works. A first-time anonymous user should not see a reference to core.js in the page source, but the file should still end up in the browser cache because the IFRAME downloads it after the page loads.

Additionally, consider the following before employing this technique:

  • The site master page and the system master page must be different; otherwise, pages in the _layouts directory will not work correctly.

  • Ensure that the master page doesn't contain any controls that require core.js at load time to work.

  • Ensure that the master page doesn't have any ScriptLink controls that load or reference core.js.

For additional details and sample code, see the Microsoft Enterprise Content Management (ECM) Team Blog (https://go.microsoft.com/fwlink/?LinkId=106008&clcid=0x409).

Optimizing list view pages

The Microsoft Services team has worked to quantify and improve the performance of list view page rendering times. A list view page is the AllItems.aspx page that each list and library uses to enable browsing of content. The rendering time of that page can vary widely based on how many columns are visible in the view and what format they use. For example, display options and enabled presence icons can greatly affect rendering time. Group Collapsed took significantly longer to render than Group Expanded, and both of those were slower than no grouping at all.

These sorts of nuances are why it's important to carefully consider how views are constructed in list view pages, especially over slow network links. When working with lists that contain a lot of data, it's important to carefully tailor all views, especially the default view. In general, you can speed up the rendering time of list view pages by:

  • Showing fewer columns.

  • Excluding any columns that include presence information.

  • Using a link (but no edit menu) to view the item details.

The following customizations reduce the time required for a view to render.


View type

Create a view as a datasheet view instead of a standard view.

View: Item Limit

Anything over 1,000 items will likely render slowly. Over a slow connection, it's important to experiment to find the right balance between the quantity of data shown at a time and the number of round trips necessary to view all the data. The more rows shown at a time, the fewer round trips are required, but the larger each page becomes.

View: Filter

Use [Today] and [Me] to filter items by freshness or assignment. Use Status fields to show only active items in default views.

View: Columns

Use as few columns as necessary. Create a default view with few columns that allows high-level browsing after which people can drill down.

The following customizations increase the time required for a view to render. Each additional column increases rendering time by a slight amount: up to a half-second per column over a fast network connection for a list of 1,000 items. Some columns cost more, as noted below.


Group By

Grouping adds HTML and JScript, slowing down rendering for large lists. Making all groups collapsed by default actually increases rendering time further because of additional operations on the browser object model.

Column – title linked to item with edit menu

The option "linked to item with edit menu" takes the longest; the similar option "linked to item" does not increase rendering time noticeably.

Column – Lookup User with Presence

Adding a user lookup field with presence adds HTML for each link to open the presence menu. In addition it also uses bandwidth to determine presence availability.

The following list describes customizations that have a relatively small impact on the time required for a view to render.

  • Sort, Totals – Applying multiple sorts and totals increases rendering time by a noticeable amount, but each one typically costs less than a second for a list of 1,000 items.

Optimizing caching for WAN environments

Office SharePoint Server 2007 provides a number of caching options. Several of them are aimed at improving throughput in the rendering pipeline, from the time a request is received at the server until a response begins to stream back to the client computer. Although this is an important aspect of your overall site performance, this section focuses on caching as it relates to the following:

  • The role of server configuration on client caching.

  • Controlling the size of items that are transmitted over the network from the server to the client.

BLOB cache

The BLOB cache is a mechanism that is only available with Office SharePoint Server 2007 publishing features. This makes it an ideal candidate for corporate portal sites that are based on the Collaboration Portal site template and Internet-facing sites that are based on the Publishing Portal site template. The BLOB cache enables you to configure caching directives that are associated with items served from publishing site lists, for example, the Pages library and Site Collection Images. When the browser on the client computer encounters a caching directive, it detects that the item it is retrieving can be saved locally and does not need to be requested again until the caching directive expires. In a geographically distributed environment, this is critically important because it reduces the number of items requested and sent over the network.

When the BLOB cache in Office SharePoint Server 2007 is turned on, a couple of different things happen. First, each time a cacheable item is requested, Office SharePoint Server 2007 searches the hard disk drive of the Web server that received the request to see if a copy exists locally. If it does, the file is streamed directly from the local disk to the user. If it isn't on the local disk yet, a copy of the item is made from the SQL database where it is stored, and then the item is sent to the user making the request. From that point forward, all requests for the item can be served directly from the Web server until the item's cacheability value indicates that it has expired. That results in better performance in the server farm by reducing contention on the database server.

The other thing it does is append a cacheability header to the item when the item is sent to the client. This header instructs the browser how long the item should be cached. For example, if a picture had a cacheability value of three days, the browser uses the copy of the image it has in its local cache if the picture is requested again within the next three days; it does not request it from the server again.

When testing your site to see what items are cached and how items are being cached, you can use Fiddler (http://www.fiddlertool.com). The following screenshot shows a Fiddler capture on a simple SharePoint site that is used for publishing. The site was created by using the default Collaboration Portal site template. Some additional text content was added to the page, and several images were added to the master page.

Fiddler tool results

The Fiddler capture contains several important pieces of information.

  • The # column on the left indicates that there were a total of 44 HTTP requests made from the browser to the server to render the page.

  • The Result column shows the HTTP result code that was returned from the request for the item; a 200 result means the item was successfully retrieved.

  • The URL column indicates what specific item was being requested.

  • The Body column indicates the size of each item.

  • The Caching column shows the caching directive that was associated with each item. The data in the Caching column shows that several items have a caching directive associated with them; that is, they have a max-age attribute that is greater than 0. Caching directives are expressed in seconds. This means that for the page illustrated, several items are configured to be cached for 365 days: 60 seconds × 60 minutes × 24 hours = 86,400 seconds per day, and 86,400 × 365 = 31,536,000 seconds.

Notice that the items with that cache directive all reside in the _layouts directory. They have that cache setting because of the way the _layouts/images virtual directory is configured in IIS. When you create a new Web application, Office SharePoint Server 2007 automatically creates several virtual directories that map to folders on the Web server's physical disks. When it creates the _layouts/images virtual directory, it adds a caching directive that applies to the entire directory. The following screenshot shows the configuration for the directory in the IIS Manager snap-in.

IIS Manager properties for image folder
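If you prefer to verify this setting from a command prompt rather than from the IIS Manager UI, the relevant IIS 6.0 metabase property is HttpExpires. The following query is a sketch, not a verbatim recipe: the Web site identifier 1 is an assumption, so substitute the identifier of the Web site that hosts your Web application.

  • CSCRIPT.EXE ADSUTIL.VBS GET W3SVC/1/ROOT/_layouts/images/HttpExpires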

Because those items all have a non-zero caching directive associated with them, the next time the page is requested, the browser uses the copy of the item from its local browser cache instead of requesting it from the server again. The following screenshot shows a snapshot of Fiddler when the page is requested a second time.

Fiddler tool results

As the Fiddler data shows, the number of items requested has dropped significantly, from 44 to 11. An important point to note is that the number of requests made can vary depending on how the page is requested. If you use the Refresh button in the browser, all of the items will likely be requested again, whether a locally cached version of the item exists or not. Conversely, if the page is requested by navigating to it through a link or shortcut, only the uncached items are requested.

The other thing shown in the Fiddler data is that the browser is making requests to the server for the other master page images that it already has in its local cache; the 304 response code indicates this. A 304 response means that the browser made a conditional request and that the version on the server has not been modified from the version on the client, so the item does not need to be downloaded again. Even though the file is not downloaded across the network, the conditional request still generates a round trip to the server just to determine that the local copy is current. In a geographically dispersed environment, server round trips are costly, so the goal is to reduce them as much as possible. If a non-zero caching directive is added to each of the remaining items (other than the page itself, which is always returned by the server), this goal can be achieved. The BLOB cache feature is what adds this caching directive.
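To make this concrete, a conditional request and its 304 response look roughly like the following. The exchange is illustrative: the image path and date are placeholders, and most headers are omitted.

    GET /SiteCollectionImages/logo.gif HTTP/1.1
    If-Modified-Since: Fri, 04 Jan 2008 10:00:00 GMT

    HTTP/1.1 304 Not Modified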

You configure the BLOB cache by using the Web.config file for the Web application in which the cache will be used. Open the Web.config file in a text editor such as Notepad and search for the BlobCache entry. By default it will be:

<BlobCache location="C:\blobcache" path="\.(gif|jpg|png|css|js)$" maxSize="10" enabled="false"/>

The attributes used in the BlobCache element have the following meanings:

  • location   The location on the Web server hard disk drive where cached items will be stored.

  • path   A regular expression that describes the types of files to be cached.

  • maxSize   The maximum size, in gigabytes (GB), that the cache is allowed to use.

  • enabled   Set to true to enable the BLOB cache.

The following additional attribute—not included by default—is necessary to set a caching expiration value on individual items:

  • max-age   The amount of time in seconds that items should be cached on the client computer.

By setting the max-age attribute to a non-zero value, cacheable items have a cache expiration value associated with them, so that a browser no longer needs to download the item, or even verify that it has the most current version, until that value expires. For example, assume you want to enable caching and allocate up to 100 GB on the Web server to store items (maxSize is expressed in GB), items should expire once a day, and in addition to the predefined types, .htc files should be cached as well. To support those requirements, specify the following BlobCache attributes:

<BlobCache location="C:\blobcache" path="\.(gif|jpg|png|css|js|htc)$" maxSize="100" max-age="86400" enabled="true"/>
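With this configuration in place, responses for items that match the path expression should carry a caching directive similar to the following. This is an illustrative response fragment under the configuration above, not a verbatim capture.

    HTTP/1.1 200 OK
    Cache-Control: public, max-age=86400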

Note that this change to the Web.config file needs to be made on every Web server in the farm. In most cases, the BLOB cache will begin working immediately, but it's safest to use the iisreset command when you implement the changes. The following screenshot shows Fiddler data for the same page request shown previously, only with the BLOB cache enabled as described.

Fiddler tool results

Notice that all of the items in the /SiteCollectionImages library now have an HTTP status code of 200, indicating that they were successfully downloaded. In addition, they all now have a caching directive specifying that they will not expire for one day (86,400 seconds). If the page is requested again, Fiddler shows that none of the images are requested again; the total number of requests to service the page has thus gone from 44 down to three, and two of those are just from the NTLM authentication negotiation that occurs between the Web server and the client application. The following figure shows Fiddler data when the page is requested again.

Fiddler tool results

Additionally, consider the following when working with the BLOB cache:

  • It requires some additional effort to configure because you have to update the Web.config file on each Web server. However, the benefits are worth the effort.

  • Survey your site contents and determine if there are any other file types that should also be served from cache. A good example is .htc files. Because .htc files are used in most publishing sites, you should add that file type to the list of file types being cached.

  • The BLOB cache only works on items stored in SharePoint libraries; it cannot be used to cache content from other sources.

  • Some lists are not accessible to anonymous users by default. If anonymous users access the site, permissions need to be manually configured for the following lists in order to have items within them cached:

    • Master Page Gallery

    • Style Library

There are two other configuration options to be aware of when working with the BLOB cache. The first has to do with clearing out the BLOB cache. If the cache needs to be cleared for a particular site, navigate to that site collection, click Site Actions, point to Site Settings, and then click Modify All Site Settings. In the Site Collection Administration list, click the Site collection object cache link. In the Disk Based Cache Reset section, select the Force this server to reset its disk-based cache check box, and then click OK.

If you are considering using Web gardens in a SharePoint farm, also be aware that doing so will result in the BLOB cache operating in a manner that seems inconsistent. When a Web garden is configured, multiple worker processes serve the same Web application, but only one of them can acquire the lock necessary to manage the BLOB cache. As a result, the BLOB cache may appear to work only intermittently: whether a given request benefits from the cache depends on which worker process services it.

IIS compression

Unlike previous versions of SharePoint Products and Technologies, IIS compression is now automatically turned on when you install SharePoint Products and Technologies. After a site has been hit by a few users, you can verify that compression is working by viewing the %WINDIR%\IIS Temporary Compressed Files directory on a Web server. It should contain multiple files, which indicates that static files have been requested and IIS has compressed a copy of them and stored them on the local drive. When that file is requested again, whether it's the same user or not, the compressed version of the file is served directly from this folder. Dynamic files can be compressed as well, but they are always compressed on the fly; copies are not kept on the local Web server.

Compression can result in significant bandwidth savings. For example, the core.js file is included on every SharePoint page. Uncompressed, it is 257 KB; compressed, without any additional tuning of IIS compression, it is only 54 KB. Core.js should be cached after the user first visits the site, but this example illustrates how significantly compression can help in low-bandwidth scenarios.

When SharePoint Products and Technologies are installed, setup configures IIS to compress the static file types .htm, .html, and .txt, and the dynamic file types .asp and .exe. After surveying the file types that are widely used in your implementation, you may find it advantageous to compress additional types. For example, it probably makes sense to also compress the static file types .css and .js; it may also make sense to compress the dynamic file types .axd and .aspx.

To add a static or dynamic file type to the list of types that will be compressed, use the adsutil.vbs script, which by default is in the %SystemDrive%\Inetpub\AdminScripts folder on each Web server. The following example, for Microsoft Windows Server 2003, includes .css and .js files in the list of compressed static file types:

  • CSCRIPT.EXE ADSUTIL.VBS SET W3Svc/Filters/Compression/DEFLATE/HcFileExtensions "htm" "html" "txt" "css" "js"

  • CSCRIPT.EXE ADSUTIL.VBS SET W3Svc/Filters/Compression/GZIP/HcFileExtensions "htm" "html" "txt" "css" "js"

The following example shows the inclusion of .aspx and .axd files in the list of dynamic file types:

  • CSCRIPT.EXE ADSUTIL.VBS SET W3Svc/Filters/Compression/DEFLATE/HcScriptFileExtensions "asp" "exe" "axd" "aspx"

  • CSCRIPT.EXE ADSUTIL.VBS SET W3Svc/Filters/Compression/GZIP/HcScriptFileExtensions "asp" "exe" "axd" "aspx"
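Note that adsutil.vbs SET replaces the existing value list rather than appending to it, which is why the default extensions are repeated in the commands above. Also, metabase changes such as these are generally not picked up until IIS restarts, so after running the commands on a Web server, restart IIS:

  • IISRESET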

When changing the file types that are compressed, ensure that you only include files that are well suited to compression. For example, .jpg files are not good candidates because the file format is inherently compressed already. Likewise, 2007 Microsoft Office system file types such as .docx, .xlsx, and .pptx are not good choices: these files are not served directly from the server (they are routed through the ISAPI filters that manage the rich, integrated user experience for Microsoft Office content), and in the 2007 Microsoft Office system these formats are inherently compressed.

In addition to controlling which file types are compressed, you can also control the level of compression used on dynamic file types. The amount of compression is set on a scale of 0 to 10: the higher the compression level, the smaller the files, but the more CPU the Web server spends compressing them. When setup enables IIS compression, it configures a compression level of 0. Historically, the ideal compression level for SharePoint Products and Technologies has been 9. To change the compression level, use the adsutil.vbs script described earlier in this article. The following example changes the compression level to 9:

  • CSCRIPT.EXE ADSUTIL.VBS SET W3Svc/Filters/Compression/GZIP/HcDynamicCompressionLevel "9"

  • CSCRIPT.EXE ADSUTIL.VBS SET W3Svc/Filters/Compression/DEFLATE/HcDynamicCompressionLevel "9"

    Note

    This setting does not change the compression of static files.

For more information, see Using HTTP Compression (IIS 6.0) (https://go.microsoft.com/fwlink/?LinkId=108705&clcid=0x409).

Combining the BLOB cache with IIS compression

By enabling the BLOB cache, you can significantly reduce the number of server round trips and the number of files that are sent across the network when users browse a SharePoint site. When IIS compression is used in conjunction with the BLOB cache, it also reduces the size of the files that are sent to clients. Combining the two can greatly reduce the amount of network traffic and the contention for network and server resources.

Download this book

This topic is included in the following downloadable book for easier reading and printing:

See the full list of available books at Downloadable content for Office SharePoint Server 2007.

See Also

Concepts

Optimizing custom Web parts for the WAN
Extending Office SharePoint Server global solutions with Office Outlook 2007 and Office Groove software
Supported global solutions for Office SharePoint Server