Public Folder Replication

 

The root of a public folder tree, as viewed in Exchange System Manager, is referred to as the top level hierarchy. In Exchange Server 2003, there can be several top level hierarchies. The public folder top level hierarchy (also referred to as the MAPI top level hierarchy) is just one of many public folder trees. The MAPI top level hierarchy in Exchange Server 2003 performs the same tasks that it performed in previous versions of Exchange and also replicates an Exchange 2000 or Exchange 5.5 public folder tree. Exchange Server 2003 also supports additional trees, commonly referred to as application top level hierarchies. Each top level hierarchy has a directory entry. The entry contains a back link to the distinguished names of all the public stores in the top level hierarchy. The MAPI top level hierarchy is secured in the directory under the First Administrative Group in the Exchange organization.

A single server can host only one public folder store database in a top level hierarchy. For active/active clusters, this means that only a single instance of a top level hierarchy database can exist across both Exchange virtual servers (EVSs) due to the possibility of both EVSs running on the same physical node. Exchange Server 2003 Service Pack 1 now supports hosting more than one instance of a public folder tree in an active/passive cluster, because a single physical node cannot host more than one EVS.

Public Folder Database Trees

The public folder database is divided into two trees: the IPM_Subtree and the non-IPM_Subtree. The IPM_Subtree contains folders visible to users and clients. For example, a folder created by Microsoft Outlook exists in the IPM_Subtree. A folder in the IPM_Subtree can be searched, accessed directly by users, and used to store user data. The non-IPM_Subtree contains folders not directly accessible by users. The non-IPM_Subtree folders replicate in exactly the same way as the IPM_Subtree folders, but cannot be manipulated directly by users.

The non-IPM_Subtree includes the following folders:

  • Site Folders   These are folders, such as SCHEDULE+ FREE BUSY, Events registry, MAPI Forms, and Offline Address List.

  • Restrictions   These folders are not replicated.

  • Views   These folders are not replicated.

Site folders are visible when viewing system folders in Exchange System Manager. Site folders replicate in the same way as ordinary folders, and their replication lists can be modified in the same way as ordinary public folders. The first server running Exchange Server 2003 in an administrative group holds copies of offline address lists, free/busy data, and replicas of other site folders. The location of these folders (that is, the public folder store that hosts these folders) can be changed through Exchange System Manager. Each administrative group has a site folder server (the first server in the site), which is stored as an attribute of the administrative group's directory entry. This determines which server is responsible for making sure that site folders exist.

Replication Overview

Public folder replication is the transfer of public folder data between public folder stores in the same top level hierarchy, using an e-mail-based replication engine. The process is the same for MAPI top level hierarchies and application top level hierarchies. The folder hierarchy is replicated through hierarchy replication messages and the content of folders is replicated through content replication messages between replicas of individual folders. In addition, there are backfill requests and response messages, and status messages and status request messages, which keep replication between stores synchronized.

Note

Internally the store addresses folders by a Folder ID (FID), which is a hex ID; for example, 1-2A45. An FID is a row in the folders table in the public folder store. Similarly, messages are referenced by a Message ID (MID), which is a row in the MsgFolder table. Hierarchy replication messages, for example, are a special type of content replication message for a folder with the ID 1-1.

Replication uses standard transports to send replication messages to other public folder stores. If an update must go to multiple public folder stores, a single replication message is generated and addressed to all of them (in other words, to the replica list for the folder; for the hierarchy, the replica list is all the public folder stores in the top level hierarchy). The SMTP transport engine must categorize and bifurcate the message to deliver it to each individual public folder store. For more information about message categorization and bifurcation, see SMTP Transport Architecture.

Public folder replication is e-mail based. Replication messages are e-mail messages sent between the public folder stores in each top level hierarchy. Therefore, there must be an e-mail path between the public folder stores to enable replication, and each public folder store requires an e-mail address (added by the Recipient Update Service).

Packing and Unpacking

The process of putting replication data in the replication message ready to be sent is named packing. The process of retrieving replication data from the replication message is named unpacking. Multiple hierarchy updates or multiple content updates for the same folder can be packed in a single replication message. This reduces mail traffic, because a single message can contain multiple updates (in other words, there is less overhead of message envelope and message headers). Hierarchy updates cannot be packed in the same replication message as content updates, and content updates for different folders cannot be packed in the same replication message.
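The packing rules above can be sketched in Python. This is a minimal illustration, not the Exchange store's implementation: the `Update` type and the `pack` function are hypothetical names, change numbers are simplified to plain integers, and the hierarchy is keyed by the special folder ID 1-1 described elsewhere in this document.

```python
from collections import defaultdict
from dataclasses import dataclass

HIERARCHY_FID = "1-1"  # special folder ID that represents the hierarchy


@dataclass(frozen=True)
class Update:
    kind: str           # "hierarchy" or "content"
    folder_id: str      # folder that the update applies to
    change_number: int  # simplified to an integer for this sketch


def pack(updates):
    """Group pending updates into replication messages (hypothetical helper).

    Hierarchy updates share one message. Content updates share a message
    only with other content updates for the same folder.
    """
    messages = defaultdict(list)
    for update in updates:
        if update.kind == "hierarchy":
            key = ("hierarchy", HIERARCHY_FID)
        else:
            key = ("content", update.folder_id)
        messages[key].append(update.change_number)
    return dict(messages)


pending = [
    Update("hierarchy", "1-2A45", 10),
    Update("content", "1-2A45", 11),
    Update("content", "1-2A45", 12),
    Update("content", "1-3B00", 13),
    Update("hierarchy", "1-3B00", 14),
]
packed = pack(pending)
```

In this example, five updates are packed into three messages: one hierarchy message and one content message for each of the two folders.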

Change Numbers

All updates (create, delete, and modify) are assigned change numbers, which the replication engine uses to track updates. When updates are replicated to another server, the change numbers for the specific changes are included with the update. The receiving server uses these change numbers to determine whether each change is new. The replication message also carries a copy of the complete set of change numbers that exist in the folder on the sending server, so that the receiving server can determine whether it is missing any data. A set of change numbers is referred to as a change number set (CNSet).
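The way a receiving server might use change numbers can be sketched in Python. This is an illustration only: the function name `classify_update` is hypothetical, and change numbers are modeled as plain integers in a Python set, which is an assumption made for readability rather than the store's actual representation.

```python
def classify_update(local_cnset, incoming_changes, sender_cnset):
    """Return (new_changes, still_missing) for one received replication message.

    local_cnset      -- change numbers already present locally
    incoming_changes -- change numbers of the updates carried in this message
    sender_cnset     -- complete CNSet of the folder on the sending server
    """
    # Changes already known locally are not new and can be ignored.
    new_changes = set(incoming_changes) - set(local_cnset)
    known_after_apply = set(local_cnset) | new_changes
    # Anything the sender has that is still unknown locally is missing data.
    still_missing = set(sender_cnset) - known_after_apply
    return new_changes, still_missing


new, missing = classify_update(
    local_cnset={1, 2, 3},
    incoming_changes={3, 6},
    sender_cnset={1, 2, 3, 4, 5, 6},
)
```

Here change 3 is already known, change 6 is new, and changes 4 and 5 are identified as missing because they appear only in the sender's CNSet.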

Replication Message Types

There are six replication message types. The six types are hierarchy replication messages (the content replication of FID 1-1), content replication messages (replicating content between individual folder replicas), backfill request messages, backfill response messages, status messages, and status request messages. Each of these message types is described in the following table.

Replication message types

Hierarchy replication messages (0x2)

Description: A hierarchy replication message is a replication message between replicas of a special folder with the ID 1-1 (FID 1-1).

When used: Replicates hierarchy changes from one public folder store to all other public folder stores in the same top level hierarchy.

Content replication messages (0x4)

Description: Content replication messages replicate content updates between replicas of individual folders. A public folder store only sends a content replication message to another public folder store that holds a replica of the folder.

When used: Replicates content changes from one replica to all other content replicas of that folder.

Backfill request messages (0x8)

Description: Backfilling is the process by which public folder stores that miss replication updates can request a re-send of missing data. There are two parts to the backfill process: backfill request and backfill response. A public folder store determines that it is not synchronized by detecting a discrepancy between a folder's CNSet and the CNSet in a recently received replication message, which can be either a regular replication message or a status message sent by another public folder store. It then issues a backfill request.

When used: Backfill request messages request missing data (in CNSets) from another public folder store (both hierarchy and content). Backfill request messages are sent only to other content replicas of the folder (all top level hierarchy members, if this is for the hierarchy).

Backfill response messages (0x80000002 or 0x80000004)

Description: Backfill responses are structurally identical to their regular counterparts, but are sent in response to a backfill request and are addressed only to the requestor. They contain the changes specifically requested. Multiple responses might be sent for a single request, if all requested data is too large for a single response. Also, a response might contain no data at all.

When used: Backfill response messages send missing data to a public folder store that requested missed updates (CNSets).

Status messages (0x10)

Description: A status message is sent in response to a status request. It contains the complete CNSet of owned changes on this server. This set does not necessarily represent all the changes that actually occurred, because all changes might not have replicated to this specific public folder store.

Prior to Exchange 2000 Server, status messages for all folders in the public folder store were broadcast every 24 hours. This resulted in network overload. Therefore, this periodic broadcast was removed in Exchange 2000 Server.

When used: Sends the current CNSets of a folder to another replica of that folder. Used for hierarchy (replicas of folder 1-1) and content (specific content replicas).

Status request messages (0x20)

Description: A status request message sends the folder's current CNSet to all other replicas. It simultaneously requests that some subset of those replicas return their own CNSet. This response comes as a status message. The requested server does not respond if the requesting server's CNSet is not a strict subset of the requested server's set.

When used: The Exchange store sends a status request message in the following situations:

  • An existing replica of a public folder might have missed replication messages or might have been restored from an outdated backup. A status request message is sent by one public folder store to another public folder store to determine if any changes are missing locally.

  • A new replica of a public folder was added to a public folder store. A new replica of a folder generates a status request for the content.

  • A new public folder store was created and associated with a particular public folder hierarchy. A new public folder store generates a status request for the hierarchy, because a new hierarchy folder (FID 1-1) was created.

  • An existing replica of a public folder was removed from a public folder store. A removed public folder also generates a status request because the content of the hierarchy folder (FID 1-1) was changed.
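The type codes listed for the replication message types can be captured in a small Python sketch. The codes themselves come from the table; the reading that a backfill response is the regular hierarchy or content code with the high bit set (0x80000002 = 0x80000000 | 0x2) is an inference from those values, not something this document states, so treat `BACKFILL_RESPONSE_FLAG` and the `describe` helper as assumptions.

```python
from enum import IntEnum

# Assumption: the high bit marks a backfill response. The table lists only
# the combined values 0x80000002 and 0x80000004.
BACKFILL_RESPONSE_FLAG = 0x80000000


class ReplicationMessageType(IntEnum):
    HIERARCHY = 0x2
    CONTENT = 0x4
    BACKFILL_REQUEST = 0x8
    STATUS = 0x10
    STATUS_REQUEST = 0x20


def describe(type_code):
    """Return a readable label for a replication message type code."""
    base = ReplicationMessageType(type_code & ~BACKFILL_RESPONSE_FLAG)
    label = base.name.lower().replace("_", " ")
    if type_code & BACKFILL_RESPONSE_FLAG:
        # Only hierarchy and content messages have backfill response forms.
        if base not in (ReplicationMessageType.HIERARCHY,
                        ReplicationMessageType.CONTENT):
            raise ValueError(f"unexpected backfill response code: {type_code:#x}")
        return f"backfill response ({label})"
    return label
```

For example, `describe(0x80000004)` labels the code as a backfill response that carries content, and `describe(0x20)` labels it as a status request.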

Replication Process

Public folder stores send replication messages to each other through e-mail. Therefore, there must be an e-mail path between the public folder stores for replication messages to be received. A thread runs continually in the Store.exe process, which polls for replication events. Replication events occur at specific time intervals. When this timed event occurs, the replication thread generates a new thread, which performs the specified replication task. The following are the default replication time intervals:

  • Hierarchy replication events occur every five minutes.

  • Content replication events occur every 15 minutes.

  • A status message is broadcast 24 hours after the last regular replication broadcast.

Hierarchy Replication

A hierarchy replication message is generated whenever the hierarchy is modified. The following are examples of hierarchy modification:

  • Creating, deleting, or renaming a folder

  • Modifying folder permissions or descriptions

  • Changing the replication schedule and priority settings

  • Adding content to or removing content from a folder

  • Modifying replica lists

  • Moving the folder in the hierarchy

The following figure illustrates the hierarchy replication process.

Hierarchy replication process

In this illustration, Folder 1, Folder 2, and Folder 3 are added to Server A. Server A then replicates the hierarchy changes to Server B, so that Server B knows about these public folders in the hierarchy. Users on Server B can now navigate through the hierarchy and select any one of these folders, but only Server A has the contents of the public folders. When a client attempts to access Folder 1, Folder 2, or Folder 3, Server B redirects the client to Server A. Server A returns the content to the client, and the client can display the content. The redirection process is transparent to the user.

Content Replication

Folder content replicates between individual replicas of folders. When folder content is modified, the change is tracked with change numbers. When the replication interval is reached, the changes are replicated to all other public folder stores that have a replica of the folder.

The following figure illustrates the content replication process.

Content replication process

In this illustration, Item 1 is posted to a folder on Server A, which has a replica on Server B. The public folder store on Server A replicates the change to the public folder store on Server B. Similarly, Items 2 and 3 are posted and replicated.

Backfilling

Folders are kept synchronized through the backfill process. Folders backfill only when they are missing content. Therefore, before a public folder store can issue a backfill request for a folder, it must first determine that it missed an update. It does so by detecting a gap in the sequence of change numbers in that folder's CNSet.

Content backfill and hierarchy backfill both work in the same way. A hierarchy backfill is issued when there is a gap in the CNSets for folder 1-1. A content backfill is requested for gaps in any other folder.
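Gap detection can be illustrated with a short Python sketch. It assumes, for readability, that change numbers are plain integers and that a gap is any run of change numbers that a remote replica reports but the local replica lacks. The function names are hypothetical.

```python
HIERARCHY_FID = "1-1"  # gaps in this folder's CNSet lead to a hierarchy backfill


def missing_ranges(local_cnset, remote_cnset):
    """Collapse change numbers known remotely but not locally into ranges."""
    missing = sorted(set(remote_cnset) - set(local_cnset))
    ranges = []
    for change_number in missing:
        if ranges and change_number == ranges[-1][1] + 1:
            # Extend the current contiguous range.
            ranges[-1] = (ranges[-1][0], change_number)
        else:
            ranges.append((change_number, change_number))
    return ranges


def backfill_kind(folder_id):
    """Hierarchy backfill for folder 1-1, content backfill for any other folder."""
    return "hierarchy" if folder_id == HIERARCHY_FID else "content"


gaps = missing_ranges(local_cnset={1, 2, 6}, remote_cnset={1, 2, 3, 4, 5, 6, 8})
```

In this example the local replica is missing changes 3 through 5 and change 8, so two gaps are found.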

The backfill process can take a long time, especially if a public folder store was down and missed both the original replication update and the subsequent status message. In that case, the store might not realize that it is missing content until further replication messages arrive.

Backfill Array

The backfill array is used to store pending backfill requests. When the public folder store determines that a folder is out of sync, it writes an entry to the backfill array. This entry is a pending request for the missing data from another replica of the folder. The entry stays in the backfill array until it times out, at which point a backfill request is issued. The default backfill timeouts are listed in the following table.

Default backfill timeouts

| Backfills | Intra Site | Inter Site |
|---|---|---|
| Initial backfill | 6 hours | 12 hours |
| First backfill retry | 12 hours | 24 hours |
| Subsequent backfill retries | 24 hours | 48 hours |

If the first backfill attempt is unanswered, subsequent backfill attempts wait longer before they are sent. These times are extended to prevent unnecessary backfilling. The replication message might be in transit, delayed, or waiting for a connector's schedule. If the backfill timeout is too short, public folder stores start issuing backfill requests for messages already in transit.
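The default backfill timeouts can be expressed as a small lookup in Python. The hour values are taken from the table of default backfill timeouts; the function name `backfill_delay` and the attempt numbering (0 for the initial request) are hypothetical choices made for this sketch.

```python
from datetime import timedelta

# Default backfill timeouts in hours: (initial, first retry, subsequent retries)
BACKFILL_TIMEOUT_HOURS = {
    "intra_site": (6, 12, 24),
    "inter_site": (12, 24, 48),
}


def backfill_delay(attempt, same_site):
    """Return the wait before backfill request number `attempt` (0 = initial)."""
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    hours = BACKFILL_TIMEOUT_HOURS["intra_site" if same_site else "inter_site"]
    # Every attempt after the first retry uses the last value.
    return timedelta(hours=hours[min(attempt, 2)])
```

For example, `backfill_delay(0, same_site=True)` gives 6 hours, and any retry after the first one between sites gives 48 hours.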

Replication Status

Two kinds of messages carry replication status: status request messages and status messages. A status request message is sent from one public folder store to another to request the other public folder store's current state of a particular folder. A status message is sent from one public folder store to another to indicate the current state of a particular folder on the sending server. If the status message indicates that the sending public folder store has more up-to-date information about the folder, the receiving store writes an entry to its backfill array to request a backfill. If the CNSets are equal (or the receiving server is more recent), no action is taken.

A public folder store generates a status message under the following two circumstances:

  • In response to a status request   If a public folder store receives a status request from another public folder store, it returns a status message. This occurs when the replica list of folders is changed (for example, when a folder is removed from a server).

  • 24 hours after the last local change to a folder   This is a significant change from previous versions of Exchange. When a change is made to a folder, an expiry time for a status message is set on that folder. If another change is made to that folder, the expiry time is reset to 24 hours.

After the expiry time is reached, a status message is generated for that folder. After this occurs, the expiry time is cleared and no other status messages are generated for that folder unless another change is made, which resets the expiry time to 24 hours.

Replication Status Thread

The thread that determines whether a status message should be sent runs by default at 12:15 A.M. and 12:15 P.M., Greenwich Mean Time. When it runs, it checks whether the status expiry time has been reached for any folder and, if so, broadcasts a status message for that folder. Because the 24-hour expiry can lapse just after one run and the next run is 12 hours later, up to 36 hours can pass after the last change before a status message is generated.
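The 36-hour worst case can be checked with a short Python sketch. It assumes that the status thread sends a status message on its first run at or after the folder's 24-hour expiry time; the function name and the sample date are hypothetical.

```python
from datetime import datetime, timedelta, timezone

STATUS_EXPIRY = timedelta(hours=24)
THREAD_RUN_TIMES = ((0, 15), (12, 15))  # 12:15 A.M. and 12:15 P.M. GMT


def next_status_broadcast(last_change):
    """Return the first status thread run at or after the folder's expiry time."""
    expiry = last_change.astimezone(timezone.utc) + STATUS_EXPIRY
    candidates = []
    for day_offset in (0, 1):
        day = (expiry + timedelta(days=day_offset)).date()
        for hour, minute in THREAD_RUN_TIMES:
            candidates.append(
                datetime(day.year, day.month, day.day, hour, minute,
                         tzinfo=timezone.utc)
            )
    return min(run for run in candidates if run >= expiry)


# A change made one minute after the 12:15 A.M. run is close to the worst case.
last_change = datetime(2004, 3, 1, 0, 16, tzinfo=timezone.utc)
broadcast = next_status_broadcast(last_change)
wait = broadcast - last_change
```

With this sample input, the expiry falls one minute after the 12:15 A.M. run on the following day, so the status message waits for the 12:15 P.M. run, just under 36 hours after the change.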

The replication status thread timing can be altered with the following registry settings:

  • Replication Send Status Timeout

  • Replication Send Status Alignment

  • Replication Send Status Alignment Skew

With the reduced number of status messages sent by Exchange Server 2003, it should not be necessary to modify the default values. For more information on these settings, see Microsoft Knowledge Base article 203170, "XADM: Controlling Public Folder Hierarchy Status Messages."

Replication Status Requests

A status request occurs when a public folder store requires a remote server's status for a particular folder. Depending on the circumstances, a status request might trigger a return status message.

A status request is generated under the following circumstances:

  • When a new public store is mounted for the first time   When a public folder store is mounted for the first time, it generates a hierarchy status request for folder 1-1. (No content replicas can be assigned to this public folder store yet, so the only thing that it is missing is the hierarchy.) This triggers another public folder store to send a status message for the hierarchy, which results in the addition of several entries in the new server's backfill array. Shortly thereafter, backfill requests for the missing changes are sent, causing other servers to send replication messages containing the missing updates.

  • When the replica list of a folder is changed   When the replica list of a folder changes, a status request message is generated. Adding a new replica, deleting a replica, or a temporary replica re-home all generate status requests.

  • When a public store database is restored from backup   Because a restored database has been out of the replication loop for an indeterminate amount of time, a status request for the hierarchy and all content replicas in the hierarchy is broadcast. This request accomplishes two things: all other servers get a revised picture of this server's change number information, and a mass update of change number information occurs on the newly restored database. This leads to the filing of backfill entries, and ultimately to the sending of backfill requests.

Modifying the Replica List

Modifying the replica list is a hierarchy change. However, because the replica list is changing (folder replicas are either being created or removed from a server), status messages and status requests are also used.

Adding a Replica

When a new replica is added to a folder, the following steps occur:

  1. A hierarchy replication message is sent to replicate the change in the folder's replica list.

  2. The server that was newly added as a replica sends a status request message to all other content replica servers.

  3. Because the newly added server has an empty CNSet, it is a strict subset of all other content replicas' CNSets, so they all respond with a status message.

  4. Backfill entries are filed, backfill requests are sent to appropriate servers, and the servers respond with content.

  5. At any point after Step 1, other content replicas might send regular content replication broadcasts to the new replica server.

Steps 1 and 2 might not always occur in the same order, depending on which public folder store the original change was made on. If the administrator makes the change on a server that has a content replica, the steps occur in the order listed above. If the administrator makes the change on the server that hosts the new replica, Steps 1 and 2 might occur in the reverse order.
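The strict subset rule that decides whether a replica answers a status request (Step 3) maps directly onto Python's set comparison. This is a sketch with change numbers modeled as plain integers; the function name `should_respond` is hypothetical.

```python
def should_respond(requester_cnset, responder_cnset):
    """True if the requester's CNSet is a strict subset of the responder's."""
    # For Python sets, "<" tests for a proper (strict) subset.
    return set(requester_cnset) < set(responder_cnset)


# A newly added replica has an empty CNSet, so any replica with data responds.
new_replica_gets_answer = should_respond(set(), {1, 2, 3})
# Replicas that are already in sync do not respond to each other.
in_sync_gets_answer = should_respond({1, 2, 3}, {1, 2, 3})
```

Under this reading, a replica whose CNSet is identical to the requester's, or that is itself missing changes the requester has, stays silent.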

Deleting a Replica

When a replica is removed from a server, the folder is not deleted immediately. Instead, the folder is put in a delete pending state. When a folder is in a delete pending state, it cannot be viewed by a client or administered. (Exchange System Manager does not show the folder on the list of folders hosted on the public folder store.)

The delete pending state exists so that other replicas can replicate any missing data from it. After the delete pending folder receives status messages from all other replicas, indicating that the folders are synchronized, the deleted replica is removed. This process ensures that if you change the sole replica of a folder from one server to another server, no content is lost.

When deleting a replica, the following steps occur:

  1. The folder is removed from the replica list.

  2. A hierarchy message is replicated indicating the change in the folder's status (for example, Active -> Delete Pending).

  3. The server that hosts the Delete Pending folder sends a status request, which requires a response.

  4. A server with a replica responds to the status request with a status message. If the status message indicates that CNSets are at least as current as the replica being deleted, the public folder store proceeds to Step 5. Otherwise, it continues to send status requests until it receives the correct response.

  5. The folder being deleted has its state changed from delete pending to delete now, and the folder is deleted.
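The delete pending logic in the steps above can be sketched as a small state function in Python. The state names follow the text; the function name and its inputs are hypothetical. The sketch follows Step 4, where one status message showing a CNSet at least as current as the replica being deleted is enough to proceed. The earlier description suggests waiting for status messages from all other replicas, which would correspond to replacing `any` with `all`.

```python
from enum import Enum


class ReplicaState(Enum):
    ACTIVE = "active"
    DELETE_PENDING = "delete pending"
    DELETE_NOW = "delete now"


def next_state(state, local_cnset, status_cnsets):
    """Advance a delete pending replica based on the status messages received.

    status_cnsets -- CNSets reported by other replicas (hypothetical input)
    """
    if state is not ReplicaState.DELETE_PENDING:
        return state
    local = set(local_cnset)
    # Proceed only if some other replica holds every change this replica holds.
    if any(local <= set(remote) for remote in status_cnsets):
        return ReplicaState.DELETE_NOW
    # Otherwise stay in delete pending and keep sending status requests.
    return ReplicaState.DELETE_PENDING
```

For example, a delete pending replica holding changes 1 and 2 stays in delete pending while the only response reports change 1, and moves to delete now once a response reports changes 1, 2, and 3.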

Replication State Tables

Every replicated folder (including the hierarchy) has a set of rows in the replication state table, which holds replication state information about each folder. Each row in a folder's set of rows represents a replica of that folder. The row for the local server contains, among other things, the change number last broadcast, the locally owned CNSet, the backfill array, the time the next status broadcast should occur, and other data. The rows for other replicas contain the CNSet information that the local server has last received from each other server (one per row), the average transmission time for replication e-mail from each other server, the last time a backfill request was sent to each other server, and more.

Every time a replication message is sent, the CNSet from the replication state table for the local replica of the folder is included with the message.

The replication state tables themselves do not replicate. Replication is generated by the data from the CNSets. This is how public folder stores determine what data other replicas of a folder contain.
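The rows described above can be pictured as a data structure. The following Python sketch is only a mental model: all class and field names are hypothetical, the real table holds more data than is shown, and change numbers are simplified to integers.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Set


@dataclass
class LocalReplicaRow:
    """Row for the local server's replica of a folder."""
    last_broadcast_change_number: int = 0
    owned_cnset: Set[int] = field(default_factory=set)
    backfill_array: List[dict] = field(default_factory=list)
    next_status_broadcast: Optional[datetime] = None


@dataclass
class RemoteReplicaRow:
    """Row for one other server's replica of the same folder."""
    last_received_cnset: Set[int] = field(default_factory=set)
    average_transmission_time: Optional[timedelta] = None
    last_backfill_request_sent: Optional[datetime] = None


@dataclass
class FolderReplicationState:
    folder_id: str
    local: LocalReplicaRow = field(default_factory=LocalReplicaRow)
    remotes: Dict[str, RemoteReplicaRow] = field(default_factory=dict)

    def record_received_cnset(self, server, cnset):
        """Store the CNSet most recently received from another server."""
        row = self.remotes.setdefault(server, RemoteReplicaRow())
        row.last_received_cnset = set(cnset)

    def outgoing_cnset(self):
        """CNSet attached to every replication message sent for this folder."""
        return set(self.local.owned_cnset)


state = FolderReplicationState("1-2A45")
state.local.owned_cnset.update({1, 2})
state.record_received_cnset("SERVER-B", [1, 2, 3])
```

Keeping one row per remote replica lets the local store compare its own CNSet with the last CNSet seen from each server without replicating the table itself.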

Note

Each server tracks updates from other servers using a replication ID (ReplID). ReplIDs are calculated locally. Therefore, public folder stores do not have the same ReplIDs across multiple servers.

Default Replication Event Schedule

The following table illustrates some of the more common default timeouts associated with replication events. The main replication task thread generates additional worker threads to handle replication tasks when these default timeouts are reached. If there is nothing to replicate, the thread simply exits, and no replication message is generated.

Default replication event times

| Replication Event | Default Timeout | Comments |
|---|---|---|
| Replication Expiry | 24 hours | How often folders are checked for expiry. |
| Replication Send Always | 15 minutes | This is the default Replicate Always value. This is how often the store checks to see whether it needs to replicate content. This value can be adjusted using Exchange System Manager. |
| Replication Send Folder Tree | 5 minutes | This is how often the public folder store checks to see whether a hierarchy replication message needs to be sent. |
| Replication Send Status Timeout | 24 hours | This is how often the public folder store checks to see whether a status message for a folder should be sent. |
| Replication Timeout | 5 minutes | This is how often the public folder store checks to see if any backfill entries have timed out. |
| Replication New Replica Backfill Request Delay | 15 minutes | This is the time delay used before sending a backfill request for a new folder replica when the data is available on the same site. |
| Replication Short Backfill Request Delay | 6 hours | This is the time delay used before sending a backfill request when the data is available on the same site. |
| Replication Long Backfill Request Delay | 12 hours | This is the time delay used before sending a backfill request when the data is not available on the same site. |
| Replication Short Backfill Request Timeout | 12 hours | This is the timeout value used when retrying to send a backfill request when the data is available on the same site. |
| Replication Long Backfill Request Timeout | 24 hours | This is the timeout value used when retrying to send a backfill request when the data is not available on the same site. |
| Replication Short Backfill Request Timeout Retry | 24 hours | This is the timeout value used when sending a backfill request when the data is available on the same site and when this is a retry of a previous backfill request. |
| Replication Long Backfill Request Timeout Retry | 48 hours | This is the timeout value used when sending a backfill request when the data is not available on the same site and when this is a retry of a previous backfill request. |

Default Replication Values

The following table illustrates some of the other default values used in public folder replication.

Default replication values

| Description | Value | Comments |
|---|---|---|
| Replication Folder Count Limit | 20 | Maximum number of folders to pack in a hierarchy replication message. |
| Replication Deleted Folder Count Limit | 500 | Maximum number of folder deletes to pack in a hierarchy replication message. |
| Replication Message Count Limit | 100 | Maximum number of messages to pack in a content replication message. |
| Replication Message Size Limit | 300 KB | Maximum replication message size. This value can be adjusted using Exchange System Manager. If necessary, a single replication message might exceed the limit. This value represents the size at which the packing function should stop packing. If a single post in a folder exceeds this limit, it is sent alone, in its entirety. |
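The effect of the message count and message size limits on packing content updates can be sketched in Python. The two default values come from the table above. The function name and the exact stopping rule are assumptions: this sketch closes a message before an item would push it past the size limit, whereas the store might instead stop once the limit has been reached.

```python
MESSAGE_COUNT_LIMIT = 100        # default Replication Message Count Limit
MESSAGE_SIZE_LIMIT = 300 * 1024  # default Replication Message Size Limit (300 KB)


def pack_by_limits(item_sizes, count_limit=MESSAGE_COUNT_LIMIT,
                   size_limit=MESSAGE_SIZE_LIMIT):
    """Split item sizes (in bytes) for one folder into replication messages.

    An item larger than size_limit is still sent, alone, in its own message.
    """
    batches, current, current_size = [], [], 0
    for size in item_sizes:
        too_many = len(current) >= count_limit
        too_big = bool(current) and current_size + size > size_limit
        if too_many or too_big:
            # Close the current message and start a new one.
            batches.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        batches.append(current)
    return batches


KB = 1024
batches = pack_by_limits([100 * KB, 150 * KB, 100 * KB, 400 * KB, 10 * KB])
```

In this example the 400 KB post exceeds the limit on its own, so it travels alone in its entirety, and the remaining items are split across three other messages.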