Plan content deployment (SharePoint Server 2010)
Updated: August 12, 2010
Content deployment is a feature of Microsoft SharePoint Server 2010 that you can use to copy content from a source site collection to a destination site collection. This article contains general guidance about how to plan to use content deployment with your SharePoint Server 2010 sites. It does not describe the purpose and function of content deployment, how the content deployment process works, content deployment paths and jobs, or the security options that are available when you deploy content, and it does not explain how to set up and configure content deployment. For more information, see Content deployment overview (SharePoint Server 2010).
About planning content deployment
The planning process that is described in this article starts with helping you determine whether to use content deployment with your SharePoint Server 2010 solution. The remainder of the article describes the steps that are required to plan a content deployment solution: deciding how many server farms are necessary, planning the export and import servers, planning the content deployment paths and jobs, and special considerations for large jobs. You can record this information in the worksheet that is referenced in the Content deployment planning worksheet section.
Determine whether to use content deployment
Although content deployment can be useful for copying content from one site collection to another, it is not a requirement for every scenario. The following list contains reasons why you might want to use content deployment for your solution:
The farm topologies are completely different. A common scenario is one in which there are authors publishing content from an internal server farm to an external server farm. The topologies of the server farms can be completely different. However, the content of the sites to be published is the same.
The servers require specific performance tuning to optimize performance. If you have a server environment where both authors and readers are viewing content, you can separately configure the object and output caches on the different site collections based on the purpose of the site or user role.
There are security concerns about content that is deployed to the destination farm. If you do not want users to have separate accounts on the production server, and you do not want to publish by using only approval policies, content deployment lets you restrict access to the production server.
Before you implement a content deployment solution, you should carefully consider whether content deployment is really necessary. The following list contains alternatives to using content deployment:
Author on production using an extended Web application. If you have a single-farm environment, you can choose to allow users to author content directly on the production farm and use the publishing process to make content available to readers. By using an extended Web application, you have a separate IIS Web site that uses a shared content database to expose the same content to different sets of users. This is typically used for extranet deployments in which different users access content by using different domains. For more information, see Extend a Web application (SharePoint Server 2010).
Create a custom solution. You can use the SPExport and SPImport classes in the Microsoft.SharePoint.Deployment namespace of the SharePoint Server 2010 API to develop a custom solution to meet your needs. For more information, see How to: Customize Content Deployment for Disconnected Scenarios.
Use backup and restore. You can use backup and restore to back up a site collection from one location and restore it to another location. For more information, see Back up a site collection in SharePoint Server 2010 and Restore a site collection in SharePoint Server 2010.
If you decide that using content deployment in SharePoint Server 2010 is right for your solution, continue reading this article.
Determine how many server farms you need
A typical content deployment scenario includes two separate server farms: a source server farm that is used for authoring, and a destination server farm that is used for production. You can also use content deployment to copy content between two separate site collections within the same server farm, or you can use a three-stage farm topology that contains a farm for authoring, one for staging and quality assurance, and one for production. If you will be using content deployment, you should also decide how many server farms are necessary for your solution. For more information about topologies for content deployment, see Design content deployment topology (SharePoint Server 2010).
Plan the export and import servers
After you have decided on a topology for your server farm, you must decide which servers will be the export and import servers. These are the servers in the server farm that are used to run the content deployment jobs. They do not have to be the same as the source or destination servers. However, the servers that are designated as export and import servers must have the Central Administration Web site installed. Decide which servers will be configured to send or receive content deployment jobs, and record your decisions.
In the content deployment planning worksheet, record each server farm in your content deployment topology, and note its purpose. For each server farm, provide the URLs of the export server, the import server, or both. Also record the Active Directory domain that is used by the farm.
Plan content deployment paths
A content deployment path defines a source site collection from which content deployment can start and a destination site collection to which content is deployed. A path can only be associated with one site collection. To plan the content deployment paths that are needed for your solution, decide which site collections will be deployed and define the source and destination for each path. For more information about paths, see Content deployment overview (SharePoint Server 2010).
If you will be using a three-stage farm topology, you must also plan for how content will be deployed across the farms. In general, you should reduce the number of “hops” the content makes as it moves from authoring to staging and then to production. For example, if you want to test content on the staging farm before you push it to production, you can deploy content from the authoring farm to the staging farm first, and then deploy content from the authoring farm to the production farm after the content has been verified. This means that only the authoring farm is responsible for deploying content to all other farms in the environment. Although it is possible to deploy content from authoring to staging, and then from staging to production, it is not necessary to use this approach. When you design content deployment paths for a three-stage farm topology, you must also carefully plan the scheduling of the jobs that will deploy the content to the other farms in the environment. For more information about content deployment topologies, see Design content deployment topology (SharePoint Server 2010).
Record each path in the content deployment planning worksheet. For each path, enter the source and destination Web applications and site collections. Also record how much security information to deploy along the path: All, Roles only, or None.
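As an illustration only, the path rows of the worksheet can be sketched as a small data structure with a basic sanity check. The class names, field names, and URLs below are hypothetical and are not part of SharePoint Server 2010; only the three security options (All, Roles only, None) come from this article.

```python
from dataclasses import dataclass
from enum import Enum


class SecurityInfo(Enum):
    """How much security information is deployed along a path."""
    ALL = "All"
    ROLES_ONLY = "Roles only"
    NONE = "None"


@dataclass(frozen=True)
class PathRecord:
    """One worksheet row for a content deployment path (hypothetical layout)."""
    name: str
    source_web_app: str
    source_site_collection: str
    dest_web_app: str
    dest_site_collection: str
    security: SecurityInfo

    def source(self) -> str:
        # Join the Web application URL and the site collection path.
        return self.source_web_app.rstrip("/") + self.source_site_collection

    def destination(self) -> str:
        return self.dest_web_app.rstrip("/") + self.dest_site_collection


def validate_paths(paths):
    """Return a list of problems found in the planned paths."""
    problems = []
    seen_names = set()
    for p in paths:
        if p.name in seen_names:
            problems.append(f"duplicate path name: {p.name}")
        seen_names.add(p.name)
        if p.source().lower() == p.destination().lower():
            problems.append(
                f"{p.name}: source and destination are the same site collection"
            )
    return problems


# Hypothetical three-stage plan: the authoring farm deploys to both other farms.
PLANNED = [
    PathRecord("Authoring to staging", "http://authoring", "/",
               "http://staging", "/", SecurityInfo.ROLES_ONLY),
    PathRecord("Authoring to production", "http://authoring", "/",
               "http://www.example.com", "/", SecurityInfo.NONE),
]

if __name__ == "__main__":
    for problem in validate_paths(PLANNED):
        print(problem)
```

Keeping the plan in a structured form such as this makes it easier to review the paths before you create them in Central Administration.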
Plan job scheduling
After you have defined the paths along which site content will be deployed, you must plan the specific jobs to deploy the content. A content deployment job lets you specify that a whole site collection or only specific sites in a site collection will be deployed for a specific path. Jobs also define the frequency with which they are run and whether to include all content, or only new, changed, or deleted content. You can associate multiple jobs with each path. For each path that you have defined, you must decide whether a job will deploy the whole site collection or will deploy specific sites.
As you plan the scope of your content deployment jobs, be sure to think about the order in which the jobs will run. You must deploy a parent site collection or site before you can deploy a site below it in the hierarchy. For example, if you have a site collection with two sites below it, Site A and Site B, and Site A also has two sites below it, Site C and Site D, you must create and run a job that will deploy the top-level site collection, before you can deploy Site A and Site B. You must also deploy Site A before you can deploy Site C and Site D. If you plan to use content deployment jobs that are scoped to specific sites, be sure to schedule the jobs appropriately so that sites higher in the hierarchy are deployed before sites lower in the hierarchy.
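The ordering rule described above can be checked with a short planning script. The following is a minimal sketch that assumes the site hierarchy is written down by hand as a dictionary; the function names are hypothetical, and the script does not call any SharePoint API.

```python
from collections import deque

# The example hierarchy from the paragraph above: parent -> sites directly below it.
HIERARCHY = {
    "Top-level site": ["Site A", "Site B"],
    "Site A": ["Site C", "Site D"],
    "Site B": [],
    "Site C": [],
    "Site D": [],
}


def deployment_order(hierarchy, root):
    """Breadth-first walk, so that every site is listed after its parent."""
    order = []
    queue = deque([root])
    while queue:
        site = queue.popleft()
        order.append(site)
        queue.extend(hierarchy.get(site, []))
    return order


def order_violations(hierarchy, planned):
    """Return (parent, child) pairs where the child is scheduled before its parent."""
    position = {site: index for index, site in enumerate(planned)}
    return [
        (parent, child)
        for parent, children in hierarchy.items()
        for child in children
        if parent in position
        and child in position
        and position[child] < position[parent]
    ]


if __name__ == "__main__":
    print(deployment_order(HIERARCHY, "Top-level site"))
    print(order_violations(HIERARCHY, ["Site C", "Top-level site", "Site A"]))
```

The first function suggests one valid order for site-scoped jobs. The second reports any planned order in which a site would be deployed before the site above it.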
You must also decide when and how often to run each job. In general, you should schedule jobs to run during times when the source server has the least amount of activity. Content that is checked out for editing by a user when a content deployment job starts will be ignored by the content deployment job, and it will be copied with the next deployment job after it is checked in. You can configure a job to use a database snapshot of the content database in Microsoft SQL Server 2008 Enterprise Edition to minimize risk to the content deployment job.
If you are using Remote BLOB Storage (RBS), and the RBS provider that you are using does not support snapshots, you cannot use snapshots for content deployment or backup. For example, the SQL FILESTREAM provider does not support snapshots. For more information about RBS, see Overview of RBS (SharePoint Server 2010).
If you will be using a three-stage farm topology, you must also plan for when content is deployed across the farms. For example, if you deploy content from the authoring farm to the staging farm to test and verify content, you should plan to schedule the job that deploys content to the production farm so that there is enough time to resolve any issues that are found on the staging farm.
Do not run content deployment jobs in parallel if the same path is used by both jobs.
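When you draft a schedule, you can check it for jobs that would run in parallel on the same path. The following sketch assumes that you estimate a duration for each job yourself; the record layout, job names, and path names are hypothetical, and actual run times depend on the amount of content that is deployed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from itertools import combinations


@dataclass(frozen=True)
class PlannedRun:
    """One scheduled run of a job (hypothetical planning record)."""
    job: str
    path: str
    start: datetime
    expected_duration: timedelta  # an estimate supplied by the planner

    @property
    def end(self) -> datetime:
        return self.start + self.expected_duration


def parallel_runs_on_same_path(runs):
    """Return pairs of job names whose time windows overlap on the same path."""
    conflicts = []
    for a, b in combinations(runs, 2):
        # Two windows overlap when each one starts before the other ends.
        if a.path == b.path and a.start < b.end and b.start < a.end:
            conflicts.append((a.job, b.job))
    return conflicts


if __name__ == "__main__":
    night = datetime(2010, 8, 12, 1, 0)
    schedule = [
        PlannedRun("Full site collection", "Authoring to production",
                   night, timedelta(hours=2)),
        PlannedRun("News site only", "Authoring to production",
                   night + timedelta(hours=1), timedelta(minutes=30)),
        PlannedRun("Full site collection", "Authoring to staging",
                   night, timedelta(hours=2)),
    ]
    print(parallel_runs_on_same_path(schedule))
```

In this sample schedule, the two jobs on the "Authoring to production" path overlap and are reported. The job on the other path is not, because it uses a different path.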
For each path, record each associated job in the content deployment planning worksheet. If there is more than one job for a path, insert a row underneath the path for each job to be added. For each job, enter the scope and frequency with which the job will run.
Plan for large jobs
A content deployment job exports all content, as XML and binary files, to the file system on the source server and then packages these files into .cab files that have a default size of 10 MB. If a single file is larger than 10 MB, such as a 500 MB video file, it will be packaged into its own .cab file, which can be larger than 10 MB. The .cab files are then uploaded by HTTP POST to the destination server, where they are extracted and imported. If the site collection that will be deployed has a large amount of content, you must make sure that the temporary storage locations for these files on both the source server farms and the destination server farms have sufficient space to store the files. In many cases, you might not know the size or number of .cab files that will be included in the job until you start using content deployment. But if you know that your site is large and will contain lots of content, make sure that you plan for sufficient storage capacity as part of your content deployment topology.
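For rough capacity planning, you can estimate how many .cab files a job might produce from a list of file sizes. The following sketch assumes that files are packaged in order and that a new .cab file is started whenever the next file would exceed the limit. That packing rule is an assumption made for illustration and is not a description of how SharePoint Server 2010 actually groups files. The estimate also ignores compression and the exported XML.

```python
CAB_LIMIT_MB = 10  # default .cab file size stated in this article


def plan_cabs(file_sizes_mb, limit_mb=CAB_LIMIT_MB):
    """Group file sizes into .cab packages, filling each one in order.

    A file that is larger than the limit is placed in a .cab file of its own.
    """
    cabs = []
    current = []
    current_size = 0.0
    for size in file_sizes_mb:
        if size > limit_mb:
            cabs.append([size])
            continue
        if current and current_size + size > limit_mb:
            cabs.append(current)
            current, current_size = [], 0.0
        current.append(size)
        current_size += size
    if current:
        cabs.append(current)
    return cabs


def summarize(cabs):
    """Return (number of .cab files, total size in MB, largest .cab size in MB)."""
    sizes = [sum(cab) for cab in cabs]
    return len(sizes), sum(sizes), max(sizes, default=0)


if __name__ == "__main__":
    # Hypothetical content: four small files and one 500 MB video file.
    count, total_mb, largest_mb = summarize(plan_cabs([4, 5, 3, 500, 2]))
    print(count, total_mb, largest_mb)
```

The total gives a starting figure for the temporary storage that is needed on each side, and the largest value shows how big a single upload to the destination server could be.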
If your site will contain large files, such as video files, you might have to adjust the maximum file upload size for the Web application to accommodate the larger .cab file size. For more information, see Plan for caching and performance (SharePoint Server 2010).
Content deployment planning worksheet
Download an Excel version of the Content deployment planning worksheet (http://go.microsoft.com/fwlink/p/?LinkID=167835&clcid=0x409).
August 12, 2010
Added a note stating that if Remote BLOB Storage (RBS) is used, and the RBS provider does not support snapshots, you cannot use snapshots for content deployment.
May 12, 2010