Design global information architecture and governance
Updated: April 23, 2009
Applies To: Office SharePoint Server 2007
Many organizations use Microsoft Office SharePoint Server 2007 to support employees and other contributors working around the world. This article discusses the types of sites and services that are provided by Office SharePoint Server 2007 and how a global workforce affects the deployment of these sites and services. This article will help you determine which of the global solutions is appropriate for your organization. For more information about these solutions, see Supported global solutions for Office SharePoint Server.
This article also includes a design sample for managing information and documents in environments where multiple server farms are deployed. This design sample provides ideas on how an organization can govern content in environments in which information is created across multiple farms. The following poster-size model provides an overview of the design sample: Global Knowledge Management with Microsoft Office SharePoint Server (http://go.microsoft.com/fwlink/?LinkId=110983&clcid=0x409). This model was created in Microsoft Office Visio. If you do not have Visio installed, you can download a free viewer (http://go.microsoft.com/fwlink/?LinkId=73526&clcid=0x409). A plotter works best for printing this file.
Identify global workforce requirements
There are three large categories of sites and services that are provided by Office SharePoint Server 2007:
Collaboration sites: Used by employees and teams to collaborate and store data. Collaboration sites include team sites and My Sites. Collaboration includes creating and maintaining lists and libraries, creating and editing documents, and managing sites and libraries. Collaboration sites are also used to store personal and team documents instead of placing these on a file share or storing them on local computers. Using team sites and My Sites for storage ensures that information is uniformly backed up and managed in other ways.
Published content sites: Used to host content that has been published and is primarily read-only, such as content about company policies. Published content is typically authored by a small number of people and then published to many people.
Business intelligence sites and other application sites: Used to host applications, such as business intelligence applications. These sites rely on services that run on the server farm and often on content that is stored in back-end data systems. These types of sites can be difficult to duplicate across an organization.
Identifying which of these sites and services are important for your organization can help you determine which of the supported solutions will work best for your organization. In addition, the types of sites and services that you identify influence the overall information architecture and the plan for governing content across your organization.
Plan for collaboration across geographic locations
The biggest consideration in choosing which of the supported solutions will work best for your organization is how important collaboration sites are for your global workers and contributors. Collaborating in Office SharePoint Server 2007 involves the following operations that are sensitive to wide area network (WAN) links:
Opening, uploading, and downloading documents.
Browsing, adding, and modifying data within lists, libraries, and sites.
If collaboration sites are important for your global workers, the following planning considerations and activities can help you decide whether you can host these sites centrally or if you should plan to deploy additional server farms to regions.
Evaluate the WAN links between your global workers and your central site. Determine the average available bandwidth during business hours and the average latency over the WAN links. Latency is the length of time — typically measured in milliseconds — that it takes for data to travel from one end of the WAN link to the other. If necessary, consult with your network provider.
Use the bandwidth data provided in Plan for bandwidth requirements to estimate how long it will take for global workers to complete the types of collaboration tasks that are most common in your organization. Be sure to consider the range of file sizes that are handled by your organization.
If you have already deployed a central site, use existing WAN links to test the types of operations your workers will perform.
Based on your calculations or actual test results, decide if a central solution adequately supports collaboration by your global workers. Factors to consider include:
How often global workers participate in collaboration. If workers are using collaboration sites only occasionally, they might tolerate slower performance over the WAN. However, if workers must frequently wait for operations to complete, and the wait affects how quickly they can perform their job responsibilities, consider optimizing the WAN link or deploying a server farm closer to these users.
How the performance of collaboration sites over the WAN compares to other available technology options. An effective governance plan provides incentives for workers to use the technologies as they are intended to be used. If global workers choose to use e-mail to share documents or to store documents on local computers because these are vastly quicker solutions, they might not adopt the solution and governance plan to the extent that you want.
What the business cost is. If global workers consistently wait for operations to complete, the business is less efficient. Before you decide to deploy an additional server farm to a regional location, compare the cost of this inefficiency to the cost of deploying an additional server farm. Factors to consider include the number of workers that are affected and the cost of the inefficiency to the organization. In some cases, your organization can tolerate slower WAN performance because only a few workers are affected or because the type of work that is affected is not critical to the business.
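The estimate described in the steps above can be sketched in a few lines. This is a minimal sketch rather than a sizing tool: the function name, the default of 10 round trips per operation, and the sample link figures are assumptions chosen for illustration and are not taken from Plan for bandwidth requirements. It treats latency as the one-way delay defined earlier, so each round trip costs twice the latency.

```python
def estimate_transfer_seconds(file_size_mb, bandwidth_mbps, latency_ms, round_trips=10):
    """Rough time, in seconds, to open or upload one document over a WAN link.

    file_size_mb   -- document size in megabytes (1 MB taken as 8 megabits)
    bandwidth_mbps -- average available bandwidth during business hours
    latency_ms     -- one-way latency of the link in milliseconds
    round_trips    -- assumed number of request/response exchanges (hypothetical)
    """
    if bandwidth_mbps <= 0:
        raise ValueError("bandwidth_mbps must be positive")
    payload_seconds = (file_size_mb * 8) / bandwidth_mbps
    # Each round trip crosses the link twice, once in each direction.
    round_trip_seconds = round_trips * (2 * latency_ms) / 1000
    return payload_seconds + round_trip_seconds


# Hypothetical link: 1.5 Mbps available, 150 ms one-way latency.
for size_mb in (0.5, 5, 50):
    seconds = estimate_transfer_seconds(size_mb, bandwidth_mbps=1.5, latency_ms=150)
    print(f"{size_mb:>5} MB document: about {seconds:.1f} s")
```

Running the loop over the range of file sizes that your organization handles gives a quick first comparison to set against measured results from an existing central site.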
If you decide that the WAN links will not support a central solution, consider your options for optimizing the overall solution for WAN environments before you decide to deploy regional farms. For more information, see Optimizing Office SharePoint Server for WAN environments.
If you decide to deploy an additional server farm, it is important to coordinate where to deploy and how to assign team sites and My Sites. First, ensure that you offer My Sites on each of your server farms. Assign global workers to create a My Site on the server farm nearest their work location. This provides the best performance experience for your workers and encourages global workers to create and store their projects in Office SharePoint Server 2007. For information about managing multiple My Site applications across an organization and ensuring that workers create a My Site in the correct location, see "Coordinate My Sites across your organization" in the following article: Design My Sites architecture.
Determining the closest server farm to host My Sites for global workers is straightforward. However, determining where to host team sites for specific projects might not be as clear. Global workers can work on teams that are in close proximity or they can work on teams that span geographies. If teams work in close proximity, use the team sites on the server farm nearest the team. If team members are spread across the globe, use the team sites on the server farm that is nearest to where the heaviest collaboration for the project will take place.
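One way to make "where the heaviest collaboration will take place" concrete is to weight each location's expected activity by its latency to each candidate farm and choose the farm with the lowest total. The following is a sketch under that assumption only. The function name, farm names, locations, and figures are all hypothetical, and latency is just one of the factors you might weigh.

```python
def choose_team_site_farm(activity_by_location, latency_ms):
    """Return the farm with the lowest activity-weighted latency.

    activity_by_location -- {location: expected collaboration operations per week}
    latency_ms           -- {farm: {location: latency from that location to the farm}}
    """
    def weighted_latency(farm):
        return sum(
            operations * latency_ms[farm][location]
            for location, operations in activity_by_location.items()
        )

    return min(latency_ms, key=weighted_latency)


# Hypothetical project team: most of the editing happens in London.
activity = {"Seattle": 50, "London": 400, "Tokyo": 30}
latency = {
    "central": {"Seattle": 5, "London": 140, "Tokyo": 120},
    "europe": {"Seattle": 140, "London": 5, "Tokyo": 250},
    "asia": {"Seattle": 120, "London": 250, "Tokyo": 5},
}
print(choose_team_site_farm(activity, latency))
```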
In most cases, global workers contribute to several different projects and use content from many different team sites. If you host My Sites on the farm nearest your global workers and offer team sites in several different locations, slow links affect only some of the tasks that global workers perform rather than all of them. To support global teams that are not adequately supported by WAN links, consider using Microsoft Office Groove 2007. For more information, see Extending Office SharePoint Server global solutions with Office Outlook 2007 and Office Groove software.
If you deploy multiple server farms with collaboration sites, it is important to design an information architecture and knowledge management plan that provides good governance of the content that is created on these sites. For more information, see Design sample: Managing knowledge across geographic locations later in this article.
Plan for published content
Published content is typically created by a few people within an organization and then published for many to access. This content is typically published as read-only, although publishing content by using blogs and wikis provides users the ability to add and modify content.
Determine the available bandwidth and the average latency of your WAN links, and then use the bandwidth data provided in Plan for bandwidth requirements to estimate how long it will take global users to view published content. If you have already deployed a central site, evaluate how long it takes to view published content using your existing WAN links.
If your global workers primarily require access to published content and not necessarily to collaborative content, there are several ways to optimize a central solution to work across WAN links. You can combine any of these methods. The following list describes some examples:
Implement a WAN accelerator at one or both ends of a WAN connection.
Use caching on the proxy server or firewall product at regional sites.
Optimize binary large object (BLOB) caching and Internet Information Services (IIS) compression on the server farm.
Optimize pages for quicker page views.
Optimize the browser cache on client computers.
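To see why these methods can be combined, it can help to model a page view before and after optimization. The sketch below is a deliberate simplification: it assumes that requests are issued one after another, that caching removes a fixed fraction of both bytes and requests from the WAN, and that compression shrinks the remaining bytes by a fixed ratio. The function name, parameters, and figures are hypothetical.

```python
import math


def page_view_seconds(page_kb, bandwidth_kbps, latency_ms, requests,
                      cached_fraction=0.0, compression_ratio=1.0):
    """Rough time to render one published page over a WAN link.

    cached_fraction   -- share of the page served from a local cache (0.0 to 1.0)
    compression_ratio -- compressed size divided by original size (1.0 = none)
    """
    wan_kb = page_kb * (1 - cached_fraction) * compression_ratio
    wan_requests = math.ceil(requests * (1 - cached_fraction))
    payload_seconds = (wan_kb * 8) / bandwidth_kbps
    round_trip_seconds = wan_requests * (2 * latency_ms) / 1000
    return payload_seconds + round_trip_seconds


# Hypothetical page: 800 KB across 40 requests, 1024 Kbps link, 150 ms latency.
baseline = page_view_seconds(800, 1024, 150, 40)
optimized = page_view_seconds(800, 1024, 150, 40,
                              cached_fraction=0.75, compression_ratio=0.5)
print(f"baseline: {baseline:.2f} s, optimized: {optimized:.2f} s")
```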
If you haven't already decided to deploy one or more regional farms to support collaboration sites, the following factors might cause you to deploy additional server farms to support published content:
Global workers require efficient access to published content to perform their job responsibilities. For example, technical support engineers might require quick access to published content to troubleshoot and resolve issues for customers.
Performance over WAN links is unreliable and requests for page views time out.
If you decide to deploy multiple server farms to host published content, there are several options for duplicating this content. For more information, see "Synchronizing" later in this article.
Plan for business intelligence sites and other application sites
Business intelligence applications rely on access to line-of-business data. In some cases, a business intelligence application represents the aggregation point for access to several different back-end data systems, such as a human resources database, a customer relationship management database, or Siebel.
Typically, line-of-business data is hosted at the central location of a corporation. However, regional sites can also host business intelligence applications.
In an environment with multiple server farms, host business intelligence applications on the farm nearest the data sources.
Design sample: Managing knowledge across geographic locations
If you deploy multiple server farms with collaboration sites, it is important to design an information architecture and content management plan that provides good governance of the content that is created on these sites. The next three sections of this article describe an architecture and governance plan that correlates with the following way in which many organizations manage content:
Authoring: Content is authored by individuals or teams where they are located.
Publishing: When content is ready to share across the organization, content is published to a central location.
Synchronizing: If needed, published content from the central location is synchronized across the organization for more efficient access.
This design sample incorporates two key features:
Parallel content repositories are created across farms.
A governance plan is implemented to coordinate content across farms.
The governance plan relies on users to decide when content is ready to share. At this time, the recommendations in this design sample apply to authoring and sharing documents across an organization. These recommendations do not apply to sharing other types of content such as lists or sites.
Designing the content architecture
Before content authoring begins, optimize information architecture and governance across your environment in the following ways:
On each server farm, create parallel sites for collaboration, published projects, and company information.
Host team sites at each of the server farm locations.
Actively manage collaboration sites on all farms with an appropriate service level agreement (SLA) to support content creation at those locations. For example, back up collaborative content more often than published content. On regional farms, content published from the central site does not require the same level of management as the collaborative content.
The following illustration shows a central server farm and two regional server farms.
In the illustration:
The site architecture is mirrored across the environment. Each server farm hosts team sites, sites for published projects, and sites for company information.
Content authoring takes place on the server farm located nearest the team.
When designing the content architecture, consider that there are a variety of ways that content can be duplicated across farms. To maximize the options and to simplify management of duplicated content, use the following recommendations:
Partition content into separate site collections. For example, in the previous diagram we recommend that you host published projects in a single site collection.
Store content in dedicated databases based on the unit of content that you want to duplicate. For example, in the previous diagram we recommend that you store all content in the published projects site collection in a dedicated database.
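A planning worksheet for these recommendations can be kept as simple data and checked automatically. The following sketch assumes a simplified layout in which each category of site maps to one content database per farm. The farm names, database names, and helper functions are hypothetical and do not correspond to any Office SharePoint Server API.

```python
from collections import defaultdict

# The three parallel sites that every farm in the design sample hosts.
REQUIRED_SITES = {"team sites", "published projects", "company information"}

# Hypothetical plan: {farm: {site: content database}}.
FARMS = {
    "central": {
        "team sites": "Central_Teams_DB",
        "published projects": "Central_Projects_DB",
        "company information": "Central_Company_DB",
    },
    "europe": {
        "team sites": "Europe_Teams_DB",
        "published projects": "Europe_Projects_DB",
        "company information": "Europe_Company_DB",
    },
}


def missing_sites(farms):
    """Return {farm: sites it lacks} for farms that do not mirror the architecture."""
    gaps = {}
    for farm, layout in farms.items():
        missing = REQUIRED_SITES - set(layout)
        if missing:
            gaps[farm] = missing
    return gaps


def shared_databases(layout):
    """Return {database: sites} for databases that hold more than one site."""
    sites_by_database = defaultdict(list)
    for site, database in layout.items():
        sites_by_database[database].append(site)
    return {db: sites for db, sites in sites_by_database.items() if len(sites) > 1}


print(missing_sites(FARMS))
print({farm: shared_databases(layout) for farm, layout in FARMS.items()})
```

A database that appears in the second result holds more than one unit of content, which means that it cannot be restored or attached on another farm without bringing the other content along.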
Authoring
When content authoring begins, teams create documents within team sites. Bring collaboration closer to your teams by:
Creating project sites on the farm nearest the workgroup.
Using Office Groove 2007 to support team members who are separated by WAN links with high latencies.
The information architecture within the team sites can be highly structured or loosely structured, depending on the needs of the organization. Typically only a small percentage of the content created on the team sites will be shared across the organization. The information architecture within the team sites should be open enough to foster the types of collaboration that are important to your organization but structured enough to maintain governance of the content.
On the regional farms, management of content within the team sites is critical because this content represents the research and development of new intellectual property. Consequently, backup and recovery plans for team sites should be the highest priority.
Publishing
When content is ready to share across the organization, encourage team members to publish the content to the central farm. In this scenario, publishing is defined as copying documents from the team sites on any of the farms to the published projects site collection, or other designated site collection, on the central farm.
In the following illustration, projects that are created in team sites are published to the published projects site collection or the company information site collection on the central site.
There are several options for publishing documents:
Users can use the Send To feature: click the down arrow next to a document in a library, select Send To, and then click Other Location. If you choose this option, ensure that users know the URLs to publish content to.
Use the publishing application programming interfaces (APIs) to create a workflow that copies content from the authoring site collections on regional farms to a central location on the central farm. You can add a "Publish to central farm" command, or another unique command, to the menu of options for documents (similar to the Send To feature). This option requires adding custom code to the sites. However, you can add customizations that greatly increase efficiency and accuracy for users. For example, you can provide a menu of publishing locations for users to select. You can also alert users if a document of the same name already exists in the selected location and give them the option to overwrite it.
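The behavior described for a custom publish command (a fixed menu of locations, an alert when a document of the same name exists, and an explicit overwrite option) can be illustrated independently of any SharePoint API. The sketch below uses local folders as a stand-in for site collections. The publish function and the locations mapping are hypothetical, and a real implementation would call the product's publishing APIs instead of copying files.

```python
import shutil
from pathlib import Path


def publish(document, location, locations, overwrite=False):
    """Copy a document to a named publishing location.

    locations -- {menu name: destination folder}, the menu offered to users
    overwrite -- must be True to replace a document of the same name
    """
    if location not in locations:
        raise KeyError(f"Unknown publishing location: {location!r}")
    source = Path(document)
    target = Path(locations[location]) / source.name
    if target.exists() and not overwrite:
        # Alert the caller rather than silently replacing published content.
        raise FileExistsError(f"{target} already exists; set overwrite=True to replace it")
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, target)
    return target
```

A caller would catch FileExistsError, ask the user to confirm, and then call publish again with overwrite=True. Restricting location to the keys of the menu means that users do not need to remember destination URLs.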
In addition to creating site collections for published projects and for company information, you can create document libraries within these site collections to provide more detailed categories of content.
In the following illustration, the published projects site collection on the central farm includes a document library for each category of content that is created.
These document libraries can be assigned permissions for target audiences. For example, the Sales Tools library can be configured with read permissions for all members of the sales team. When a document is published to a specific library, such as Sales Tools, the document inherits the permissions of that library. This provides a way to manage permissions across the organization based on categories of content and target audiences.
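The inheritance model described above can be pictured as a lookup in which a document carries no permissions of its own and takes whatever is assigned to its category. This is a conceptual sketch only. The category names other than Sales Tools, the group names, and the permission levels are hypothetical, and the code does not represent how Office SharePoint Server stores or evaluates permissions.

```python
# Hypothetical assignments: {category: {group: permission level}}.
CATEGORY_PERMISSIONS = {
    "Sales Tools": {"Sales Team": "read", "Sales Managers": "contribute"},
    "Engineering Specs": {"Engineering": "read"},
}

# Levels that allow a user to open a document.
READ_LEVELS = {"read", "contribute"}


def can_read(user_groups, category, permissions=CATEGORY_PERMISSIONS):
    """Return True if any of the user's groups can read documents in the category."""
    assigned = permissions.get(category, {})
    return any(assigned.get(group) in READ_LEVELS for group in user_groups)


# A document published to Sales Tools is readable by the sales team
# without any permissions being set on the document itself.
print(can_read({"Sales Team"}, "Sales Tools"))
print(can_read({"Engineering"}, "Sales Tools"))
```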
Finally, because the published projects site collection contains the aggregated intellectual property that is developed in your organization, actively manage the published projects site collection at the central site with appropriate backup and recovery procedures.
Synchronizing
Theoretically, content that is published to the central repository is available to the entire organization, based on permissions. However, analysis of your WAN links might indicate that accessing published content at the central repository is inefficient or not practical. (See Plan for published content earlier in this article.)
Based on your findings, you might choose to copy published content to the regional farms to make it more accessible to your global workers. In this scenario, synchronizing is defined as copying read-only versions of documents from the central repository to regional farms.
In the following illustration, content that is published to the central site is duplicated at each regional farm. Access to the content is governed by the permissions that are assigned, regardless of where it is duplicated.
There are several options for synchronizing content to regional farms:
Restore or attach a database copy.
Use a partner solution to copy or replicate content over the WAN. For more information, see "Data Replication, Multi Master Synchronization, and Configuration Management" in Optimizing Office SharePoint Server for WAN environments.
The publishing features of Office SharePoint Server 2007 have not yet been tested by the product team in WAN environments.
Because synchronized content is a replica of content that is stored in the central site, you can rely on active management of published projects at the central site, rather than actively managing this content at each of the regional sites. At the regional sites, prioritize active management of the collaboration sites instead.
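The definition of synchronizing used in this design sample, a one-way copy of read-only versions from the central repository to a regional farm, can be sketched with ordinary folders. This is an illustration of the pattern, not of the database or partner options listed above. The synchronize function is hypothetical, and it only copies new or newer files, so it does not remove regional copies of documents that were deleted centrally.

```python
import shutil
import stat
from pathlib import Path

READ_ONLY = stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH
WRITABLE = READ_ONLY | stat.S_IWUSR


def synchronize(central_dir, regional_dir):
    """Copy new or updated files one way, leaving read-only regional copies."""
    central_dir, regional_dir = Path(central_dir), Path(regional_dir)
    copied = []
    for source in sorted(central_dir.rglob("*")):
        if not source.is_file():
            continue
        target = regional_dir / source.relative_to(central_dir)
        if target.exists():
            if target.stat().st_mtime >= source.stat().st_mtime:
                continue  # The regional copy is already current.
            target.chmod(WRITABLE)  # Unlock the stale copy so it can be replaced.
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copy2 preserves the modification time.
        target.chmod(READ_ONLY)
        copied.append(target)
    return copied
```

Because the central copy is always the source, changes are never pulled back from a regional farm, which matches the recommendation to manage published projects actively only at the central site.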