
Designing Knowledge Management Solutions with a Web Storage System

Archived content. No warranty is made as to technical accuracy. Content may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist.
Updated: June 14, 2001

Walson Lee

Microsoft Corporation

Summary: This article outlines a design process for developing effective knowledge management solutions using the Web Storage System.

On This Page

Introduction
Web Storage System as a Development Platform
Building KM Solutions
Microsoft Solutions Framework: Services-Based Application Model
MSF Design Process
KM Solutions Design Model
Best Practices for User Services Design
Best Practices for Business Services Design
Best Practices for Data Services Schema Design
Best Practices for Web Storage System Folder Structure
SQL and the Web Storage System
Physical Design Considerations
Security Model
Performance
Scalability and Availability
Guideline Review
Implementation of Taxonomy
Integration with Line-of-Business Applications
Conclusion

Introduction

Microsoft® Exchange 2000 Server is the first Microsoft product to introduce a new storage technology called the Web Storage System. The Microsoft Web Storage System offers many new development features, such as Web Storage System events and forms, workflow engine, content indexing, and search folder. These features are particularly suitable for knowledge management (KM) solutions. However, there will be an initial learning curve for KM solution developers to understand these features and sort through many design options that are available with the Web Storage System. This article specifically addresses the design aspect of developing KM solutions and discusses best practices, design patterns, and design considerations. It presents a service-based application model and a design process, based on Microsoft Solutions Framework (MSF), tailored for building KM solutions with the Web Storage System. The design process covers a conceptual design model, a logical design model, and a physical design model. This article focuses on the physical design model for the following design considerations specific to the Web Storage System:

  • User services—digital dashboard and Web Storage System forms

  • Business services—workflow and events design

  • Data services—store schema design

  • Security model

  • Performance

  • Scalability and availability

  • Implementation of taxonomy

  • Integration with Line-of-Business (LOB) Applications

The purpose of this paper is to present sound approaches to designing a KM solution based on Web Storage System technology. The target audience is KM solution architects and designers, although other developers can also benefit from the basic design concepts presented here.

Web Storage System as a Development Platform

The Web Storage System is one of four initiatives that Microsoft has announced to deliver its vision of "knowledge workers without limits." A primary goal of these initiatives is to remove barriers to collaboration that today's knowledge workers face. The Web Storage System combines the features and functionality of the file system, the Web, and a collaboration server in a single location for storing, accessing, and managing information, as well as for building and running applications. Every item in the Web Storage System is URL-addressable and fully supports semi-structured data, such as documents, contacts, messages, reports, HTML files, and Active Server Pages (ASP). The Web Storage System provides strong integration with Microsoft Office 2000. It establishes a platform for information management that includes consistent search and data categorization.

Figure 1 illustrates the Web Storage System's programming model. It shows the support for various protocols, data access methods, and event models. Data access to the Web Storage System includes support for OLE DB and ActiveX® Data Objects (ADO). The Web Storage System also provides access using the HTTP protocol, which has been enhanced through the WebDAV specification to support an additional set of protocol commands. In addition, the store provides native Extensible Markup Language (XML) support.
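Because every item in the store is URL-addressable, an item's properties can be read with a plain WebDAV PROPFIND request over HTTP. The following sketch (Python, standard library only) builds such a request; the host name and item URL are hypothetical, and only well-known DAV: properties are requested.

```python
# Sketch: reading item properties from the Web Storage System over WebDAV.
# The server name and item URL below are hypothetical.
import http.client

ITEM_URL = "/public/projects/status-report.doc"  # hypothetical item

def build_propfind_body(properties):
    """Build a WebDAV PROPFIND body requesting specific DAV: properties."""
    props = "".join(f"<d:{name}/>" for name in properties)
    return (
        '<?xml version="1.0"?>'
        '<d:propfind xmlns:d="DAV:">'
        f"<d:prop>{props}</d:prop>"
        "</d:propfind>"
    )

def propfind(host, url, properties):
    """Issue PROPFIND against a store item (Depth 0 = this item only)."""
    conn = http.client.HTTPConnection(host)
    conn.request("PROPFIND", url, body=build_propfind_body(properties),
                 headers={"Content-Type": 'text/xml; charset="utf-8"',
                          "Depth": "0"})
    return conn.getresponse()

body = build_propfind_body(["displayname", "getcontenttype"])
print(body)
```

The same URL could equally be fetched with a plain HTTP GET, in which case the store returns a default rendering of the item rather than its raw properties.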

The Web Storage System also includes new features, such as Outlook® Web Access, Web Storage System forms, events, workflow, content indexing, search folder, and instant messaging. These features provide developers a great deal of flexibility and capability for building KM solutions. For further information on the Web Storage System, please refer to the Exchange 2000 SDK and the MSDN Exchange Server Developer Center.


Figure 1: Web Storage System programming model

Building KM Solutions

For each business problem in an enterprise, a knowledge management (KM) solution evolves by choosing the right modules for the problem to be solved. Each module has its own characteristics based on organizational processes and technologies. The following is a list of typical characteristics:

  • Increasing customer/partner/employee knowledge

  • Rapid learning and redeployment of knowledge

  • Increasing value of intellectual property

  • Adding unique values in products and services

  • Creating new knowledge

  • Sharing knowledge of work processes and quality innovations

Figure 2: KM enabling modules

Two prerequisite technologies, Complete Intranet and Messaging and Collaboration, are the foundation for all KM systems. These technologies build an infrastructure that supports the efficient transport, structure, access, and collaborative management of information.

The remaining KM-enabling modules extend that basic infrastructure to a sophisticated KM system that includes services like content management, variations of information delivery, and data analysis. Other services, such as data tracking and workflow processes, are also included as part of the community and team modules.

The implementation of the KM-enabling modules can be plug-and-play. Although some of the modules benefit from the implementation of a previous module, they can be chosen in any order related to the specific business case that needs to be developed. For example, real-time collaboration services, such as video conferencing, can be easily included on top of the prerequisite technologies, but are enhanced by the metadata services provided in the content management module.

Figure 3: Possible layers of a knowledge management platform

Microsoft's current KM platform is the Microsoft BackOffice® family. It provides the services to build the KM prerequisites (Messaging and Collaboration and Complete Intranet) and to extend them to KM solutions by implementing all KM-enabling modules (content management, communities and teams, portals and search, data analysis, and real-time collaboration). Besides these services, BackOffice provides interfaces for connecting and integrating with legacy information or knowledge sources.

In the coming months, Microsoft will release the .NET Enterprise Servers, which include SQL Server™ 2000, BizTalk™ Server, Commerce Server 2000, Host Integration Server 2000, Internet Security and Acceleration Server 2000, Exchange 2000 Server, and Application Center 2000. These components are designed to work closely together to build the next generation of Web applications. The focal point of this document is the Web Storage System, which is the underlying storage technology for Exchange 2000 and future Microsoft products. The Web Storage System is a development platform for building and providing the following key knowledge services:

  • Search and deliver

  • Collaboration

  • Document management

  • Tracking and workflow

For additional information, please refer to the Building Knowledge Management Solutions white paper.

Microsoft Solutions Framework: Services-Based Application Model

To establish a foundation for the following discussion of how to design KM solutions, we will briefly summarize the Microsoft Solutions Framework (MSF) services-based application model based on MSF white papers. For additional information, please see the Microsoft Solutions Framework white papers.

MSF advocates a service-based application model for designing and implementing distributed components and business solutions. The term "service-based application model" means that the application's functionality is defined as a collection of services. In the MSF view, an application is constructed from a logical network of consumers and suppliers of services. In this model, a consumer might be a user or another service component. These services can be distributed across both physical and functional boundaries to support the needs of many different applications.

What is a service? A service is a unit of application logic that implements an operation, function, or transformation that is applied to an object. Services can enforce business rules; perform calculations or manipulations on data; and expose features for entering, retrieving, viewing, or modifying information.

To further refine the distribution characteristics of the network of services, the MSF application model defines three categories of services that make up an application:

  1. User services are the units of application logic that provide an interface for an application. The user of an application can be a user or another application; therefore, an application's interface might be a graphical user interface (GUI) and/or an application programming interface (API).

  2. Business services are the units of application logic that control sequencing and enforcement of business rules and the transactional integrity of the operations they perform. Business services transform data into information through the appropriate application of business rules.

  3. Data services are the units of application logic that provide the lowest visible level of abstraction, used for the manipulation of data. Data services maintain the availability and integrity of both persistent and nonpersistent data as a corporate asset. They provide create, read, update, and delete services so that business services (the consumers of data services) need not know where the data is located, how it is implemented, or how it is accessed.

MSF Design Process

The process of designing business solutions can be compared to the process of designing and constructing a building. A good architect gets to know the client to understand what the client wants. In system design, as in architecture, multiple views describe the end product. Each view is prepared for a different audience and contains different levels of detail. The same is true for KM solutions design—there are different application focuses and skill sets. We need a way for designers—who concentrate on user interface, business process, or database issues—to coordinate and synchronize their work, allowing them to bring their specialized expertise to bear in an effective and organized manner that produces a holistic balanced design.

The MSF design process has three phases:

  1. Conceptual design

  2. Logical design

  3. Physical design

Conceptual design

Conceptual design is about clearly understanding the problem to be solved and framing a solution to that problem in terms that both management and users can understand. It is a broader view of the problem than just gathering requirements. It is also about keeping those requirements in context and making rational decisions.

Conceptual design distills the essential tasks and information required to carry out the activities of the business, resulting in a view of the solution that is both process-focused and user-centric.

In MSF, scenarios are the key results of the conceptual design process. A scenario describes a behaviorally related sequence of tasks or transactions that a user performs in some business context. Scenarios must capture the requirements of the business solution (process-focus) in terms of the users who are responsible for the work (user-centric).

Logical design

Logical design is the process of describing the solution in terms that define the parts of the system and how they interact. This process organizes the logical structure of a new system and illustrates how the system is assembled and its interfaces with the outside world.

The logical design process must promote a greater understanding of the system by the project team. This is the primary consideration when determining the level of detail that should be included in the design. Logical design provides the organization and structural rules required for independent team members to work effectively in parallel and provide the basis for coordination with external projects and architects.

Logical design provides a baseline for evaluating various physical design options. The organization of the logical elements can potentially be achieved through a variety of physical designs. Work on the logical design will overlap with work on the physical design in an iterative process. This allows the team to incrementally optimize the system.

The goal in logical design is to lay out the parts in the system, describe how they are connected, and define what one can do with each of them. Remember that conceptual and logical designs are tightly related. Logical design describes how the system accommodates each scenario of the conceptual design.

The team starts the logical design process by defining the major modules of the system. A module represents some collection of processes that work together to accomplish a task. The team must specify each element, the responsibilities of each element, and how each element interacts with other elements. The output consists of:

  • Core functional areas or elements

  • Activities and functions of those areas

  • Connections between areas

Physical design

Physical design is the process of describing components, services, and technologies of the solution from the perspective of the development team. The purpose of the physical design is to apply real-world technology constraints to the logical model, including implementation and performance considerations.

The output of the physical design process is a set of components, user interface design for a particular platform, and a physical database design. Physical design provides the basis for the functional specification that the development team, the testing team, and the deployment team can use as a basis for quality assurance.

The physical design process contains the individual steps of research, analysis, rationalization, and specification:

  • Physical design research involves determining physical constraints of the infrastructure and physical requirements of the solution, and managing risks from this conflict between physical constraints and requirements.

  • Physical design analysis involves selecting candidate implementation technologies and drafting a preliminary deployment model composed of network, data, and component topologies.

  • Physical design rationalization involves determining a packaging and distribution strategy, decomposing objects into services-based components, distributing components across topologies, and refining packaging and distribution.

  • Physical design specification involves determining the programming model, specifying the component interface, and understanding component structure considerations.

KM Solutions Design Model

So far, we have examined the key concepts of building KM solutions and the MSF application model and design process. Now it's time to put everything together to focus on how to design KM solutions based on the MSF design process.

We will use the MSF service-based application model as a roadmap for our discussion. In designing a typical KM solution, we have to ponder questions such as:

  • What are our design objectives?

  • Do we have scenarios defined to capture user and business requirements?

  • Do we have enough information to define a set of services and their interfaces?

  • What are the infrastructure and technological constraints once we decide the implementation technologies?

  • Do we have the object model defined?

The following table illustrates an example of a KM solution design model based on a fictional Exchange 2000 sample application.

Table 1 Knowledge management solution design model

User Services

  • Conceptual design (scenario): Building community forums, with the flexibility to add forums dynamically on demand.

  • Logical design (objects/services): A Web-based virtual community containing the following services: industry news, collaboration, best practices, shared contacts, and easily located information.

  • Physical design (components/technologies): An Exchange 2000-based digital dashboard with Web parts corresponding to the services defined in the logical design.

Business Services

  • Conceptual design (scenario): Request-for-quote (RFQ) document review and approval process.

  • Logical design (objects/services): A generating-RFQ service; transforming the RFQ into an XML document based on the BizTalk Framework; an RFQ approval process.

  • Physical design (components/technologies): Event sinks to generate and validate properties of the RFQ; the workflow engine to implement the RFQ approval process; XMLHTTP, the XML DOM, and XSLT to implement the RFQ transformation process.

Data Services

  • Conceptual design (scenarios): A central information repository for all relevant project documents and associated design documents for an engineering organization; allowing team members to share and review documents under a proper security model.

  • Logical design (objects/services): Logical schema design for the Project, Document, and Team-member objects.

  • Physical design (components/technologies): Physical schema design based on the Web Storage System: folder structure, schema folder, custom content classes and properties, and security XML descriptor templates.
Following are some general best practices or recommendations for working on a KM design model. We will cover specific topics in the following sections.

  • During the conceptual design phase, the key focus should be defining scenarios that can capture business processes and requirements. The scenarios should be defined in the context of a business problem space and not a solution space.

  • During the transition from the conceptual design to the logical design, the development team can go through a set of scenarios and apply sound object-oriented (OO) design techniques, such as use case analysis, to identify candidate services and/or objects. These candidate services/objects form a basis for the logical design model. This is typically an iterative process, that is, it may take several passes to complete. Figure 4 illustrates a sample use case diagram based on Microsoft Visio® 2000 online documentation. This diagram originally appeared in object-oriented analysis and design materials written by Craig Larman for ObjectSpace (http://www.objectspace.com).

    Figure 4: Sample use case diagram

    During the logical design phase, the team should focus on services and objects design, independent of technologies and platforms. This is difficult for most developers. Some development teams might be tempted to skip the logical design phase altogether and jump straight to the physical design. This definitely is not a good idea. A logical design model provides many benefits, such as:

    • Providing the organization and structural rules required for independent team members to work effectively in parallel.

    • Serving as a basis for coordination with external projects and architects.

    • Reducing complexity.

    • Providing opportunities for optimizing the design based on user requirements (that is, scenarios).

  • During the transition from logical design to physical design, a development team can start the physical design using draft services and objects defined during the logical design phase. All the team members and other stakeholders of the project must first understand what the solution and the overall structure of the system should look like, including interconnection among different parts of the system. An interaction diagram (sequence diagram) as defined in the Unified Modeling Language (UML) is a preferred way to capture the dynamic interaction aspect of the system. Figure 5 shows a sample sequence diagram taken from the Microsoft Visio 2000 online documentation. This diagram originally appeared in object-oriented analysis and design materials written by Craig Larman for ObjectSpace (http://www.objectspace.com).

  • During the physical design phase, the development team should focus on design considerations that can optimize or improve the design model. The rest of this paper focuses on the best practices for design considerations that are specific to a Web Storage System.


    Figure 5: A sample sequence diagram

Best Practices for User Services Design

As stated earlier, user services are the units of application logic that provide an interface for an application. The design activities are normally centered on a graphical user interface (GUI) and/or an application programming interface (API). The following is a list of best practices for designing user services with a Web Storage System:

Identify common user services by examining key usage scenarios

During the logical design phase, the team can examine the usage scenarios, in particular those scenarios that interact with users (or "actors" in UML). In most cases, it will be fairly easy to identify user services from the scenarios. However, finding reusable user services will require some additional effort and experience.

Example: In designing an employee KM portal Web site, it is possible to reuse the content search user service and the taxonomy selection user service throughout the Web site.

Use the new digital dashboard framework

A quick overview on digital dashboards: A digital dashboard is a customized solution for knowledge workers that consolidates personal, team, corporate, and external information and provides easy access to analytical and collaborative tools. Companies can quickly build and deploy their own customized digital dashboard solutions using the Digital Dashboard Resource Kit (DDRK) 2.0. The DDRK includes all the necessary tools and documentation, sample dashboards, and components ready to be used in a variety of digital dashboards.

A digital dashboard is made up of Web parts—reusable components that can contain any kind of Web-based information. Web parts are easy to build; end users can create simple Web parts in a dashboard. Developers can create more complex Web parts using the Web Part Builder.

Digital dashboard applications typically have an enhanced user interface that combines familiar Microsoft Office features with easy-to-use Web browser-style controls. Users are one click away from simple tools that allow them to customize their digital dashboards, create new Web parts, or import Web parts from Web part libraries on the Internet or on a local intranet. Figure 6 shows a digital dashboard for a fictional company called Adventure Works. This digital dashboard contains Web parts that display the user's Inbox, MSN Messenger Service, calendar, and critical information about the company.


Figure 6: A sample digital dashboard

Implement common user services as Web parts

At the heart of the digital dashboard are Web parts. Web parts are reusable components that contain Web-based content such as XML, HTML, and scripts, and have a set of standard properties that controls how the Web parts are rendered in a digital dashboard. These properties make Web parts and dashboards storage-neutral and completely reusable.

Because Web parts adhere to a common standard, you can store them in libraries from which you can draw to assemble all digital dashboards in your organization. Many Web part and dashboard properties are user-specific, but as an administrator, you can control the extent to which a user can modify Web parts or dashboards.

Define a consistent look and feel through UI design guidelines

It is a good idea to define a set of UI design guidelines, that is, a consistent look and feel for your KM solutions. For example, to provide a better user experience with your KM portal Web site, you need to design consistent UI for common KM user services such as:

  • Navigation service

  • Content search service

  • Taxonomy selection service

  • Content presentation service

Use Outlook Web Access as much as possible

You can create customized Web pages simply by reusing pieces of Outlook Web Access. These pieces can be embedded in a Web page. Tables, frames, and iFrames can be used to arrange the Outlook Web Access pieces. Outlook Web Access provides default views for everyday tasks—for example, viewing an Inbox or an Outbox.

Default views can be manipulated by specifying additional parameters. For example, a Web page could contain the user's Inbox and the group calendar. It could have a brand or company identifier in one corner and current news and links to internal tools in another corner. Because using Outlook Web Access can significantly reduce your development effort, you should use it as much as possible.
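The embedding described above amounts to composing folder URLs with query-string parameters and placing them in frames or iFrames. The sketch below builds such URLs; the server and mailbox names are hypothetical, and the Cmd/view parameter names shown are illustrative of OWA 2000's URL syntax and should be verified against the Exchange 2000 documentation.

```python
# Sketch: composing Outlook Web Access URLs for embedding in a dashboard
# page. The server and mailbox names are hypothetical; the Cmd and view
# parameter names are illustrative and should be checked against the
# Exchange 2000 documentation.
from urllib.parse import urlencode

OWA_ROOT = "http://mail.adventure-works.com/exchange"  # hypothetical server

def owa_view_url(mailbox, folder, **params):
    """Build a URL that renders a mailbox folder, with optional view parameters."""
    url = f"{OWA_ROOT}/{mailbox}/{folder}/"
    query = urlencode(params)
    return f"{url}?{query}" if query else url

# The user's Inbox, rendered with a contents view:
inbox = owa_view_url("walson", "Inbox", Cmd="contents")
# A calendar in a weekly view, suitable for an <iframe> Web part:
calendar = owa_view_url("walson", "Calendar", Cmd="contents", view="weekly")
print(inbox)
print(calendar)
```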

Use Web Storage System forms for custom content classes and properties

If you have designed your own custom content classes and properties, you should consider using Web Storage System forms. The Web Storage System form is a Web-based form technology that is built on Internet standards. A Web Storage System form is a Web page that is registered in the Web Storage System. The registration itself is a single record in the Web Storage System store.

Web Storage System forms are designed to work with browsers that comply with the HTML 3.2 standard. Browsers that support those features include Microsoft Internet Explorer 3.0 or later, and Netscape Navigator 3 or later.

What is so special about Web Storage System forms? Web Storage System forms are:

  • Data-centric. A browser requests the URL of an item from the store. The store executes the form that is appropriate for the item requested.

  • Adaptive. A form only needs to know how to handle a particular language, browser, or operation. The store adapts the request to ensure that the appropriate form is executed.

What is a form anyway? It is a fairly loose term, typically related to an HTML page bound to data (in a store) over the HTTP protocol. A more formal definition is "A process that may interact with the user via HTML and may manipulate data, all in a response to a user's actions, communicated over HTTP," according to Jamie Cool, Program Manager of the Microsoft Exchange Product Group.

It is a good idea to understand how the Web Storage System form works. Everything in the Web Storage System is URL-addressable. When accessed on the Web, Outlook Web Access provides a default rendering for all items in the Web Storage System. The forms registry allows developers to override the default rendering in Outlook Web Access.

When Exchange receives an HTTP request from the user's browser, the request is transferred to Microsoft Internet Information Services (IIS). IIS invokes an ISAPI DLL. This is the same DLL that the Web Storage System uses to process all HTTP/DAV requests. The ISAPI DLL checks the form registry for a form registration. The form registration provides a set of form-specific attributes, such as content class, user action, language, browser type, item state, and two important attributes:

  • Execute URL. The URL to execute to render a form. It could be an ISAPI filter (example: /exchweb/bin/exwform.dll) or an ASP page (example: process.asp).

  • Form URL. The URL of the form or template being handled and rendered; the item that is denoted by the current URL (examples: ExpenseForm.htm, ECOform.ASP).

Information read from the HTTP request header is processed and compared against browser information stored in Browscap.ini to derive browser capabilities. The ISAPI DLL uses a best-fit comparison with the forms registry information to determine which form to display.
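The best-fit comparison can be pictured as scoring each registration by how many of its attributes explicitly match the request, and disqualifying any registration that contradicts it. The real matching logic lives inside the Web Storage System ISAPI DLL; the sketch below is a simplified illustration using the registration attributes described in the text.

```python
# Simplified sketch of a forms-registry best-fit lookup. The actual
# algorithm is internal to the Web Storage System ISAPI DLL; attribute
# names mirror the form registration attributes described in the text.

def score(registration, request):
    """Count explicit attribute matches; contradiction disqualifies."""
    points = 0
    for attr in ("content_class", "action", "language", "browser"):
        wanted = registration.get(attr)
        if wanted is None:
            continue            # attribute not constrained: neutral
        if wanted != request.get(attr):
            return -1           # explicit mismatch: disqualified
        points += 1             # explicit match: a more specific fit
    return points

def best_fit(registrations, request):
    """Return the most specific matching registration, or None."""
    scored = [(score(r, request), r) for r in registrations]
    scored = [(s, r) for s, r in scored if s >= 0]
    return max(scored, key=lambda sr: sr[0])[1] if scored else None

registry = [
    {"content_class": "urn:content-classes:document", "action": "open",
     "form_url": "docform.htm"},
    {"content_class": "urn:content-classes:document", "action": "open",
     "browser": "IE5", "form_url": "docform-ie5.htm"},
]
request = {"content_class": "urn:content-classes:document",
           "action": "open", "browser": "IE5"}
print(best_fit(registry, request)["form_url"])  # the more specific IE5 form wins
```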

For additional information about the Web Storage System forms, please refer to the Exchange 2000 SDK.

Best Practices for Business Services Design

As stated earlier, business services are the units of application logic that control sequencing and enforcement of business rules and the transactional integrity of the operations they perform. The following is a list of best practices for designing business services with a Web Storage System:

Identify key business processes through usage scenarios

During the logical design phase, the development team should review the scenarios they gathered during the conceptual design phase to identify business processes, such as a document approval process or a content transformation process.

Determine implementation mechanisms

During the physical design phase, the development team needs to determine the most suitable implementation mechanisms for these business processes. There are basically four options: workflow engine, event sink, COM+ component, and scripts (either client-side or server-side scripts). The script approach creates some difficulties, such as difficult-to-maintain code and the limitations of scripts. Hence, we recommend the first three approaches. The following is a set of guidelines for determining an implementation approach:

  • Use workflow if the business processes are:

    • Involved with multiple users and multiple resources.

    • Complicated processes such as approval or business validation processes.

  • Use event sinks when there are:

    • Small numbers of users or resources involved.

    • Simple validation processes.

    • Storewide events.

    • Timer events.

  • Use COM+ components if there are mostly read operations rather than update operations with the Web Storage System, and if no workflow is involved.
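The guidelines above can be condensed into a small decision helper. The inputs and rules below are a simplified reading of those guidelines, not a formal policy.

```python
# The mechanism-selection guidelines above, sketched as a decision helper.
# The boolean inputs and the rule ordering are a simplified reading of the
# guidelines in the text, not a formal rule set.

def choose_mechanism(multi_user, complex_process, mostly_reads):
    """Suggest an implementation mechanism for a business process."""
    if multi_user and complex_process:
        return "workflow"        # approval/validation spanning users and resources
    if mostly_reads and not complex_process:
        return "COM+ component"  # read-heavy access, no workflow involved
    return "event sink"          # simple validation, storewide, or timer events

print(choose_mechanism(multi_user=True, complex_process=True, mostly_reads=False))
```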

Best Practices for Data Services Schema Design

As stated earlier, data services are the units of application logic that provide the lowest visible level of abstraction, used for the manipulation of data. Data services maintain the availability and integrity of both persistent and nonpersistent data as a corporate asset. We will discuss Web Storage System schema design in this section and the folder structure in the next section.

First, what is a schema? And why is schema design critical for the overall physical design model based on the Web Storage System technology? The term schema means a method for defining and organizing the data (sometimes called metadata). In a Structured Query Language (SQL) relational database, the schema includes all the table definitions and column definitions, plus additional information such as indexing and triggers. With the store, we focus the schema design on the content class and the property set associated with a content class. The schema design has a direct impact on overall success of a KM solution, particularly in terms of performance and extensibility. Schema design is normally the first step for defining the data services model. Many of the design considerations and design decisions depend on the schema design. The following paragraph provides a brief introduction to the Web Storage System schema.

The Web Storage System gives you the ability to define schemas for your applications. Web Storage System schemas center around the content class. The content class defines the schema class for the item/instance in the store and is the logical container for the property set. When creating schema definitions for your applications, you define your custom content classes and associated properties. The Web Storage System comes with a large number of predefined content classes and properties. You can either use or extend (subclass) one of these predefined content classes when creating your own custom content class. These include, but are not limited to, the content classes listed in Table 2.
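A schema definition is itself stored as an item in a schema folder, so a custom content class can be created by setting properties on such an item over WebDAV. The sketch below builds a PROPPATCH body for a content-class definition item; the folder URL, class name, and custom property are hypothetical, and the exch-data property names follow the Exchange 2000 SDK's schema conventions but should be verified there before use.

```python
# Sketch: a WebDAV PROPPATCH body that could define a custom content class
# as a schema definition item. The schema folder URL, custom class name,
# and custom property are hypothetical; the exch-data property names
# should be verified against the Exchange 2000 SDK.

SCHEMA_FOLDER = "/public/projects/schema/"  # hypothetical schema folder

def contentclass_def(name, extends, elements):
    """Build a PROPPATCH body for a content class definition item."""
    extend_xml = "".join(f"<x:v>{e}</x:v>" for e in extends)
    element_xml = "".join(f"<x:v>{p}</x:v>" for p in elements)
    return (
        '<?xml version="1.0"?>'
        '<d:propertyupdate xmlns:d="DAV:" '
        'xmlns:x="urn:schemas-microsoft-com:exch-data:">'
        "<d:set><d:prop>"
        "<d:contentclass>urn:content-classes:contentclassdef</d:contentclass>"
        f"<x:name>{name}</x:name>"
        f"<x:extends>{extend_xml}</x:extends>"
        f"<x:element>{element_xml}</x:element>"
        "</d:prop></d:set>"
        "</d:propertyupdate>"
    )

body = contentclass_def(
    "urn:content-classes:projectdoc",    # custom class (hypothetical)
    ["urn:content-classes:document"],    # subclass the built-in document class
    ["urn:schemas:projects:projectid"],  # custom property (hypothetical)
)
print(body)
```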

Table 2 Content classes

Content Class                          Description
urn:content-classes:item               Base class for any item in the store
urn:content-classes:message            Base class for messages
urn:content-classes:calendarmessage    Base class for meeting requests and responses
urn:content-classes:appointment        Base class for appointments
urn:content-classes:person             Base class for contact items
urn:content-classes:folder             Base class for folders
urn:content-classes:document           Base class for Microsoft Office documents

One benefit of Web Storage System schemas is that they provide a way for schema-aware applications and tools to discover the names of content classes and properties that apply to a particular application.

The schema information that pertains to a particular application is controlled through a folder's schema scope. A folder's schema scope is a set of folders, traversed in a particular order, that contain schema definition items. You can extend the schema on a folder-by-folder basis by defining the list of folders in the Web Storage System in which you are storing your schema information. The scope can be simple, consisting of only the global schema folder, or complex, containing a large list of folder URLs.

The following two properties define a folder's schema scope and are also important for the overall schema design, particularly the folder structure, which is the topic of the next section.

  • schema-collection-ref (SCR). This property is a URL for a folder in which to search for content class and property definitions. This is the first folder searched for schema definition items and is always the first folder in a folder's schema scope. If this property is not set, the default is that store's non_ipm_subtree/Schema folder, which contains the Web Storage System default schema definition items.

  • baseschema. This property is a multivalued string containing URLs for one or more folders. You extend the schema scope for a folder by identifying other folders that contain schema definition items.

In addition to defining custom content classes, defining custom properties is another important aspect of schema design. Although the Web Storage System provides many predefined properties, you can store any number of additional properties with each item; these are called custom properties. You can define a different set of custom properties for each item.

A custom property is saved with its associated item and can be requested by name when examining the item. When you bind directly to items using the Exchange OLE DB provider or ADO, or when you issue a PROPFIND command with a depth of 0 through the HTTP/WebDAV protocol, the Web Storage System returns all custom properties for the item. Custom properties are not visible to an SQL 'SELECT *' statement or a PROPFIND command for all item properties with a depth of 1, unless they are defined as part of the item's content class. Therefore, to make your properties discoverable to schema-aware applications, you must add both property and content class definitions to the application folder's schema scope.
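To make the depth semantics concrete, the following sketch composes the raw text of a WebDAV PROPFIND request. The server name and item path are hypothetical; a real client would send this over HTTP against an Exchange 2000 virtual root.

```python
# Sketch: composing a WebDAV PROPFIND request for a single item (Depth: 0).
# The host and path below are invented for illustration.

def build_propfind(host: str, path: str, depth: int) -> str:
    """Return the raw HTTP text of a PROPFIND 'allprop' request."""
    body = (
        '<?xml version="1.0"?>\n'
        '<D:propfind xmlns:D="DAV:">\n'
        '  <D:allprop/>\n'
        '</D:propfind>\n'
    )
    headers = (
        f"PROPFIND {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Depth: {depth}\r\n"  # 0 = this item only (all custom properties returned)
        f"Content-Type: text/xml\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return headers + body

request = build_propfind("myserver", "/public/expenses/report1.eml", 0)
```

A depth of 0 asks for the item itself, which is the case in which the Web Storage System returns all custom properties regardless of content class definitions.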

Next we present a summary of some general schema design guidelines. If you are not familiar with the terms URN, URI, and URL, you might want to look at the following definitions before you proceed.

URI, URN, and URL

A Uniform Resource Identifier (URI) is simply a formatted string that uniquely identifies a resource. A URI can be a Uniform Resource Locator (URL) or a Uniform Resource Name (URN).

  • URLs encode the underlying protocol needed to locate the resource being identified.

  • URNs, on the other hand, are location-independent and in no way imply any protocol or mechanism for locating the resource being identified.

URLs begin with a prefix that identifies the protocol followed by a protocol-specific string. In the case of HTTP URLs, the syntax is as follows:

  "http://" <host> [":" <port>] [<path> ["?" <query>]]

<host> is the IP address of the server, <port> is the TCP port number the server is listening on, and <path> is the absolute URI to be passed as the Request-URI in the HTTP request. The optional <query> corresponds to the query string suffix, an ampersand-delimited list of key/value pairs. Only the host part of the URL is mandatory. If no port is specified, port 80 is assumed; if no path is specified, the Request-URI will be '/'.

URNs are much less understood, yet are critical to building modern, Internet-friendly applications. There is no generic way to dereference a URN to find the resource it identifies. The syntax of a URN is structured to ensure that URNs are unique across disparate organizations. The syntax is as follows:

  "urn:" <NID> ":" <NSS>

<NID> is the namespace identifier, and <NSS> is a namespace-specific string. URNs are the preferred mechanism for identifying things that are location-independent. URLs are the preferred mechanism for identifiers that also need to contain location information.
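As a quick illustration of these rules, the sketch below classifies a string as a URN or an HTTP URL and applies the defaults described above (port 80, Request-URI of "/"). It is a simplified parser, not a full RFC-compliant one.

```python
# Sketch: splitting a URI into its parts per the syntax described above.
from urllib.parse import urlsplit

def parse_uri(uri: str) -> dict:
    """Classify a URI as a URN or an HTTP URL and break out its parts."""
    if uri.startswith("urn:"):
        nid, _, nss = uri[4:].partition(":")
        return {"kind": "urn", "nid": nid, "nss": nss}
    parts = urlsplit(uri)
    return {
        "kind": "url",
        "host": parts.hostname,
        "port": parts.port or 80,   # port 80 assumed if unspecified
        "path": parts.path or "/",  # Request-URI defaults to "/"
        "query": parts.query,
    }

print(parse_uri("urn:content-classes:item"))
print(parse_uri("http://myserver/public?author=lee"))
```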

Schema Design Guidelines

Guidelines for schema design are as follows:

Use and define the namespaces (URNs)

It is a good practice to define properties and content classes using namespaces. Purposes of namespaces include:

  • To help ensure that property and class names are globally unique; that is, to solve the problems of recognition and collision. This is particularly important if you have multiple applications deployed at the same time and for independent software vendors (ISVs) to deploy their applications in a large organization.

  • To indicate the individual or organization that "owns" the property or class definition.

In the predefined properties and classes that ship with Exchange 2000, you will notice a number of different styles of namespaces:

  • urn:schemas:httpmail:

  • urn:schemas-microsoft-com:exch-data:

  • urn:schemas-microsoft-com:office:office

  • http://schemas.microsoft.com/exchange/

The first three namespaces are all URNs. The first example is intended to be a well-known published namespace for interoperability among schema-aware applications.

The second two are proprietary URNs. If you wanted to create a namespace like this for your application, you could create urn:schemas-mycompanysdomain-com:myapplication:.

The difference between the second and third namespace is that the second namespace has a namespace delimiter at the end and the third one does not. If a namespace ends with a delimiter of ":" or "/", then to create a property or content class name, the property name is appended to the namespace. For example, a property in the second namespace is urn:schemas-microsoft-com:exch-data:ismultivalued.

If a namespace does not end in a delimiter (as in the third case), then to create a property name in that namespace, a "#" is placed between the namespace and the property name. For example, a property in the third namespace is urn:schemas-microsoft-com:office:office#Author.

The last namespace example shows how to use URLs as namespaces. You should choose URL-based namespaces from URLs that you own or have registered. This will help to ensure unique namespaces. When URLs are used as namespaces, the final delimiter is a "/" character.
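The delimiter rules above can be captured in a small helper. The examples reuse the Microsoft namespaces listed earlier.

```python
# Sketch: forming a fully qualified property name from a namespace plus a
# local name. If the namespace ends in ":" or "/", the name is appended
# directly; otherwise "#" is inserted between the two.

def qualified_name(namespace: str, local_name: str) -> str:
    if namespace.endswith((":", "/")):
        return namespace + local_name
    return namespace + "#" + local_name

# Examples drawn from the namespaces listed above:
print(qualified_name("urn:schemas-microsoft-com:exch-data:", "ismultivalued"))
print(qualified_name("urn:schemas-microsoft-com:office:office", "Author"))
```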

You should not add properties or content classes to namespaces that you do not own. For example, it is not a good practice to add properties to the http://schemas.microsoft.com/exchange/ or DAV: namespace. Instead, you should create your own namespaces for your content classes and properties.

Work on properties definition

The Web Storage System itself places no special limitations on the characters allowed in property names. However, there are a few conventions that are good to follow:

  • Properties should be created using a namespace (as discussed above) plus an identifier. For example, urn:schemas-sample-com:engineering:eco.

  • Properties should be well-formed URIs.

  • Property names should not have spaces in them because XML (and therefore HTTP-DAV) does not support spaces in element names.

Define custom content classes

After these custom property definitions are done, the next step is to define your custom content classes. First you need to choose a folder for your application to store schema information. You can store this in the same folder as your application data, but it is highly recommended that you use a separate subfolder, which we will refer to as a schema folder. If the schema you are defining is not specific to a single application, you can define it in a top-level schema folder in the relevant public store.

Where you store and how you organize your schema definitions is up to you. However, we are going to offer a set of recommendations in the next section on how to organize your folder structure and how you can determine which schema definitions apply to a particular set of application data.

The following diagram illustrates a sample custom content class definition by using one of the Exchange 2000 SDK tools, Web Storage System Schema Designer. We recommend that you use this tool (or a similar one) to define your custom content class definitions and property definitions.


Figure 7: A sample schema design

Consider inheritance of content classes

You can certainly define a brand new custom content class from scratch. However, most content classes will extend ("inherit from") an existing content class. To extend a content class means that all of the properties defined on the extended (base) content class also exist on instances of the extending (derived) content class. This is similar to class inheritance in object-oriented (OO) programming languages such as C++.

Figure 8 shows a simple inheritance scenario. Extending the document class means that any code or operation that can be performed on an instance of the document class can also be performed on an instance of the expensereport class.

Figure 8: Simple content class inheritance

Figure 9: Content classes with multiple inheritance

Content classes can also extend more than one content class. In Figure 9, we still have an expensereport class that has the properties totalcost and approvalstate. However, in this scenario we want to be able to have expense reports that are a special class of document and expense reports that are a special class of message. So we create expensereport as its own class. Then we create an expensemessage class and an expensedocument class, which have no additional properties of their own. Expensemessage extends both expensereport and message, while expensedocument extends both expensereport and document. Now, applications that understand the message class can understand some properties of the expensemessage class and treat the rest as custom. Applications that understand the document class can understand some of the expensedocument class properties.
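The inheritance scenarios in Figures 8 and 9 can be modeled as property-set union: an instance carries its class's own properties plus those of every base class. The dictionary below is a hypothetical in-memory model with a few illustrative property names, not the store's actual schema items.

```python
# Sketch: content-class extension as property-set union, mirroring the
# expensereport scenario above. Property names are invented for illustration.

classes = {
    "item":            {"bases": [], "props": {"displayname"}},
    "document":        {"bases": ["item"], "props": {"author"}},
    "message":         {"bases": ["item"], "props": {"subject"}},
    "expensereport":   {"bases": [], "props": {"totalcost", "approvalstate"}},
    "expensemessage":  {"bases": ["expensereport", "message"], "props": set()},
    "expensedocument": {"bases": ["expensereport", "document"], "props": set()},
}

def all_props(name: str) -> set:
    """Properties on an instance: own props plus those of every base class."""
    cls = classes[name]
    props = set(cls["props"])
    for base in cls["bases"]:
        props |= all_props(base)
    return props

print(sorted(all_props("expensemessage")))
# ['approvalstate', 'displayname', 'subject', 'totalcost']
```

An application that understands only the message class still recognizes the subject and displayname properties of an expensemessage instance and treats the rest as custom.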

For additional information about schema design, please refer to the Exchange 2000 SDK and the white paper, "Web Storage System Schema: Usage and Best Practice Guide."

Best Practices for Web Storage System Folder Structure

The Web Storage System provides a great deal of flexibility for designing folder structure. Schema definition items can be placed in any folder in a particular store and then used to define schemas for your application. By appropriately setting the schema-collection-ref and baseschema properties for various folders, you can bring these definitions into scope.

To avoid complexity when designing and managing your application's schema, plan and organize your schema information. For example, one option is to create folders under your top application folder that are designated as schema folders. There are many ways to design a folder structure. The following steps outline this process.

Consider the following:

  • Complexity of a logical model. As discussed earlier, there are many ways to design a schema for the store. You should examine all relevant information and design options before you settle on a final design decision.

  • Complexity of a physical model. You should consider the difficulty of maintaining a complicated folder structure.

  • Performance impact, which we will discuss in a later section.

  • Reuse and sharing of the schema.

Separate the schema folder from other types of folders, such as:

  • Application folder, which contains ASP pages, HTML pages, Web Storage System forms, and so on.

  • Data (content) folder, which contains data items or documents.

  • Form registration folder, which contains form registration items.

Typically, schema definitions for a particular application will be placed in their own folder. It's also a good idea to have a separate application folder and a data folder. As discussed previously, the schema-collection-ref (SCR) and the baseschema properties of a particular folder determine the schema scope. There are many flexible ways to design a folder structure.

Examples of folder structures are:

  • A simple application can have one folder combining both application files (ASP, HTML pages) and data items, and one schema folder.

  • A slightly more complicated application can have a separate application folder, data folder, and schema folder.

  • There can be different levels of schema folders, such as a top-level schema folder covering all applications running under the same root. Or there can be a chain of schema folders; that is, each schema folder refers to another schema folder via the SCR.

Define the schema folder.

  • Define commonly used content classes and properties definitions.

  • Define form registration. (This might be in a separate form registration folder.)

When creating schema folders, you have to determine the scope of those folders. That is, which data folders will a given schema folder apply to? A schema folder can be used by any number of data folders. Conversely, a data folder can have a schema applied to it that is defined in multiple schema folders.

As discussed earlier, the schema-collection-ref is a property that can be set on data folders to indicate which schema folder to search first to find relevant property and content class definitions. The baseschema property forms a tree of schema folders to search for schema definitions. Each node in this logical tree of schema folders can have any number of children. This logical tree of schema folders can (and often will) differ from the physical layout of folders in a store. The schema-collection-ref property indicates, relative to a given data folder, in which schema folder this search begins.

Define application and data folders.

  • Point the schema-collection-ref (SCR) property to the schema folder if necessary.

  • Use the baseschema and expected-content-class properties as needed.
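The schema-scope search order described in this section can be sketched as follows: look first in the folder named by schema-collection-ref, then walk the tree of folders reachable through each schema folder's baseschema property. The folder layout here is a hypothetical in-memory dictionary, not a real store.

```python
# Sketch: resolving a data folder's schema scope. SCR is searched first;
# baseschema links are then followed breadth-first, skipping duplicates.

def schema_scope(folders: dict, data_folder: str) -> list:
    """Return schema folders in the order they are searched."""
    scr = folders[data_folder].get("schema-collection-ref")
    order, queue = [], [scr] if scr else []
    while queue:
        folder = queue.pop(0)
        if folder and folder not in order:
            order.append(folder)
            queue.extend(folders.get(folder, {}).get("baseschema", []))
    return order

folders = {
    "/app/data":      {"schema-collection-ref": "/app/schema"},
    "/app/schema":    {"baseschema": ["/schema/common"]},
    "/schema/common": {"baseschema": ["/non_ipm_subtree/Schema"]},
}
print(schema_scope(folders, "/app/data"))
# ['/app/schema', '/schema/common', '/non_ipm_subtree/Schema']
```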

SQL and the Web Storage System

In this section, we examine some major differences between SQL relational databases and the Web Storage System, discuss when to use SQL and when to use the Web Storage System, and offer a set of guidelines for migrating data from existing SQL databases to the Web Storage System.

Table 3 illustrates the differences between SQL databases and the Web Storage System. Note that a common set of services applies both to SQL databases and to Microsoft products based on the Web Storage System, such as Exchange 2000.

Table 3 SQL databases and the Web Storage System

SQL Database                            Web Storage System
Relational database                     Similar to an object database
Structured data                         Semi-structured data
Tables                                  Folders (and content classes)
Columns                                 Content classes and properties
Fixed rows                              No fixed rows
Focus on business intelligence          Focus on collaboration
Transaction-centric                     Document-centric
Data integrity: primary/foreign keys    No primary/foreign keys
Rectangular shape of data               Nonrectangular shape of data
Triggers                                Event sinks
Stored procedures                       No comparable entity (an event sink is similar)

If your existing KM solutions get their data from a SQL relational database and the data is typically nonrectangular and semi-structured, you might want to consider migrating this data from SQL to the Web Storage System.

Use the following guidelines to migrate data from SQL to a Web Storage System:

  • First, map SQL columns to properties and SQL tables to folders and content classes.

  • Identify primary keys and foreign keys of tables. Find their corresponding content classes from the logical design model.

  • Simulate primary keys by identifying a set of properties that can produce a unique instance of an item.

  • Consider inheritance of content classes.
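As a small illustration of the third guideline, the sketch below simulates a SQL primary key by combining the properties that together identify an item uniquely. The property names in the urn:schemas-sample-com:hr: namespace are invented for the example.

```python
# Sketch: simulating a primary key in the Web Storage System by combining
# the set of properties that produce a unique instance of an item.

def composite_key(item: dict, key_props: list) -> tuple:
    """Build a tuple that stands in for the row's primary key."""
    return tuple(item[p] for p in key_props)

expense = {
    "urn:schemas-sample-com:hr:employee-id": "E1042",
    "urn:schemas-sample-com:hr:report-date": "2001-06-14",
    "urn:schemas-sample-com:hr:totalcost": 312.50,
}
key = composite_key(expense, [
    "urn:schemas-sample-com:hr:employee-id",
    "urn:schemas-sample-com:hr:report-date",
])
print(key)  # ('E1042', '2001-06-14')
```

An application can use such a tuple to detect duplicates or to correlate migrated items with the original SQL rows.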

Physical Design Considerations

So far, we have discussed best practices for designing user services, business services, and data services. In this section, we will concentrate on design considerations specific to the Web Storage System during the physical design phase.

For each design consideration topic, we will introduce important concepts, discuss some of the trade-offs you can make, and present a checklist for you to use in considering different options.

Security Model

This section provides an overview of the Web Storage System security model. In addition to using a MAPI client (such as Outlook) or Windows file system APIs to control security settings, you can control access to an item and its properties by using an XML-based security descriptor. Using Web Storage System security descriptors, you can:

  • Both grant and deny a trustee (a person with credentials who is accessing the item) access rights to an item and its properties.

  • Identify trustees using a Microsoft Windows® security identifier (SID).

  • Set, retrieve, and modify the descriptor in XML format.

  • Access the descriptor using both the Microsoft Exchange OLE DB (ExOLEDB) provider and the HTTP/WebDAV protocol in XML format.

Each item's security descriptor is accessed through the item's http://schemas.microsoft.com/exchange/security/descriptor property. This property is the item's descriptor in XML format. The descriptor is physically stored and replicated in an Exchange 2000 Server-specific binary format, which is internally based on the standard Windows 2000 descriptor format. If an item in a folder has no specific security descriptor, the default rights specified in the parent folder apply to the item.

A security descriptor is a data structure that consists of, among other things:

  • An owner field that contains the security identifier (SID) of the owner of the object with which the security descriptor is associated.

  • A discretionary access control list (DACL) field that specifies who has access to the object.

  • A system access control list (SACL) field that specifies which actions are audited by the system.

An access control list (ACL) consists of one or more access control entries (ACEs); each ACE specifies access rights for a security principal. A security principal in Windows 2000 can be either a user or a group of users. Exchange 2000 defines a new security principal, called a role. A role is a named collection of security principals (users or groups) that can be referred to within an ACE in an ACL. The key difference between roles and Windows 2000 groups is that Exchange security roles are defined and stored on the object itself. That is, they can be created and populated with members without requiring a privileged directory service operation.

This feature is important for applications that do not require that specific Windows 2000 groups exist prior to deployment. Security roles are created and stored in the Web Storage System independent of the Windows 2000 directory service, with two distinct advantages for application developers:

  • Creation of roles does not require privileged Active Directory™ operations—a key requirement for departmental scenarios.

  • Because roles are scoped to particular folders (or to applications that are also scoped to folder hierarchies), their names are required to be unique only within the scope of the folder. Therefore, two applications deployed on a single computer running Exchange Server do not require unique role names and memberships.

Roles act as placeholders in applications because they are only referred to in the application at design time. They are evaluated at run time, so application developers can defer the population of these roles until the application is deployed. This means that application developers need not recompile their applications for each deployment instance. As noted above, roles can be used anywhere in the Web Storage System where an ACL can be set, that is, on folders and items.

Security roles are implemented by populating a user-defined "role membership" property on an item or folder (object or parent object) with a list of security principals (users, groups, or roles). In particular, the list consists of security identifiers (SIDs) where these SIDs represent the security principals and form the membership list for a given role. A role SID differs from a Windows 2000 SID in that Exchange Server, rather than Windows, is the security authority for it. Role SIDs are self-contained structures that do not contain any Windows 2000 security-specific information and hence can be used across domains. A role SID encodes two pieces of information:

  • Property. A role property contains the list of SIDs to which the ACE applies. The SIDs in this list can be both Windows 2000 SIDs and role SIDs.

  • Scope. This information indicates where to read the role property.

For additional information about security roles, please refer to Web Storage System Security Roles.

When designing a security model for a KM solution with a Web Storage System, you should consider the following list:

Identify security requirements from the conceptual and logical design models

  • Users—their roles and content access requirements

  • Business requirements, such as privacy and other legal requirements

  • External access

The term "role" here implies business role, not the Exchange role that we discussed earlier. The idea is to gather as much information as possible before you start to define the security model.

Define security policy

  • Group users according to business requirements

  • Define user roles

  • Define the application security model—for example, who can create events or workflows

  • Define the content security model—for example, item-level security, property-level security

  • Define external access—for example, firewalls and integration with public key infrastructure (PKI)

Performance

Performance characteristics of KM solutions typically differ from those of online transaction processing solutions. Rather than emphasizing raw transaction throughput, KM solutions focus on performance characteristics such as:

  • Response time for returning search results or locating specific content.

  • Response time for creating, categorizing, and indexing content.

  • Response time for handling a business process (such as an approval process).

As discussed earlier, the Web Storage System offers powerful features and flexibility for building KM solutions. To take advantage of these features you need to pay attention to the performance impact of how data is accessed and processed.

Follow these guidelines to obtain optimal performance:

Use search folders for frequently executed deep traversal searches

Using search folders can greatly increase your application's performance if you have a hierarchical folder structure and your application must perform a deep traversal search of the hierarchy. You might also want to set up search folders as a way to break up large folders into (logical) smaller folders.

For example, consider a KM application that keeps track of various project documents generated by employees in a company. Initially there are tens of thousands of documents to track and these documents are organized in a hierarchy by subject matter. Documents are frequently added to or removed from the system on any day. Through the course of a regular business day employees frequently search against all of the documents in the store to find documents written by a particular set of authors. Doing these searches against the complete hierarchy is very expensive due to the large number of records and folders that the search must navigate through. In this case the expensive search of all of the records can be done once and the results can be used to populate a search folder.

The search folder now contains all of the documents in the store that were written by these authors. When documents are added to or removed from the store (and the hierarchy) the Web Storage System updates the search folder when necessary. When employees want to search against the documents in the store, they search against the search folder instead of against the hierarchy. This way they can search against all of the relevant records but the search doesn't have to navigate through the entire hierarchy.

Using search folders will result in a slight performance decrease for adding/updating/removing items that exist within the scope of the query used to create the search folder. This delay is due to the search folder updates that must be done after each operation. For this reason, it is recommended that search folders only be used for frequently run queries where the data being searched against is updated infrequently.

Index properties that are frequently searched

Property-level indexing is the best way to improve the performance of frequently executed searches. Indexing a property reduces the cost of evaluating any expression in the where clause that uses that property, which can improve the performance of the search by an order of magnitude. For information on creating indexes, see the "Property Indexing" topic in the Exchange 2000 SDK.

Property-level indexing will only help the performance of searches that use the indexed property in the where clause of the search. When using property-level indexes, applications will experience slight performance degradation (approximately one percent per index) when inserting/updating/removing items from the folders that have had indexes created. This decrease is due to the cost of updating the index information for that folder to reflect the update operation. Due to this issue, to maximize application performance indexes should only be created on properties that are frequently searched against.

Only do a "SELECT *" operation when it is absolutely necessary

Executing a "SELECT *" operation requires the Web Storage System to look up the schema for the item being searched against to know which set of properties to return. The schema calculation can be expensive and can end up being a large part of the total cost of the search request. The application can avoid this extra cost by requesting only the columns that it needs.

For example, if an item has an associated custom schema that contains the properties "DAV:displayname" and "DAV:lastmodified," the first search below will execute much faster than the second search, although they both will return the same data.

  • SELECT "DAV:displayname", "DAV:lastmodified" FROM SCOPE('SHALLOW TRAVERSAL OF "http://myserver/public"')

  • SELECT * FROM SCOPE('SHALLOW TRAVERSAL OF "http://myserver/public"')

Do a hierarchical traversal when searching only against folders

Applications can improve the performance of a search by specifying that the search should use hierarchical rather than deep traversal if searching against folders. For example, the searches below will both return the same resources but the first search below will return faster and use fewer server resources than the second search.

  • SELECT "DAV:displayname" FROM SCOPE('HIERARCHICAL TRAVERSAL OF "http://myserver/public"')

  • SELECT "DAV:displayname" FROM SCOPE('DEEP TRAVERSAL OF "http://myserver/public"') WHERE "DAV:iscollection" = true

Specify multiple shallow scopes if possible, rather than executing a deep traversal search

It is more efficient to do multiple shallow traversal searches than one deep traversal search. Deep traversal searches require the Web Storage System to lock the hierarchy being searched (to prevent the hierarchy from changing while the search is executing); a shallow traversal search doesn't have this constraint. When constructing the search request, you can specify multiple scopes in the query. All of the scopes listed must be of the same type (that is, either all deep or all shallow). See the "Search Scope" topic in the Exchange 2000 SDK for more information.
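A query with several shallow scopes might be assembled as in the following sketch. The folder URLs are hypothetical, and the string follows the SELECT ... FROM SCOPE syntax used elsewhere in this article.

```python
# Sketch: building a multi-scope shallow traversal query instead of a
# single deep traversal. Folder URLs are invented for illustration.

def shallow_scopes_query(columns: list, folder_urls: list) -> str:
    cols = ", ".join(f'"{c}"' for c in columns)
    scopes = ", ".join(
        f"'SHALLOW TRAVERSAL OF \"{url}\"'" for url in folder_urls
    )
    return f"SELECT {cols} FROM SCOPE({scopes})"

query = shallow_scopes_query(
    ["DAV:displayname"],
    ["http://myserver/public/reports", "http://myserver/public/archive"],
)
print(query)
```

Each listed folder is searched one level deep, so the Web Storage System never has to lock an entire hierarchy for the duration of the search.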

Use synchronous events sparingly

Synchronous events are a powerful tool to use when creating Web Storage System applications, but they also can have a dramatic performance impact. When a synchronous event is executing against a given resource all other operations against that resource are blocked until the event has completed. Where possible, use asynchronous events instead. Asynchronous events don't provide all of the functionality of synchronous events, but they have the advantage of not serializing access to the resources on which they act.

Performance tune COM+ components used with events

When implementing events with COM+ components, follow the normal practices for performance-tuning COM+ components.

Scalability and Availability

Exchange 2000 offers a great deal of improvement in scalability and availability over the previous version of Exchange Server. The Exchange 2000 features that contribute to scalability and reliability of applications include:

  • Multiple stores and storage groups, which reduce backup and restore time, and extend both scalability and reliability.

  • Active/active clustering, which focuses on improving reliability and accessibility for KM applications.

  • Network load balancing, which represents another form of clustering that focuses on distributing network traffic across multiple servers rather than concentrating on safeguarding accessibility in case of server failure (as in active/active clustering).

  • Distributed front-end/back-end server configuration architecture, which allows for partitioning services onto multiple servers and, in the process, allows Exchange to scale up to meet the needs of large scale enterprises, ISPs, and ASPs.

  • Use of Windows 2000 Active Directory for security, which addresses centralized management along with reliability.

  • Web Storage System, which through its reliance on storage groups, multiple databases, clustering, and native MIME storage, offers both scalability and reliability combined with IIS integration and rapid delivery of streaming media files. In addition, Exchange takes care of replication of stores, which can provide additional availability if necessary.

For further information about the preceding features, please refer to Exchange 2000 online documentation and the following white papers:

  • Microsoft Exchange 2000 Clustering

  • Exchange 2000 Front-end and Back-end Topology

  • Best Practices for Developing ASP-Hosted Exchange Services

Guideline Review

In summary, the following is a list of guidelines for designing KM solutions with Exchange 2000/Web Storage System:

Create and partition multiple stores for different applications

Exchange 2000 now supports the ability to host multiple public folder trees, or top-level hierarchies (TLH). In addition, Exchange 2000 replicates each tree to only a single public folder store per server. Because of this more limited replication, administrators can now restrict selected sets of public folders to individual servers for better control and to improve the availability of applications running on these stores.

Set up and protect transaction logs on a separate drive

To ensure fault-tolerance and the ability to recover stores even after a server failure, Exchange 2000, like earlier versions, relies on transactions recorded in a transaction log file. Transaction logs serve as failsafe intermediaries between memory and databases on disk. We recommend the following set of best practices in setting up and managing transaction logs and databases:

  • Protect the drives against hardware failure through a means such as disk mirroring (RAID) to ensure against possible data loss.

  • Keep the transaction logs and databases (stores) on separate drives.

  • Maximize performance by making the number of transaction log drives per server equal to the number of storage groups, and place each transaction log set on a separate spindle.

  • Always format the file system for NTFS.

  • Leave circular logging turned off.

Set up front-end servers for processing incoming protocol requests and back-end servers for Web stores

In this type of configuration, front-end servers can be dedicated to handling incoming client connections and protocol requests, while back-end servers can be dedicated to managing the databases (stores). This ability to spread the load across multiple servers is what allows Exchange to scale upward to meet the needs of millions of users. The separation of operations also contributes to the reliability of the overall system, because vital components are no longer concentrated on a single server or a small set of servers.

Use clustering and Network Load Balancing services for availability

As discussed above, clustering is the practice of grouping servers in such a way that—even though they are separate, independent computers—they appear to the network as a single unit. They work cooperatively to ensure that if one fails, another is ready to take over and continue providing service. In Exchange 2000, clustering is active/active, meaning that Exchange services can run simultaneously on all the servers in a cluster. If one fails, another takes over the failed server's responsibilities, as well as continuing to handle its own.

Note: Clustering requires either Windows 2000 Advanced Server or Windows 2000 Datacenter Server.

The Windows 2000 Network Load Balancing service is a software-based load-balancing solution, as opposed to hardware-based alternatives. The Network Load Balancing service intercepts incoming TCP/IP traffic and distributes it evenly across the servers in a load-balancing cluster. The cluster is assigned a single IP address (or a set of addresses if the host is multi-homed, that is, connected to multiple networks). If a host fails or is taken offline, the load balancing service redirects traffic to the working hosts and, because the client retries when a connection is lost, the end user experiences no more than a second or two of delay before a working server responds.

Design events and workflow processes with availability in mind

To address the application availability requirement—keeping the application running in the event of nonfatal errors—you should pay attention to error handling, particularly in event sinks, COM+ components, and workflow. The idea is to recover most of the nonfatal errors without losing service.
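As an illustration, the following VBScript fragment sketches defensive error handling in a script-based OnSyncSave event sink, assuming the Exchange 2000 script host sink's event signature. The logging routine LogError and the processing step are placeholders invented for this sketch, not part of any documented API.

```vb
' Sketch of a script-based store event sink that recovers from
' nonfatal errors instead of losing service. LogError is hypothetical.
Sub OnSyncSave(pEventInfo, bstrURLItem, lFlags)
    On Error Resume Next

    Dim oRecord
    Set oRecord = CreateObject("ADODB.Record")
    oRecord.Open bstrURLItem, , 3   ' 3 = adModeReadWrite

    If Err.Number <> 0 Then
        ' Nonfatal: record the failure and return without
        ' aborting the save transaction.
        LogError "OnSyncSave could not open " & bstrURLItem & _
                 " (0x" & Hex(Err.Number) & ")"
        Err.Clear
        Exit Sub
    End If

    ' ... validate or stamp custom properties on the item here ...

    oRecord.Close
End Sub
```

The key design choice is that only fatal conditions should be allowed to fail the transaction; everything else is logged and the sink returns cleanly.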

Implementation of Taxonomy

Managing and implementing taxonomy is one of the key tasks that KM solutions need to perform well. The term taxonomy refers to the methods used to categorize and organize content, typically a core service within the content management module of a KM solution.

There are multiple approaches to implementing taxonomy with a Web Storage System. Here we discuss three approaches with some pros and cons.

Active Directory

You can extend the Active Directory schema to contain taxonomy, such as content information related to users or employees and organization or geography information. This approach works well with less-frequently updated content information, which typically applies to the whole enterprise. It also has the benefit of using the Active Directory infrastructure. On the other hand, the disadvantages of this approach include:

  • It does not work well with frequently updated content information or with specific information that does not apply to the whole enterprise.

  • It will be difficult to manage the Active Directory schema extension if the taxonomy changes.

  • There is extra overhead for replicating the Active Directory schema extensions throughout the enterprise.

Folder and Content Classes

In this approach, you define content classes and an appropriate folder structure (either a hierarchy or a flat folder structure as discussed earlier) to store the content information. This approach works well with frequently updated content information. However, you should pay attention to the following potential pitfalls of this approach:

  • Performance for retrieving content information is heavily dependent on how you index the custom properties and on how you design SQL queries, as discussed in the Performance section.

  • In some cases, changing the taxonomy or replacing the taxonomy with a new one would be a difficult task that might require folder structure changes.
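For instance, retrieving all items in a given category typically involves a Web Storage System SQL query over a custom property, along the following lines. The file: URL and the property namespace urn:schemas-example-com:km are illustrative, not real schema names.

```sql
SELECT "DAV:href", "DAV:displayname"
FROM SCOPE('shallow traversal of
    "file://./backofficestorage/example.com/public folders/km/content"')
WHERE "urn:schemas-example-com:km#category" = 'Whitepaper'
```

Whether such a query performs well depends directly on whether the custom property in the WHERE clause is indexed, which is why the schema and query design decisions discussed in the Performance section matter here.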

Search Folder

The search folder approach is similar to the content class approach except that the taxonomy is stored in search folders. As discussed earlier, search folders are dynamic and their contents are constantly reevaluated and updated by the Web Storage System. In other words, search folders contain pointers (similar to symbolic links) to the physical objects.
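One way to create a search folder, sketched below, is to create the folder over WebDAV and then set its DAV:searchrequest property to the SQL criteria with a PROPPATCH request. The server name, folder URLs, property namespace, and exact body format here are assumptions for illustration; consult the Web Storage System documentation for the precise request format your store expects.

```vb
' Sketch: define a search folder by setting its DAV:searchrequest
' property. All URLs and the SQL criteria are illustrative.
Dim objRequest, strBody
Set objRequest = CreateObject("MSXML2.ServerXMLHTTP")

' 1. Create the folder itself.
objRequest.open "MKCOL", "http://server/public/km/whitepapers/", False
objRequest.Send

' 2. Set DAV:searchrequest to the SQL that defines its contents.
strBody = "<?xml version=""1.0""?>" & _
    "<d:propertyupdate xmlns:d=""DAV:""><d:set><d:prop>" & _
    "<d:searchrequest><d:sql>" & _
    "SELECT ""DAV:href"" FROM SCOPE('shallow traversal of " & _
    """http://server/public/km/content""') " & _
    "WHERE ""urn:schemas-example-com:km#category"" = 'Whitepaper'" & _
    "</d:sql></d:searchrequest>" & _
    "</d:prop></d:set></d:propertyupdate>"

Set objRequest = CreateObject("MSXML2.ServerXMLHTTP")
objRequest.open "PROPPATCH", "http://server/public/km/whitepapers/", False
objRequest.setRequestHeader "Content-Type", "text/xml"
objRequest.Send strBody
```

From then on, the store reevaluates the criteria itself, so the folder's contents track the taxonomy without any application code.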

Integration with Line-of-Business Applications

Real-life KM solutions must integrate with other line-of-business (LOB) applications. For example, a KM solution needs to obtain source content and data from other LOB applications, such as human-resources applications or manufacturing-process applications. In another scenario, a KM solution might be required to post a specific request to another Internet Web site, which serves as a virtual community for a particular industry.


Figure 10: Conceptual diagram for integration with LOB applications using XML schema

There are many ways to integrate with LOB applications, depending on the technologies and platforms involved. Here, we advocate an XML-based approach to address integration needs. This approach is based on the fact that the Web Storage System supports XML natively and is well suited for XML-based interfaces with other LOB applications.

The following tasks are recommended for this XML-based approach:

Understand the business requirements for interfacing with other LOB applications

The first step, as in any design process, is to understand other LOB applications and the interface requirements to integrate your KM solutions with these applications.

Define an XML schema

There are two levels of XML schema design:

  1. The internal XML schema that you use for your KM solution. This schema design is mostly related to how you design the store schema and other design considerations discussed previously.

  2. The external XML schema for interfacing with other LOB applications. This schema design should focus on what the other LOB applications can handle with the XML schema.
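As an illustration of the external level, an interchange document for a hypothetical RFQ posting might look like the following. Every element name and the namespace URN are invented for this sketch; the real vocabulary would come from your schema agreement with the other LOB application.

```xml
<?xml version="1.0"?>
<rfq xmlns="urn:schemas-example-com:rfq">
  <number>1001</number>
  <requestedBy>Contoso Manufacturing</requestedBy>
  <dueDate>2001-07-01</dueDate>
  <item>
    <partNumber>AM-2250</partNumber>
    <quantity>500</quantity>
  </item>
</rfq>
```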

Use BizTalk XML Framework and/or BizTalk Server Orchestration

If the other LOB application supports certain industry-specific standards, you can probably find one of the predefined XML schemas in BizTalk Framework (see http://www.microsoft.com/biztalk for additional information). Alternatively, you can use the new BizTalk Server 2000 business orchestration process features to tackle the integration task.

For additional information about BizTalk Server 2000, see the Microsoft BizTalk 2000: Building A Reverse Auction with BizTalk Orchestration article.

Use XSLT to transform XML schema

BizTalk Server 2000 can transform one XML schema to another for you. However, if you decide to do the transformation yourself, XSLT (Extensible Stylesheet Language Transformations) is well suited to the task. There is a great deal of XML and XSLT information on the MSDN XML Developer Center.
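As a simple sketch, the following XSLT stylesheet maps one element from a hypothetical internal schema onto the external vocabulary while copying everything else through. Both namespace URNs and the element names are invented for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:in="urn:schemas-example-com:km-internal">

  <!-- Identity template: copy nodes and attributes unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>

  <!-- Map the internal <in:title> element to the external <subject>. -->
  <xsl:template match="in:title">
    <subject><xsl:apply-templates/></subject>
  </xsl:template>

</xsl:stylesheet>
```

Real mappings are rarely one-to-one, but the identity-plus-overrides pattern shown here keeps the stylesheet small as the schemas evolve.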

Use WebDAV to post the XML document to Internet or intranet Web sites

If you are not using BizTalk Server and want to post an XML document to another Web site, you should use WebDAV to do so.

The following Visual Basic® Scripting Edition (VBScript) code snippet illustrates posting an XML document to another Web site (note that it does not show the security setting).

Set objRequest = CreateObject("MSXML2.ServerXMLHTTP")
strURL = strCommunityURL & "AutoMan-" & strRFQnumber & ".xml"
objRequest.open "PUT", strURL, False
objRequest.setRequestHeader "Translate", "f"
objRequest.setRequestHeader "Content-Type", "text/xml"
objRequest.Send objXmlDoc

Define the security model for interfacing with other LOB applications

Typically, different LOB applications have different security requirements. The process is straightforward if both applications are based on Windows 2000 Active Directory. However, in some cases the applications run on mixed platforms with different security models. One approach to this mixed-environment issue is to introduce wrappers and an intermediate XML schema that describes the security models.

Determine an error handling process

The error handling process is often overlooked during integration. You need to handle your own application's errors and expect the other applications to handle theirs. It is also wise to anticipate the worst-case scenario: the other application fails and no error message is returned.

In addition, if you have certain control of the other LOB application, you can define an XML schema describing how both sides handle error scenarios and the proper procedures to interact with error conditions.
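For instance, a shared error-reporting document for such an agreement might look like the following. The element names, namespace, and recovery vocabulary are invented for this sketch; the real contract would be whatever both sides define.

```xml
<?xml version="1.0"?>
<error xmlns="urn:schemas-example-com:km-error">
  <code>E-RFQ-0042</code>
  <severity>nonfatal</severity>
  <description>RFQ document failed schema validation</description>
  <recovery>resubmit</recovery>
</error>
```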

Identify performance bottlenecks and conduct performance tuning

After you integrate successfully with another LOB application, you might want to conduct performance tests to see if it meets your performance requirements. If there is a problem, you can carry out performance tuning as follows:

  • Conduct performance tests.

  • Identify performance bottlenecks.

  • Devise a solution to remove the performance bottlenecks.

  • Modify the schema design or other parts of the design if necessary.

  • Repeat the tuning process until you meet the performance targets.

Conclusion

The Web Storage System offers many new development features for building KM solutions. In this article, we have examined the characteristics of KM solutions, the MSF service-based design model and design process, best practices for design, and specific design considerations. The intent is to guide developers in using some successful ways to design KM solutions with a Web Storage System.
