BizTalk Server 2004 Performance Characteristics

Microsoft Corporation

March 2005

Applies to: BizTalk Server 2004

Summary: This document provides information about the performance characteristics of key Microsoft BizTalk Server 2004 configurations and components, such as messaging, pipeline, and orchestration. (52 printed pages)

This document provides information about the performance of Microsoft BizTalk Server 2004. The BizTalk Server product team derived the performance characterization from thousands of test cases that isolated and measured the performance of individual configurations and components of BizTalk Server 2004. This document presents the results of this testing and explains some of the significant findings. It does not provide any specific instructions about optimizing a particular BizTalk Server 2004 deployment.

BizTalk Server 2004 is a server application that enables businesses to integrate disparate applications and automate business processes by leveraging Extensible Markup Language (XML) standards. BizTalk Server 2004 is also an open-standards platform with simplified tools that enable business users and developers to solve complicated business problems through integration with partners, Web services, and other business-process management systems. BizTalk Server 2004 handles all data communication between the underlying Microsoft Windows platform and the Microsoft SQL Server™ database on behalf of all external applications, services, processes, and systems. BizTalk Server 2004 is commonly used for the following types of applications:

  • Enterprise application integration (EAI)
  • Business-to-business (B2B) commerce
  • Business process management (BPM)

BizTalk Server 2004 supports various vertical industries through solution accelerators, including:

  • Manufacturing, through BizTalk Accelerator for RosettaNet
  • Financial, through BizTalk Accelerator for Financial Services
  • Financial, through BizTalk Accelerator for the Society for Worldwide Interbank Financial Telecommunication (SWIFT)
  • Health care, through BizTalk Accelerator for Health Level 7 (HL7)
  • Health care, through BizTalk Accelerator for the Health Insurance Portability and Accountability Act (HIPAA)

For information about how BizTalk Server 2004 supports these vertical industries, go to http://go.microsoft.com/fwlink/?linkid=28026. In addition, BizTalk Server 2004 includes various application and technology adapters that provide enhanced interoperability, including:

  • BizTalk Adapter for FTP
  • BizTalk Adapter for MQSeries
  • BizTalk Adapter for SQL Server
  • BizTalk Adapter for Web Services

A variety of adapter-development and application-vendor partners have released other adapters that work with BizTalk Server 2004. For information about the various application and technology adapters available for BizTalk Server 2004, go to http://go.microsoft.com/fwlink/?linkid=28027.

To enable the integration of disparate applications and the coordination of logic between business processes, BizTalk Server 2004 transforms and persists all messages in the MessageBox database on SQL Server. Based on the receive adapter that accepts messages from external applications, services, processes, and systems, BizTalk Server uses receive pipelines to convert messages from their external format to XML data. After BizTalk Server processes the messages with orchestrations (or routes them for request messaging), it uses send pipelines to convert the XML data back to the required external format. The send adapter then delivers the messages to the destination applications, services, processes, and systems.

The following figure shows the message flow in BizTalk Server 2004.

Figure 1 Message flow in BizTalk Server


BizTalk Server 2004 provides two main functions: messaging with external applications, services, processes, and systems, and the internal processing of orchestrations. Messaging involves receiving, parsing, routing, and sending messages through user-configured adapters and pipeline components. Orchestration involves capturing business logic and associating it with other applications, services, processes, and systems. The Business Rule Engine provides an interface for business users to control orchestrations by expressing rules and defining business process activities.

The messaging architecture in BizTalk Server 2004 allows administrators, developers, and business users to update various aspects of a BizTalk Server solution (such as a deployment configuration, application component, or business rule) without disrupting service for external applications, services, processes, or systems. Essentially, BizTalk Server 2004 eliminates this downtime by processing messages asynchronously. While this decoupling of messages from origination to destination introduces some latency, it provides flexibility and availability for the integration of different applications, services, processes, and systems.

This document provides performance information about the key run-time components in BizTalk Server 2004, including:

  • The scalability of the message box
  • Performance characterization of orchestrations with varying complexity
  • Performance characterization of various transport adapters (such as the File, HTTP, and Web services adapters)
  • Performance characterization of various pipeline components (such as the XML assembler/disassembler, SMIME/MIME, and flat-file parser)
  • Performance characterization of rule processing with the Business Rule Engine

This document also provides performance information about elements in BizTalk Server 2004 that are not directly related to the run-time components. These other elements include interchanges (large messages containing multiple individual messages) and tracking (application and process monitoring).

Document Objective

This document provides a general indication of how certain configurations and components perform in BizTalk Server 2004. It includes tables, graphs, and charts that show the performance characteristics of key BizTalk Server 2004 components. It describes how these factors affect performance at an individual component level, and provides background information about how the tests were conducted.

By showing you the relative performance of these individual components, this document aims to guide your design decisions so that you can modify your application design or deployment configuration to improve performance. Do not interpret the performance characteristics presented in this document as benchmark measurements that all BizTalk Server 2004 deployments can support.

This document does not describe how individual features, components, or configurations impact the overall performance of any specific deployment or scenario. This document is intended to be a descriptive guide only; it does not provide prescriptive information or recommendations for optimizing a particular BizTalk Server 2004 deployment or scenario.

Intended Audience

This document is intended for anyone who uses, or plans to use, BizTalk Server 2004. Specifically, this document is aimed at technical professionals who design, develop, or deploy applications and solutions based on BizTalk Server 2004. These professionals include:

  • Developers
  • Business users
  • Application designers
  • Technical sales staff and consultants
  • Systems integrators and analysts
  • Network engineers and technicians
  • Information technology (IT) professionals

This document assumes that readers have some experience with BizTalk Server 2004, or are familiar with emerging application integration and business process management technologies and standards. Readers should be familiar with the concepts and topics presented in the BizTalk Server 2004 product documentation, which is updated quarterly and can be accessed at http://go.microsoft.com/fwlink/?linkid=28326.

This document is not intended for users who require assistance with using a particular feature or tool in BizTalk Server 2004; it does not contain procedures for configuring specific settings in BizTalk Server 2004, and it does not prescribe steps for deploying a particular BizTalk Server 2004 solution.

Document Overview

This document has three sections: an introduction, key performance considerations, and component performance. The introduction provides an overview of this document and sets a context for the information that it contains. The key performance considerations section describes the primary factors that impact BizTalk Server 2004 performance, presents a request broker scenario, and provides performance guidelines for BizTalk Server and SQL Server. This section also describes the scalability of BizTalk Server 2004.

The component performance section shows the performance of key components in BizTalk Server 2004 through tables, graphs, and charts. It shows the performance characterization for the following elements in BizTalk Server 2004:

  • Orchestrations
  • Transport adapters
  • Pipeline components
  • Schema complexity
  • Interchanges
  • Tracking
  • Business rules

The component performance section also includes deployment diagrams that show the server hardware and network implementation, and scenario descriptions that explain how each series of tests was run.

While features and components may change between versions, this document reveals the overall performance improvements in BizTalk Server 2004 as compared with previous versions. The next version of BizTalk Server will perform even better as the product team further refines product design and performance.

Key Performance Considerations

This section describes the primary factors that impact Microsoft BizTalk Server 2004 performance and presents a request broker scenario that provides a context for the supporting test cases and statistics. It also provides information about the scalability of BizTalk Server 2004 with general guidelines for optimizing performance, and test results from scaling BizTalk Server 2004. Ultimately, improving performance requires finding the right balance between receiving and processing rates, and minimizing message accumulation in the application queue.

Because BizTalk Server 2004 can be used for such a wide variety of applications, it can be deployed in an infinite number of configurations. Due to the flexibility and sophistication of BizTalk Server 2004, it is highly unlikely that any two BizTalk Server 2004 customers will have exactly the same scenario, application design, and deployment configuration. Without proper knowledge of how the features and components in BizTalk Server 2004 perform, it can be difficult to ensure that your deployment configuration and application design are efficiently leveraging the components of BizTalk Server 2004.

Primary Performance Factors

The following factors have the highest impact on BizTalk Server 2004 performance (in no particular order):

  • Message size
    While BizTalk Server 2004 imposes no restriction on message size, practical limits and dependencies might require you to minimize the size of your messages because large messages require more resources for processing. As message size increases, overall throughput (messages processed per second) decreases. Consider the average message size, message type, and number of messages being processed by BizTalk Server 2004 when designing your scenario and planning for capacity. Do not use unnecessarily long attribute and tag names; if possible, keep the length under 50 characters. For example, do not use a 200-character tag name for a message size of only 1 byte.
  • Orchestration complexity
    The complexity of orchestrations has a significant impact on performance. As orchestration complexity increases, overall performance decreases. An infinite variety of customer scenarios use orchestrations, and each scenario might involve different orchestrations of varying complexity. Test your orchestrations and make sure they perform optimally before implementing them in your deployment. For example, use atomic scopes to group multiple persistence points in your orchestration.
  • Schema complexity
    The throughput for message parsing (especially flat-file parsing) is affected by the complexity of the schemas. As schema complexity increases, overall performance decreases. When designing schemas, keep them simple (for example, reduce the length of node names) and make sure they perform optimally to improve overall performance (for example, move promoted properties to the top of the schema to reduce retrieval time).
  • Map complexity
    Map transformation can be resource intensive depending on the complexity of the maps. As map complexity increases, overall performance decreases. To improve overall performance, minimize the number of fields in your maps.
  • Pipeline components
    Because pipeline components have a significant impact on performance (for example, a pass-through pipeline component performs up to 30 percent better than an XML assembler/disassembler pipeline component), make sure that any custom pipeline components perform optimally before implementing them in your deployment. For example, you can improve overall performance by reducing the message persistence frequency (number of database trips) in your pipeline component, and using proper programming techniques (write high-quality code with minimal redundancy).
  • Tracking data
    The amount of data you track can have a significant impact on performance. As the number of items tracked and the amount of tracking data increases, overall performance decreases. Run the tracking service to move tracking data from the MessageBox database to the Tracking database (BizTalkDTADb). Monitor the Tracking database for disk consumption and growth. Archive and clean up the Tracking database regularly (move, back up, and delete old files).
  • Message-persistence frequency
    BizTalk Server commits write operations to the MessageBox database, which stores messages and state information for particular message instances. This persistence allows BizTalk Server 2004 to ensure data integrity, security, and reliability. As message-persistence frequency (the number of times that data needs to be persisted) increases, overall performance decreases. Whenever possible, reduce the frequency at which data needs to be persisted to the message box. For example, group multiple message-persistence points into one scope.
  • Transport adapters
    While scenarios define the particular transport adapters that are required, different transport adapters have different settings that you can modify to improve performance. For example, separate orchestration and adapter functionality into separate BizTalk Server hosts to minimize resource contention. Plan to accommodate the potential security threats associated with separating BizTalk Server elements into different hosts.

Host configuration is also a significant factor for performance. BizTalk Server 2004 supports the isolation of individual components (including orchestrations, adapters, and pipelines) into separate hosts, which are logical entities with specific functionality. For more information about hosts, see Creating Scalable Configurations in the BizTalk Server 2004 product documentation.

Performance Guidelines for SQL Server

Consider the following performance guidelines when configuring Microsoft SQL Server™ with BizTalk Server 2004:

  • Whenever possible, use a fast disk subsystem with SQL Server. Use a redundant array of independent disks type 5 (RAID5) or a storage area network (SAN) with backup power supply.
  • Use the Backup BizTalk Server SQL Server Agent job to back up your databases regularly. The BizTalk Server service automatically recovers from SQL Server connection failures.
  • Isolate each message box onto a separate server from the Tracking database. For smaller deployments, isolating the message box onto a separate physical disk from the Tracking database might be sufficient.
  • The primary message box could be the bottleneck due to CPU processor saturation or latency from disk operations (average disk queue length). If CPU processing is the bottleneck, add CPU processors to the primary message box.
  • If disk operations are the bottleneck, move the Tracking database to a dedicated SQL Server computer or disk. If CPU processing and disk operations on the primary message box are not the bottleneck, you can create new MessageBox databases on the same SQL Server computer to leverage your existing hardware.
  • Follow SQL Server best practices to isolate the transaction and data log files for the MessageBox and Tracking databases onto separate physical disks.
  • Allocate sufficient storage space for the data and log files; otherwise, SQL Server automatically expands the files when they fill up, which is costly at run time. The initial size of the log files depends on the specific requirements of your particular scenario. Estimate the average file size in your deployment and expand the storage space before implementing your solution (the sketch after this list shows one way to pre-size the files).
  • Allocate sufficient storage space for high-disk-usage databases, such as the MessageBox, Tracking, and Business Activity Monitoring (BAM) databases. If your solution uses the BizTalk Framework (BTF) messaging protocol, allocate sufficient storage space for the BizTalk Configuration database (BizTalkMgmtDb).
  • Periodically, depending on business needs and the volume of data processed in your particular scenario, archive and clean up the Tracking database. The size of this database can degrade performance (especially when one Tracking database supports multiple message boxes) because reaching the full capacity of the database imposes a limit on the rate of data insertion.
  • Scale up the servers hosting the MessageBox and Tracking databases if they are the bottleneck. You can scale up the hardware by adding CPU processors, including additional memory, upgrading to faster CPU processors, running on 64-bit SQL Server, and using high-speed dedicated disks.
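
The exact commands for pre-sizing depend on your server, database, and logical file names; the following Python sketch (using the pyodbc library and Windows authentication, with the server name SQLBOX01 and the logical file names as placeholder assumptions) shows one way to pre-size the MessageBox data and log files before going live.

    import pyodbc

    # Connect to the SQL Server instance that hosts the MessageBox database.
    # The server name and the use of Windows authentication are assumptions.
    conn = pyodbc.connect(
        "DRIVER={SQL Server};SERVER=SQLBOX01;DATABASE=master;Trusted_Connection=yes"
    )
    conn.autocommit = True  # ALTER DATABASE cannot run inside a user transaction.
    cursor = conn.cursor()

    # The logical file names below are assumptions; confirm them first with:
    #   EXEC BizTalkMsgBoxDb..sp_helpfile
    # Pre-size the data file to 10 GB and the log file to 5 GB (the values used
    # in the messaging scale-out tests later in this document).
    cursor.execute(
        "ALTER DATABASE BizTalkMsgBoxDb "
        "MODIFY FILE (NAME = BizTalkMsgBoxDb, SIZE = 10240MB)"
    )
    cursor.execute(
        "ALTER DATABASE BizTalkMsgBoxDb "
        "MODIFY FILE (NAME = BizTalkMsgBoxDb_log, SIZE = 5120MB)"
    )
    conn.close()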

Performance Guidelines for BizTalk Server

While the default settings in BizTalk Server 2004 provide optimal performance for many hardware and software configurations, it might be beneficial in some scenarios to modify the settings or deployment configuration. When configuring BizTalk Server 2004, consider the following performance guidelines:

  • To prevent resource contention, isolate messaging, orchestration, and tracking onto separate hosts. To further minimize contention, isolate the tracking service and all transport adapters onto separate hosts.
  • If CPU processing on the BizTalk Server is the bottleneck, scale up BizTalk Server by including additional CPU processors or upgrading to faster CPU processors.
  • To decrease HTTP request-response latency, create a DWORD value named HttpBatchSize and set it to 1. Create this value under the HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\BTSSvc.3.0\HttpReceive registry key.
    This setting specifies the batch size that the HTTP receive adapter uses to submit requests to BizTalk Server. Ordinarily, the HTTP receive adapter waits to accumulate multiple messages and submits them together; it submits messages to BizTalk Server when the maximum batch size is reached or when a preset waiting period elapses. Setting HttpBatchSize to 1 causes the HTTP receive adapter to submit messages as soon as they are received (the registry sketch after this list shows how this value and the MessagingThreadsPerCpu value described later can be created).
  • To decrease end-to-end latency, reduce the MaxReceiveInterval value in the adm_ServiceClass table of the BizTalkMgmtDb database from the default value of 500 to a value less than 100 (or any positive integer) for the following service classes:
    • XLANG/s
    • Messaging In-Process
    • Messaging Isolated
    These settings specify the maximum polling interval (in milliseconds) at which the messaging agent polls the message box. Microsoft does not support the direct modification of these values; you must use the tool at http://go.microsoft.com/fwlink/?linkid=30076.
  • To adjust the performance characteristics of the BizTalk Server engine, modify the following columns in the adm_ServiceClass table of the BizTalkMgmtDb database:
  • HighWatermark, LowWatermark. These two settings determine the outbound processing rate for messages. They represent high and medium stress-level thresholds, respectively. Both settings define the number of messages processed by BizTalk Server 2004, but not yet consumed by subscribers. When BizTalk Server processes more messages (not yet consumed) than specified by the HighWatermark threshold, it stops processing messages from the message box until the number of active messages decreases below the LowWatermark threshold (a conceptual sketch of this high/low behavior appears after this list).
  • HighMemorymark, LowMemorymark. These two settings control the memory thresholds at which BizTalk Server starts and stops processing messages. Both settings define the percentage of overall memory consumed. They affect both inbound and outbound throughput. When BizTalk Server memory consumption reaches the level defined by the LowMemorymark threshold, BizTalk Server increases the stress level. If memory consumption reaches the level defined by the HighMemorymark threshold, then BizTalk Server stops processing messages until memory consumption is reduced.
    These settings also have an impact on orchestrations. BizTalk Server stops creating new orchestrations when the memory consumption reaches the HighMemorymark threshold. BizTalk Server resumes creating new orchestrations when the memory consumption falls back below the LowMemorymark threshold.
  • HighSessionmark, LowSessionmark. These two settings determine the inbound processing rate for messages. They represent high and medium stress-level thresholds, respectively. Both settings define the number of parallel database sessions that are persisting messages to the message box. When the number of sessions specified by the HighSessionmark threshold is exceeded, BizTalk Server blocks incoming messages until the number of sessions decreases below the LowSessionmark threshold.
    The following table shows the default values for these threshold marks.

Table 1 Default values for threshold marks

Threshold     Component                                                                   Default Values
Watermark     Messaging In-Process (File, MSMQ); Messaging Isolated (HTTP, SOAP); MSMQT   LowWatermark: 100; HighWatermark: 200
Watermark     Orchestration                                                               LowWatermark: 10; HighWatermark: 20
Memorymark    Messaging In-Process (File, MSMQ); Messaging Isolated (HTTP, SOAP); MSMQT   LowMemorymark: 90; HighMemorymark: 100
Memorymark    Orchestration                                                               LowMemorymark: 100; HighMemorymark: 100
Sessionmark   Messaging In-Process (File, MSMQ); Messaging Isolated (HTTP, SOAP)          LowSessionmark: 5; HighSessionmark: 10
Sessionmark   MSMQT                                                                       LowSessionmark: 12; HighSessionmark: 15

  • To control the number of threads per CPU processor that BizTalk Server uses to process incoming message batches, create a DWORD value named MessagingThreadsPerCpu. Create this value under the HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\BTSSvc.3.0 registry key for all isolated host processes, or under the HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\BTSSvc{guid of the host} registry key for in-process hosts. Larger values increase CPU processor utilization in the receive host. Smaller values might improve performance if there is excessive context switching in the receive host.
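
As an illustration only, the following Python sketch uses the standard winreg module to create the HttpBatchSize and MessagingThreadsPerCpu values described in this list. The key paths are taken from the guidelines above; the MessagingThreadsPerCpu value of 2 is a placeholder rather than a recommendation, and editing the registry requires administrative rights and a registry backup.

    import winreg

    def set_dword(subkey_path, name, value):
        """Create (or open) the subkey under HKEY_LOCAL_MACHINE and set a DWORD value."""
        key = winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE, subkey_path, 0,
                                 winreg.KEY_WRITE)
        winreg.SetValueEx(key, name, 0, winreg.REG_DWORD, value)
        winreg.CloseKey(key)

    # Submit HTTP requests to BizTalk Server one at a time instead of batching them.
    set_dword(r"SYSTEM\CurrentControlSet\Services\BTSSvc.3.0\HttpReceive",
              "HttpBatchSize", 1)

    # Limit the number of messaging threads per CPU for the isolated host
    # processes; the value 2 is only an example, not a recommendation.
    set_dword(r"SYSTEM\CurrentControlSet\Services\BTSSvc.3.0",
              "MessagingThreadsPerCpu", 2)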

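The watermark settings described above behave as a simple high/low hysteresis: work stops when the backlog crosses the high threshold and resumes only after it falls back below the low threshold. The following Python sketch is a conceptual model of that behavior only, not BizTalk Server code; the default orchestration values from Table 1 are used for illustration.

    class WatermarkThrottle:
        """Conceptual high/low watermark hysteresis (not actual BizTalk Server logic)."""

        def __init__(self, low, high):
            self.low = low
            self.high = high
            self.dequeuing = True  # whether work is being pulled from the message box

        def update(self, unconsumed_messages):
            """Decide whether to keep dequeuing based on the current backlog."""
            if self.dequeuing and unconsumed_messages >= self.high:
                self.dequeuing = False   # crossed HighWatermark: stop pulling work
            elif not self.dequeuing and unconsumed_messages < self.low:
                self.dequeuing = True    # fell below LowWatermark: resume pulling work
            return self.dequeuing

    # Example with the default orchestration thresholds (LowWatermark 10, HighWatermark 20).
    throttle = WatermarkThrottle(low=10, high=20)
    for backlog in [5, 15, 20, 25, 18, 12, 9, 5]:
        print(backlog, "dequeue" if throttle.update(backlog) else "paused")
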
Request Broker Scenario

Before presenting the performance data and statistics, this document describes a request broker scenario to provide a context for the supporting test cases and statistics. Not all tests used this specific scenario, but all of them used some variation of it. Of course, the tests involving orchestrations included orchestrations in the scenario. The Component Performance section later in the document provides scenario descriptions that briefly explain the basic variations.

A request broker in BizTalk Server 2004 accepts requests, routes them to one or more proxy handlers (representing individual departments of a company, for example), and manages the state of all requests in a consistent manner. BizTalk Server 2004 masks the complexities behind integrating disparate applications and systems such that clients simply submit requests and track their status. This concept is similar to a translator who facilitates communication between people who speak different languages. Without the translator, these people would not have a way to communicate or conduct business, or they might invest time and money to learn another language.

A request broker interfaces with the individual proxy handlers and processes asynchronous requests from clients to back-end systems. It presents a single interface with a single message type, accepts requests from clients, determines how to route each request to the appropriate back-end system, and transforms the requests into the format required by the back-end system. It also manages the life cycle of the requests and tracks their status.

For example, a company with many departmental systems might provide online services to extranet clients. These back-end services can include human resources, payroll, and customer relationship management systems. Because each department maintains its own system, all of the departments can have different transport protocols and messaging interfaces. A request broker enables, manages, and simplifies the interaction between these disparate systems. The different back-end systems can have various resources and response times. The following figure shows the deployment topology for the request broker scenario.

Figure 2 Deployment topology


Each BizTalk Server has the following hardware configuration:

  • Dual 2.8 gigahertz (GHz) Xeon processors (hyperthreaded)
  • 2.0 gigabytes (GB) of RAM
  • 170 GB of hard disk space (123 GB free)
    • Five 35.0 GB disks with disk speeds of 15,000 revolutions per minute (RPM)
    • Two spindles per disk

Each SQL Server has the following hardware configuration:

  • Dual 2.0 GHz Xeon processors (hyperthreaded)
  • 4.0 GB of RAM
  • Storage area network (SAN) with write caching disabled to ensure data integrity
    • Four mirrored disks (2 x 2 spindles for reliability)

The message box for the request broker uses an eight-processor server. Only the message boxes for the proxy handlers are dual processor.

Request Broker Scenario Description

To simulate departmental servers, each proxy handler runs a basic orchestration that receives documents (4 KB each), sends back acknowledgments (less than 1 KB each), waits for a few minutes, and sends back the modified documents to the request broker. The following figure shows the orchestration diagram in BizTalk Server 2004 Orchestration Designer.

Figure 3 Orchestration diagram


In this scenario, all inbound and outbound messages use the File adapter for both request broker and proxy handlers. Three hosts are deployed on the request broker: receive, processing (orchestration), and send. Incoming documents are saved on a file share, which is configured as a receive location on the receive host. Outgoing messages are saved on file shares as well. Proxy handler responses are saved on a shared location that is accessible by both servers of the request broker. The following figure shows the data flow diagram for the request broker scenario.

Figure 4 Data flow diagram


The following steps describe the logic in this data flow.

  1. A file is dropped in the receive directory on the communication server. BizTalk Server in the request broker picks up the file.
  2. The BizTalk Server sends the file to the Proxy folder on the proxy server.
  3. The BizTalk Server sends a document to the Transmit folder for notification that the file was sent to the proxy.
  4. The proxy server sends an acknowledgment to the Ack folder on the communication server for notification that the file was received.
  5. The BizTalk Server in the request broker picks up the acknowledgment message.
  6. The proxy server determines whether to accept or reject the file, and sends a reply to the Reply folder on the communication server.
  7. The BizTalk Server in the request broker picks up the reply message.
  8. BizTalk Server sends a message to the Transmit folder with the message details.

This series of tests measures the throughput of the request broker as more servers are added. In addition to scaling the request broker, the proxy servers are checked to ensure that they are not causing bottlenecks during test runs. The following four configurations are tested in this scenario:

  • One request broker with one proxy
  • Two request brokers with two proxies
  • Three request brokers with two proxies
  • Four request brokers with four proxies (two groups of two servers each)

Request Broker Test Results

For these tests, CPU utilization on the BizTalk Server orchestration hosts reached 100 percent with one, two, or three request brokers, resulting in near-linear scaling of throughput. Scaling from one to two BizTalk Server computers improved performance by about 50 percent, and adding a third or fourth BizTalk Server computer improved performance by about 20 percent each. The bottleneck was largely attributed to CPU processor utilization and lock contentions in the message box on SQL Server. The following table shows the performance data from the request broker scenario.

Table 2 Performance data from the request broker scenario

Request Brokers   Average Percent CPU Time (BizTalk Server)   Average Percent CPU Time (SQL Server)   Documents per Second
1                 85%                                         23.9%                                   17.85
2                 71.1%                                       33.5%                                   29.77
3                 62.2%                                       48.3%                                   34.50
4                 42.7%                                       59.6%                                   35.59

Scalability

This section describes the scalability of Microsoft BizTalk Server 2004 based on test scenarios for scaling up and scaling out. Scaling up involves adding and upgrading hardware (such as CPU and RAM), while scaling out involves adding servers to the deployment.

This section includes tables, charts, and diagrams that describe the performance characteristics of the following scalability scenarios:

  • Scaling up SQL Server (single MessageBox database). This scenario shows the receiving and processing throughput when the CPU processors on SQL Server are scaled up from one, two, four, and eight CPU processors.
  • Scaling out SQL Server (multiple MessageBox databases). This scenario provides general guidance on optimizing message box performance through scaling out SQL Server, and shows the performance impact of various deployments involving different numbers of BizTalk Server and SQL Server computers.
  • Scaling out BizTalk Server in an orchestration scenario. This scenario describes the performance improvements achieved by scaling out BizTalk Server 2004 in an orchestration scenario where multiple BizTalk Server computers (receive and send) are added until the orchestration server becomes the bottleneck.
  • Scaling out BizTalk Server in a messaging scenario. This scenario shows the performance impact of adding multiple BizTalk Server computers to a deployment until the MessageBox database in SQL Server becomes the bottleneck due to lock contentions.

This section describes the performance impact and improvement that results from including additional resources in each scenario. It provides a guideline for modifying server deployment and application design to improve performance. While it is more cost effective to modify the logic in your application design, it is always beneficial to establish a plan for adding resources to accommodate future capacity and scalability requirements.

Scaling Up SQL Server

The MessageBox database in SQL Server can become a bottleneck for BizTalk Server 2004 deployments. Therefore, adding more resources to SQL Server should improve overall throughput. This section shows the receiving and processing throughput when the CPU processors on SQL Server are scaled up from one, two, four, and eight CPU processors.

First, this section shows the deployment topology and describes the hardware configuration for the test scenarios. Second, it provides information about how BizTalk Server 2004 was configured for the scenarios. And finally, it explains the results from the performance testing.

SQL Server Deployment Configuration

To assess the scalability of the MessageBox database, the testing involved one SQL Server computer (hosting the MessageBox database) and multiple BizTalk Server computers. Different tests were run with varying numbers of CPU processors (one, two, four, and eight CPUs) on the single SQL Server. The following figure shows the deployment topology for the scalability testing of the MessageBox database in BizTalk Server 2004.

Figure 5 Deployment topology for MessageBox database scalability testing

Hardware Configuration

All of the performance tests used the same hardware configuration, except for the scenarios that involve scaling up the CPU processors. Each BizTalk Server has the following hardware configuration:

  • Dual 2.8 gigahertz (GHz) Xeon processors (hyperthreaded)
  • 2.0 gigabytes (GB) of RAM
  • 170 GB of hard disk space (123 GB free)
    • Five 35.0 GB disks with disk speeds of 15,000 revolutions per minute (RPM)
    • Two spindles per disk

Each SQL Server has the following hardware configuration:

  • Dual 2.0 GHz Xeon processors (hyperthreaded)
  • 4.0 GB of RAM
  • Storage area network (SAN) with write caching disabled to ensure data integrity

Scaling Up SQL Server Scenario Description

The objective for this series of tests was to measure the processing speed and throughput of the MessageBox database as CPU processors were added to the SQL Server computer. To show the scalability of the message box with varying numbers of CPU processors on the SQL Server, a single messaging scenario was used in all of the tests (the only variable was the number of CPU processors on the SQL Server).

Because the message box is the engine for message publishing and routing, the messaging scenario used for the performance testing is sufficient for generating workload. Orchestration is unnecessary in this scenario because it has a less significant performance impact on the message box.

The scenario involves a file receive adapter that receives a purchase order request and routes the order to various departments based on the content of the request. Various files ranging from 2 to 6 KB in size are dropped to the file receive location at periodic intervals. To isolate and measure the performance of the message box on the SQL Server computer, the messaging workload on the BizTalk Server computers was maximized for the duration of the test runs. The following figure shows the data flow diagram for the scalability testing of the message box in BizTalk Server 2004.

Figure 6 Data flow diagram for message box scalability testing

Scaling Up SQL Server Test Results

Scaling up the SQL Server from one to two CPU processors improved end-to-end throughput (documents per second) by 90 percent. End-to-end throughput covers the complete path from message reception (when a file is dropped in the receive directory) to message transmission (when a file is dropped in the outbound directory). Scaling up from two to four CPU processors improved performance by approximately 85 percent for receiving and 50 percent for processing. When scaling up from two to four CPU processors, two more BizTalk Server computers were added to the deployment configuration to generate additional workload on the four-processor SQL Server that hosts the MessageBox database.

Scaling up the SQL Server from four to eight CPU processors provided a 20 to 30 percent increase in performance. As more BizTalk Server computers are added to the deployment configuration, the message box eventually becomes the bottleneck and SQL Server lock contentions increase. At this point, adding BizTalk Server computers to the deployment does not improve throughput. The following table shows the performance data from the scalability testing of the message box in BizTalk Server 2004 (a short calculation after the table derives the percentage improvements from these figures).

Table 3 Performance data for message box scalability testing

Throughput Measure                   One CPU Processor   Two CPU Processors   Four CPU Processors   Eight CPU Processors
End-to-end (documents/sec)           39.09               72.51                126.48                149.30
Receiving throughput (messages/sec)  38.00               74.00                137.00                158.00
Processing throughput (documents/sec) 75.00              140.00               210.00                282.00

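The percentage improvements discussed above can be derived directly from Table 3; the following Python sketch shows the arithmetic, with the figures copied from the table.

    # Throughput figures from Table 3, keyed by the number of CPU processors
    # on the SQL Server computer.
    end_to_end = {1: 39.09, 2: 72.51, 4: 126.48, 8: 149.30}   # documents/sec
    receiving  = {1: 38.00, 2: 74.00, 4: 137.00, 8: 158.00}   # messages/sec
    processing = {1: 75.00, 2: 140.00, 4: 210.00, 8: 282.00}  # documents/sec

    def improvement(series, before, after):
        """Percentage gain when scaling from 'before' to 'after' CPU processors."""
        return (series[after] - series[before]) / series[before] * 100

    for label, series in [("end-to-end", end_to_end),
                          ("receiving", receiving),
                          ("processing", processing)]:
        for before, after in [(1, 2), (2, 4), (4, 8)]:
            print(f"{label}: {before} -> {after} CPUs: "
                  f"{improvement(series, before, after):.0f}% improvement")
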
Scaling Out SQL Server

This section provides general guidance on optimizing message box performance by scaling out SQL Server, and shows the performance impact of various deployments involving different numbers of BizTalk Server and SQL Server computers.

The deployment configuration of the message box has a significant impact on overall system throughput and scalability. To assess the scalability of the MessageBox database, different tests were run with varying numbers of BizTalk Server and SQL Server computers. The following figure shows the deployment topology for the scale-out testing of the MessageBox database in BizTalk Server 2004.

Figure 7 Deployment topology for MessageBox database scale-out testing


The test cases for scaling out SQL Server used the same scenario as the tests for scaling up SQL Server.

Scaling Out SQL Server Test Results

This section explains how overall system throughput is affected by various message box configurations. It also includes some significant findings from the performance testing (such as the optimal number of BizTalk Servers per message box, and why scaling the message box from one to two SQL Server computers might not necessarily improve performance). The following table shows the performance impact on throughput with different numbers of message boxes.

Table 4 Performance impact on throughput

Number of Message Boxes   Throughput (documents per second)
1                         73.62
2                         64.86

Single Message Box

For this scenario, a deployment with one SQL Server computer hosting the MessageBox database can support up to six BizTalk Server computers at an optimal throughput rate of approximately 266 documents per second (with tracking disabled). Adding BizTalk Server computers beyond six does not improve throughput because the single message box becomes the bottleneck.

Scaling out the message box from one to two SQL Server computers does not improve throughput because in this scenario, one server is dedicated as the primary message box. Every BizTalk Server group must have only one primary message box.

The primary message box handles all subscriptions and message routing, while other message boxes handle message publishing. Instance subscriptions (subscriptions that route messages to already-running instances on specific message box servers) are maintained on the primary message box and on the specific message box that hosts the particular instance.

To alleviate lock contentions, message publishing is disabled on the primary message box and only the other message boxes can publish messages. Therefore, overall system throughput does not improve because there is still only one server publishing messages.

Running multiple message boxes requires distributed transactions. Therefore, overall performance decreases from the additional network-traffic overhead.

Multiple Message Boxes

As your system grows and capacity requirements change, a deployment involving a single message box configuration might require scaling out. If the message box is the bottleneck in your deployment, scale up SQL Server first, and then scale out.

A scenario involving three SQL Server computers can support eight BizTalk Server computers at an optimal throughput rate of approximately 419 documents per second. Four SQL Server computers can support 10 BizTalk Server computers at an optimal throughput rate of approximately 520 documents per second. Eventually, the primary message box becomes the bottleneck.

If there are too many message boxes for the number of BizTalk Server computers (that is, if BizTalk Server does not create enough workload), CPU processor utilization on the primary message box can reach 100 percent because of the performance overhead associated with routing data to all the different SQL Server computers. To determine if the primary message box is the bottleneck, check the following SQL Server performance counters (a simple counter-sampling sketch appears at the end of this subsection):

  • SQL Server Lock Wait Time. Ideally, less than 100 milliseconds
  • SQL Server Lock Timeouts. Ideally, less than 100 milliseconds

When running the message box across multiple SQL Server computers, aggregate the tracking data from each SQL Server computer onto a single database on a separate dedicated tracking server.
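
One simple way to sample these counters is the Windows typeperf command. The following Python sketch wraps it with subprocess; the counter paths assume a default SQL Server instance (a named instance uses the MSSQL$<instance>:Locks object), so treat them as examples to adapt for your environment.

    import subprocess

    # SQL Server lock counters for a default instance; adjust for named instances.
    counters = [
        r"\SQLServer:Locks(_Total)\Lock Wait Time (ms)",
        r"\SQLServer:Locks(_Total)\Lock Timeouts/sec",
    ]

    # Collect a single sample (-sc 1) of each counter and print the CSV output.
    result = subprocess.run(["typeperf", *counters, "-sc", "1"],
                            capture_output=True, text=True)
    print(result.stdout)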

Scaling Out BizTalk Server in an Orchestration Scenario

To show the scalability of BizTalk Server 2004, the testing involved the following variables:

  • Receiving tier was scaled out from one to two BizTalk Server receive computers.
  • Transmitting tier was scaled out from one to two BizTalk Server send computers.
  • Orchestration was scaled out from one to four BizTalk Server computers.

The following table shows the BizTalk Server host configuration.

Table 5 BizTalk Server host configuration

Number of BizTalk Server Computers   Host Configuration
1                                    Receive, send, orchestration
2                                    One receive and send, one orchestration
3                                    One receive, one send, and one orchestration
6                                    Two receive, two send, and two orchestration
7                                    Two receive, two send, and three orchestration
8                                    Two receive, two send, and four orchestration

The following figure shows the deployment topology for the scale-out testing of BizTalk Server 2004 in an orchestration scenario.

Figure 8 Deployment topology for BizTalk Server 2004 scale-out testing

Scaling Out BizTalk Server in an Orchestration Scenario Description

The scenario for this series of tests is the same as the one used for the scalability testing of BizTalk Server 2004 in a messaging scenario (described later in this document in "Scaling Out BizTalk Server in a Messaging Scenario"), except that this scenario involves three message boxes and includes dedicated servers for processing orchestrations. For inbound traffic, the File adapter is used with XML disassembler and XML assembler pipeline components. For port configuration, one receive port (with one receive location) and one send port are configured. The orchestration is a simple schedule involving a decision and a transformation. Other details about this scenario are the same as in the messaging scenario.

The following figure shows the data flow diagram for the scalability testing of BizTalk Server 2004 in an orchestration scenario.

Figure 9 Data flow diagram for BizTalk Server 2004 scalability testing


This series of tests used both paths in the decision tree: one half of received documents were accepted, and the other half were rejected.

Scaling Out BizTalk Server in an Orchestration Scenario Test Results

The following table compares the end-to-end throughput statistics between the orchestration and messaging scenarios. End-to-end throughput covers the complete path from message reception (when a file is dropped in the receive directory) to message transmission (when a file is dropped in the outbound directory).

Table 6 Comparison of end-to-end throughput statistics

Number of BizTalk Server Computers   End-to-End Throughput, Orchestration (documents/sec)   End-to-End Throughput, Messaging (documents/sec)
1                                    50.21                                                   64.52
2                                    93.75                                                   109.09
3                                    117.65                                                  208.70
6                                    216.22                                                  315.79
7                                    256.68                                                  419.21
8                                    241.21                                                  n/a

With one BizTalk Server computer, all BizTalk Server functionality (receiving, sending, and processing) runs on the same server. Scaling out from one to two BizTalk Server computers provides near-linear scaling because orchestration processing is isolated onto a dedicated BizTalk Server computer.

For this scenario, scaling out from two to three BizTalk Server computers improves throughput minimally because in this configuration, the receiving and sending functionality are isolated onto separate dedicated servers. Therefore, there is still only one dedicated server for processing orchestrations. If CPU processor utilization and memory consumption are not maximized on the receive and send servers, then it is unnecessary to add a dedicated server for processing orchestrations because you can run orchestration on the receive and send servers.

Scaling out from three to six BizTalk Server computers provides near-linear scaling because each of the three primary areas of functionality (receive, send, and orchestration) has twice the processing resources. Throughput reaches a peak when scaling out from six to seven BizTalk Server computers because beyond seven BizTalk Server computers, the three orchestration servers cause bottlenecks on the MessageBox databases. For this scenario, the end-to-end throughput rate decreases from 256.68 documents per second with seven BizTalk Server computers to 241.21 documents per second with eight BizTalk Server computers.

Scaling out from seven to eight BizTalk Server computers causes slight performance degradation in overall throughput because the eighth BizTalk Server computer in the deployment configuration is the fourth orchestration server, which only adds contention on the message box. With more orchestration servers, the message box becomes the bottleneck with an increase on the following SQL Server performance counters:

  • SQL Server Lock Wait Time. Ideally, less than 100 milliseconds
  • SQL Server Lock Timeouts. Ideally, less than 100 milliseconds
  • Current Disk Queue Length. Ideally, the current disk queue length minus the number of spindles should be less than 2 (that is, fewer than two queued items per spindle is good)
  • Percent CPU utilization. Ideally, 70 to 75 percent
  • Processor Queue Length. Ideally, fewer than 10 per processor

To improve the throughput and scalability of BizTalk Server 2004 in an orchestration scenario involving more than three orchestration servers, scale up the database tier by upgrading to faster CPU processors, adding CPU processors, and including additional memory.

Scaling Out BizTalk Server in a Messaging Scenario

Scaling out BizTalk Server 2004 in a messaging scenario involves adding multiple BizTalk Server computers to a deployment until the MessageBox database in SQL Server becomes the bottleneck due to lock contentions. If SQL Server is the bottleneck, scale out the database tier by adding multiple SQL Server computers for message box processing. To show the scalability of BizTalk Server 2004, the testing involved the following variables:

  • Receiving tier was scaled out from one to four BizTalk Server receive computers
  • Transmitting tier was scaled out from one to four BizTalk Server send computers
  • Database tier was scaled out from one to four SQL Server computers (hosting MessageBox databases)

The tests for scaling out BizTalk Server 2004 in a messaging scenario use the same deployment topology as the topology for scaling out SQL Server, which is described in the earlier topic, "Scaling Out SQL Server."

Scaling Out BizTalk Server in a Messaging Scenario Description

The objective for this series of tests was to measure the processing speed and throughput of BizTalk Server 2004 as more computers are added to the deployment. To show the scalability of BizTalk Server 2004 with different numbers of BizTalk Server and SQL Server computers, a single messaging scenario was used in all of the tests. The only variables were the numbers of servers.

The messaging scenario involves using the FileGen.exe tool to drop 12,000 documents (ranging from 2 to 6 KB in size) to each BizTalk Server (receive server) at periodic intervals. For every document that BizTalk Server receives, two documents are processed and sent. To maintain optimal throughput, the messaging workload on the BizTalk Server computers was maximized for the duration of the test runs; overloading the receive servers, however, causes excessive lock contentions on the MessageBox database.

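The tests used the FileGen.exe tool for this load generation. Purely to illustrate the same idea (this is not the actual tool), the following Python sketch drops small XML documents into a file receive location at a fixed interval; the share path and document content are placeholders.

    import time
    import uuid
    from pathlib import Path

    # Placeholder receive location; point this at the folder configured on the
    # File receive adapter in your own environment.
    RECEIVE_FOLDER = Path(r"\\btsrecv01\ReceiveShare")

    # A small XML document in roughly the 2 KB range; the element names are
    # illustrative only and do not match the schemas used in the tests.
    BODY = "<PurchaseOrder><MsgInd>1</MsgInd>" + "<Item>widget</Item>" * 80 + "</PurchaseOrder>"

    def drop_documents(count, interval_seconds):
        """Write 'count' documents into the receive folder, pausing between drops."""
        for _ in range(count):
            target = RECEIVE_FOLDER / f"order_{uuid.uuid4().hex}.xml"
            target.write_text(BODY, encoding="utf-8")
            time.sleep(interval_seconds)

    if __name__ == "__main__":
        drop_documents(count=100, interval_seconds=0.1)
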
The BizTalk Server receiving, processing, transmitting, and tracking features were isolated onto separate hosts. The testing also involved the following configurations:

  • All databases were created on a dedicated SQL Server computer with high-speed storage area network (SAN) drives.
  • All database and log files were stored on separate disks.
  • Each message box was pre-sized to 10.0 GB for database files and 5.0 GB for log files.
  • For multiple SQL Server computers hosting MessageBox databases, publishing was disabled on the primary message box. Other message boxes handled all publishing so that the primary message box was dedicated to message routing.
  • The SQL Server Agent service was running on all SQL Server computers.
  • The MessageBox and Tracking databases were deleted after each test run.
  • All BizTalk Server service instances were restarted after each test run.

The messaging scenario does not use orchestration. For inbound traffic, the File adapter was used with an XML disassembler and a pass-through pipeline component.

For port configuration, one receive port (one receive location) and four send ports (with filters) were configured. Three maps on the send ports transformed the data based on the filter (which checked for the <MsgInd> promoted property).

The data flow diagram used for the scalability testing of BizTalk Server 2004 in a messaging scenario is the same as the data flow diagram used in the scalability testing of the message box, which is described in the earlier topic, "Scaling Up SQL Server Scenario Description."

Scaling Out BizTalk Server in a Messaging Scenario Test Results

The following table shows the throughput, in documents per second, of various deployments involving different numbers of BizTalk Server and SQL Server computers (without tracking). A short calculation after the table derives the per-computer throughput for each configuration.

Table 7 Throughput of various deployments

Number of BizTalk Server Computers   One Message Box   Two Message Boxes   Three Message Boxes   Four Message Boxes
1                                    73.62             64.86               64.52                 67.04
2                                    114.29            104.35              109.09                106.19
4                                    228.57            212.39              208.70                207.79
6                                    266.67            292.68              315.79                313.04
8                                    274.68            256.00              419.21                412.90
10                                   n/a               n/a                 n/a                   520.83

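To see how close to linear the scaling is, you can compute the per-computer throughput directly from Table 7; the following Python sketch does that for the configurations that were measured.

    # Throughput in documents/sec from Table 7, indexed by
    # (number of BizTalk Server computers, number of message boxes).
    throughput = {
        (1, 1): 73.62,  (1, 2): 64.86,  (1, 3): 64.52,  (1, 4): 67.04,
        (2, 1): 114.29, (2, 2): 104.35, (2, 3): 109.09, (2, 4): 106.19,
        (4, 1): 228.57, (4, 2): 212.39, (4, 3): 208.70, (4, 4): 207.79,
        (6, 1): 266.67, (6, 2): 292.68, (6, 3): 315.79, (6, 4): 313.04,
        (8, 1): 274.68, (8, 2): 256.00, (8, 3): 419.21, (8, 4): 412.90,
        (10, 4): 520.83,
    }

    for (servers, boxes), docs_per_sec in sorted(throughput.items()):
        per_server = docs_per_sec / servers
        print(f"{servers} BizTalk Server(s), {boxes} message box(es): "
              f"{per_server:.1f} documents/sec per server")
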
For the tests involving multiple BizTalk Server computers, the BizTalk Server configuration used an equal number of receive and send computers. For example, in the tests with two BizTalk Server computers, one computer was dedicated as the receive server while the other was dedicated as the send server. With four BizTalk Server computers, there were two receive servers and two send servers. The same configuration was used for the tests involving six, eight, and ten BizTalk Server computers.

With one SQL Server computer hosting the message box, throughput scales linearly up to four BizTalk Server computers. When more than four BizTalk Server computers run message box functionality against the same SQL Server computer, the SQL Server computer becomes the bottleneck due to SQL Server lock contentions on the MessageBox database. Because the message box on SQL Server handles all message publishing and routing for BizTalk Server, it encounters performance degradation from the overhead of running message box functionality for multiple BizTalk Server computers.

To scale beyond four BizTalk Server computers, a deployment must have more than one SQL Server computer for the message box. Scaling out the database tier by including additional SQL Server computers for running the message box helps maintain linear scaling for BizTalk Server 2004.

With two SQL Server computers running the message box, throughput scales linearly up to six BizTalk Server computers. When more than six BizTalk Server computers run message box functionality against the two SQL Server computers, the SQL Server computers become the bottleneck due to SQL Server lock contentions on the MessageBox databases. To scale beyond six BizTalk Server computers in a deployment with two SQL Server computers, include additional SQL Server computers for running the message box.

With three SQL Server computers running the message box, throughput scales linearly up to eight BizTalk Server computers. When more than eight BizTalk Server computers run message box functionality against the three SQL Server computers, the SQL Server computers become the bottleneck due to SQL Server lock contentions on the MessageBox databases. To scale beyond eight BizTalk Server computers in a deployment with three SQL Server computers, include additional SQL Server computers for running the message box.

With four SQL Server computers running the message box, throughput scales linearly up to 10 BizTalk Server computers. When more than 10 BizTalk Server computers run message box functionality against the four SQL Server computers, the SQL Server computers become the bottleneck due to SQL Server lock contentions on the MessageBox databases. To scale beyond 10 BizTalk Server computers in a deployment with four SQL Server computers, include additional SQL Server computers for running the message box.

Component Performance

This section shows the performance of key components and configurations of Microsoft BizTalk Server 2004 through tables, graphs, and charts. It captures all of the data from the performance testing. This section has the following subsections:

  • Orchestrations. Shows the throughput and resource utilization of various types of orchestrations (such as a basic orchestration, orchestration with a simple filter expression, orchestration with an atomic scope, and an orchestration that starts another orchestration).
  • Transport Adapters. Shows how various transport adapters and file sizes affect throughput and overall performance.
  • Pipeline Components. Shows the performance impact of various pipeline components (including XML assembler/disassembler, SMIME/MIME, and flat file parser) and schema complexities.
  • Schema Complexity. Shows the performance impact that schema complexity has on the XML Assembler/Disassembler and flat file parsing pipeline components.
  • Interchanges. Shows the latency associated with various sizes of interchanges (large messages containing several individual messages) and how to approximate an optimal interchange size.
  • Tracking. Shows the throughput and resource utilization of various types of tracking scenarios (such as messaging and orchestration with a single MessageBox database, and deleting data from multiple MessageBox databases).
  • Business Rules. Shows how various parameters (such as the number of rules and conditions) in a rule set impact performance, and how throughput is affected by various bindings (including class, XML document, and data row).

Orchestrations

While the configuration of orchestrations can widely vary, depending on the particular business process being captured, the performance of orchestrations in BizTalk Server 2004 can be characterized by the most common types of orchestrations. This section provides performance test results for the following types of orchestrations:

  • Basic orchestration
  • Orchestration with a simple filter expression
  • Orchestration with an atomic scope
  • Orchestration with an atomic scope using Distributed Transaction Coordinator (DTC)
  • Orchestration with a sequential convoy
  • Orchestration with dehydration
  • Orchestration calling another orchestration
  • Orchestration starting another orchestration
  • Singleton orchestration with batched messaging
  • Orchestration exposed as a Web service

The testing involved one client computer and the following six servers:

  • Two BizTalk Servers for transport (receive/transmit)
  • One BizTalk Server for orchestration
  • One BizTalk Server for tracking
  • One SQL Server for the MessageBox and Configuration databases
  • One SQL Server for the Tracking database

The following figure shows the deployment topology for testing orchestrations in BizTalk Server 2004.

Figure 10 Deployment topology for testing orchestrations


Orchestration Scenario Description

The objective for this series of tests is to measure the end-to-end throughput of different basic orchestrations. Each test is run until CPU utilization on the BizTalk Server orchestration host reaches 100 percent. A client application sends messages to BizTalk Server 2004, where a custom C# application forwards the messages from the receive location through the File adapter. To apply stress to the test scenario, Microsoft Application Center Test (ACT) is used with a two-way SOAP port. The testing involves the following four BizTalk Server hosts:

  • Two BizTalk Servers for receive
  • Two BizTalk Servers for send
  • One BizTalk Server for orchestrations
  • One BizTalk Server for tracking

This document provides more details about each orchestration (including a data flow diagram) after the test results section.

Orchestration Test Results

The following table shows a comparison of performance statistics across the different types of orchestrations. Use this performance information to guide your design decisions for configuring orchestrations. Whenever possible, run orchestrations on a dedicated BizTalk Server because they generally consume more resources than receiving and sending messages.

Table 8 Comparison of performance statistics

Orchestration Type | Average Number of Orchestrations Completed per Second | Average Percent CPU Time (Orchestration) | Average Percent CPU Time (Message Box)
Basic | 141.1 | 91.6 | 56.3
With simple filter expression | 139.8 | 92.4 | 54.9
With an atomic scope | 66.0 | 88.1 | 26.4
With a sequential convoy | 110.6 | 88.6 | 59.6
With dehydration | 54.4 | 92.4 | 26.5
With an atomic scope using DTC | 54.1 | 96.8 | 23.2
Calling another orchestration | 140.7 | 89.4 | 52.4
Starting another orchestration | 90.2 | 95.4 | 28.3
Singleton orchestration with batched messaging | 13.4* | 47.4 | 89.2
Exposed as a Web service | 135.5* | 69.9 | 62.7

* Average number of documents processed per second

Note that for the singleton orchestration with batched messaging scenario, BizTalk Server 2004 processes approximately 67 individual messages per second on average, because each of the 13.4 documents processed per second contains five individual messages. This orchestration includes a sequential convoy, so it consumes more MessageBox resources.

For orchestrations that start another orchestration, the actual number of requests processed is approximately half of the average number of orchestrations completed per second, because each request completes two orchestration instances (the starting orchestration and the orchestration it starts).

Orchestration Details

This section provides a description and diagram of how each orchestration is configured for this series of tests.

Basic Orchestration

This test establishes a performance baseline for a basic orchestration in BizTalk Server 2004. The basic orchestration tested contains one receive shape and one send shape (File transport). This orchestration provides the highest performance because it does no processing; it only receives and sends messages. The following figure shows the orchestration diagram in BizTalk Server 2004 Orchestration Designer.

Figure 11 Basic orchestration diagram

Orchestration with a Simple Filter Expression

This test measures the performance overhead of adding a simple filter expression to an orchestration (where the receive shape has the Activate property set to True). This orchestration contains one receive shape and one send shape (File transport). The following figure shows the orchestration diagram in BizTalk Server 2004 Orchestration Designer.

Figure 12 Simple filter expression orchestration diagram

Orchestration with an Atomic Scope

This test measures the performance overhead of using a blank scope in an orchestration to perform some simple in-memory operations. The orchestration contains one receive shape and one send shape (File transport).

Compared to the previous orchestrations tested, orchestrations with an atomic scope have higher CPU utilization because they require additional communication with the MessageBox database for data persistence. The following figure shows the orchestration diagram in BizTalk Server 2004 Orchestration Designer.

Figure 13 Orchestration with an atomic scope

Orchestration with an Atomic Scope Using DTC

While the previous orchestration uses a blank scope, this orchestration contains an expression shape that instantiates a COM+ object, which executes a query against a remote database. The File transport is used for both receive and send ports. The following figure shows the orchestration diagram in BizTalk Server 2004 Orchestration Designer.

Figure 14 Orchestration with an atomic scope using DTC

Orchestration with a Sequential Convoy

A sequential convoy has two receive shapes bound to the same port. The orchestration uses a correlation with one promoted property. The File transport is used for both receive and send ports. The following figure shows the orchestration diagram in BizTalk Server 2004 Orchestration Designer.

Figure 15 Orchestration with a sequential convoy


To test performance, the inbound publishing rate is increased until CPU utilization on the SQL Server (hosting the MessageBox database) reaches 100 percent.

Orchestration with Dehydration

Dehydration occurs when memory consumption is high: BizTalk Server 2004 unloads waiting orchestration instances from memory and persists their state to the MessageBox database. While other types of orchestrations (such as an orchestration with an atomic scope or an orchestration with a sequential convoy) can dehydrate, they might not dehydrate if sufficient memory is available. This scenario deliberately creates memory contention to force dehydration.

This orchestration contains a one-minute delay shape that causes it to dehydrate after several other instances are created. The File transport is used for both receive and send ports. The following figure shows the orchestration diagram in BizTalk Server 2004 Orchestration Designer.

Figure 16 Orchestration with dehydration


Performance is volatile for orchestrations with dehydration because dehydration and rehydration occur frequently in this scenario. On average, 37.2 orchestrations were dehydrated per second and 39 orchestrations were rehydrated per second.

Orchestration Calling Another Orchestration

For this test, both orchestrations run under the same host. The File transport is used for both receive and send ports. The following figure shows the orchestration diagrams in BizTalk Server 2004 Orchestration Designer.

Figure 17 Orchestration calling another orchestration

Orchestration Starting Another Orchestration

For this test, both orchestrations run under the same host. The File transport is used for both receive and send ports. The following figure shows the orchestration diagrams in BizTalk Server 2004 Orchestration Designer.

Figure 18 Orchestration starting another orchestration

Singleton Orchestration with Batched Messaging

For this test, the orchestration uses one promoted property and one distinguished field. Five hundred different documents are sent, resulting in 500 unique orchestration instances (with batches of five messages being sent). The File transport is used for both receive and send ports. The following figure shows the orchestration diagram in BizTalk Server 2004 Orchestration Designer.

Figure 19 Singleton orchestration with batched messaging


To test performance, the inbound publishing rate is increased until CPU utilization on the SQL Server (hosting the MessageBox database) reaches 100 percent. The performance is primarily limited by the CPU on the SQL Server (that hosts the MessageBox database) because orchestrations with sequential convoys use a publish/subscribe model (where latency is inherent).

Orchestration Exposed as a Web Service

For this test, the orchestration uses a two-way port (request-response) with a Web services adapter (SOAP). The following figure shows the orchestration diagram in BizTalk Server 2004 Orchestration Designer.

Figure 20 Orchestration exposed as a Web service


To test performance, the inbound publishing rate is increased until CPU utilization on the SQL Server (hosting the MessageBox database) reaches 100 percent.

Transport Adapters

This section describes the throughput associated with sending varying file sizes across the following transport adapters:

  • File adapter
  • HTTP adapter
  • Web services adapter (SOAP)
  • BizTalk Message Queuing (MSMQT) adapter
  • SQL adapter
  • FTP adapter
  • SMTP adapter

This section also includes transport-related performance information for messaging and Web application scenarios. To measure performance, the tests involve one client computer that sends files of various sizes to each transport adapter. The following figure shows the deployment topology for testing transport adapters in BizTalk Server 2004.

Figure 21 Deployment topology for testing transport adapters


Transport Adapters Scenario Description

The scenario uses a basic orchestration containing one receive shape and one send shape. For comparison, the testing also involves a scenario with no orchestration. The inbound and outbound adapters are changed in each test, according to the specific transport adapter being tested.

Transport Adapters Test Results

This section explains the performance characteristics for the different transport adapters.

File Adapter

To measure the throughput and resource utilization from file reception (receive) to outbound transmission (send), files of various sizes were sent from a client computer. The data flow diagram used for the performance testing of the File adapter in BizTalk Server 2004 is the same as the data flow diagram used for the scalability testing of the message box, which is described in the earlier topic, "Scaling Up SQL Server Scenario Description." To set an upper limit and determine the maximum file size that the File adapter can support, a series of tests was run to measure the latency of receiving and sending files ranging from 1 GB to 4 GB in size.

Ultimately, the maximum file size limitation is a restriction imposed by the physical resources on the system. Scaling up the hardware (such as faster CPUs, more memory, and larger hard disks) can increase the maximum file size supported. The File adapter processed a 1.0 GB file in 21 minutes and a 2.0 GB file in 40.3 minutes. The following table shows the total processing time for each file size tested.

Table 9 Processing time for tested file sizes

File Size | Processing Time
1.0 GB | 21.0 minutes
2.0 GB | 40.3 minutes
3.0 GB | 65.0 minutes
4.0 GB | 93.0 minutes

When file sizes are small, throughput is limited by CPU processing and the rate at which the system can perform basic file operations. As file sizes grow, throughput (KB/sec) increases but overall performance (files/sec) decreases.

Toward the upper limit of the maximum file size limitation (typically around 5.0 GB), the receiving host becomes the bottleneck because it must parse incoming files, compress the data, and persist the data into the MessageBox database. In situations where system throughput is limited by the receiving host, the CPU utilization on other hosts (orchestration and send) begins to decrease. The following table shows the resource utilization on the BizTalk Server and SQL Server for various file sizes.

Table 10 Resource Utilization for various file sizes

File Size | Average Percent CPU Time (receive) | Average Percent CPU Time (orchestration) | Average Percent CPU Time (send) | Average Percent CPU Time (SQL Server)
2 KB | 19% | 26% | 5% | 30%
20 KB | 18% | 26% | 5% | 25%
200 KB | 27% | 14% | 4% | 17%
2,000 KB | 29% | 3% | 1% | 4%
20,000 KB | 25% | 1% | 1% | 3%
200,000 KB | 20% | 1% | < 1% | 1%

The following table shows the throughput of the File adapter with and without orchestration.

Table 11 Throughput of the File adapter with and without orchestration

File Size | KB/sec (with orchestration) | Files/sec (with orchestration) | KB/sec (without orchestration) | Files/sec (without orchestration)
2 KB | 103 | 51.0 | 171 | 85.7
20 KB | 920 | 46.0 | 1,400 | 70.0
200 KB | 4,804 | 24.0 | 4,804 | 24.0
2,000 KB | 10,520 | 5.3 | 10,520 | 5.3

HTTP Adapter

To optimize performance for the outbound transport, the send host is separated from the orchestration host. The performance of the HTTP adapter is similar to the performance of the File adapter up to 200 MB (throughput increases with file size). Beyond 200 MB, throughput decreases rapidly because large HTTP POST requests are divided into fragments before being sent. At the destination, the fragments are combined to build the message. The following table shows the throughput for the HTTP adapter.

Table 12 Throughput for the HTTP adapter

File Size KB/sec Files/sec

2 KB

81

41

20 KB

720

36

200 KB

4,000

20

2,000 KB

6,000

3

20,000 KB

15,200

0.76

200,000 KB

2,000

0.01

As file size increases, so does the time required to upload the file. Therefore, you should carefully adjust the connection time-out settings in Internet Information Services (IIS) when sending large files through the HTTP adapter. For information about modifying the connection settings, see the IIS documentation.

The following table describes the registry settings that affect the performance of the HTTP adapter. By default, the registry has no HTTP adapter keys.

Table 13 Registry settings that affect HTTP adapter performance

Key Name | Adapter | Default | Description
HttpBatchSize | HTTP receive | 10 | Specifies the size of the HTTP receive adapter batch; located under the HTTPReceive subkey. Setting this key to 1 can improve the latency of the HTTP receive adapter. Minimum value: 1. Maximum value: 256.
RequestQueueSize | HTTP receive | 256 | Defines the number of concurrent requests that the HTTP receive adapter processes at one time. Minimum value: 10. Maximum value: 2048.
HttpOutMaxConnection | HTTP send | 5 | Specifies the number of open connections. Minimum value: 1. Maximum value: 128.
HttpOutInflightSize | HTTP send | 100 | Specifies the size of the in-flight queue. Minimum value: 1. Maximum value: 1024.

Create these DWORD registry keys in the following registry locations:

  • For the HttpBatchSize and RequestQueueSize keys,
    HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\BTSSvc.3.0\HttpReceive
  • For the other keys,
    HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\BTSSvc{GUID} where GUID is the ID of the host for the HTTP send handler
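
For reference, the following minimal sketch shows one way to create these DWORD values with Python's winreg module rather than editing the registry by hand. The key paths and value names come from the table and list above; the values written are the documented default or recommended settings, and the send-host GUID is a placeholder that must be replaced for a real deployment.

```python
# Illustrative sketch (not from this document's test configuration): creating
# the HTTP adapter tuning values with Python's winreg module on the BizTalk
# Server. Run with administrative rights and restart the host instances so the
# new values are picked up.
import winreg

# Receive-side keys live under the HttpReceive subkey (path from the list above).
HTTP_RECEIVE_KEY = r"SYSTEM\CurrentControlSet\Services\BTSSvc.3.0\HttpReceive"

with winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE, HTTP_RECEIVE_KEY, 0,
                        winreg.KEY_WRITE) as key:
    winreg.SetValueEx(key, "HttpBatchSize", 0, winreg.REG_DWORD, 1)       # documented low-latency setting
    winreg.SetValueEx(key, "RequestQueueSize", 0, winreg.REG_DWORD, 256)  # documented default, shown for completeness

# Send-side keys live under BTSSvc{GUID}, where GUID identifies the host that
# runs the HTTP send handler; the GUID below is a placeholder to replace.
SEND_HOST_GUID = "{YOUR-SEND-HOST-GUID}"
HTTP_SEND_KEY = r"SYSTEM\CurrentControlSet\Services\BTSSvc" + SEND_HOST_GUID

with winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE, HTTP_SEND_KEY, 0,
                        winreg.KEY_WRITE) as key:
    winreg.SetValueEx(key, "HttpOutMaxConnection", 0, winreg.REG_DWORD, 5)   # documented default
    winreg.SetValueEx(key, "HttpOutInflightSize", 0, winreg.REG_DWORD, 100)  # documented default
```
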
Web Services Adapter

The scenario for the performance testing of the Web services adapter is the same as the one used for testing the File and HTTP adapters, except that the orchestration is exposed as a Web service. The inbound adapter is exposed through SOAP while outbound traffic uses the File adapter. Configuring the scenario this way helps to isolate and measure the throughput with SOAP as the inbound adapter.

The performance of the Web services adapter is similar to the performance of the File adapter (throughput increases with file size). With large file sizes (larger than 200 MB), throughput decreases rapidly because the default settings in IIS cause retries and time-out errors. Performance tests for the Web services adapter were run with file sizes up to 200 MB. The following table shows the throughput for the Web services adapter.

Table 14 Throughput for the Web services adapter

File Size | KB/sec | Files/sec
2 KB | 108 | 54
20 KB | 1,021 | 51
200 KB | 3,449 | 17
2,000 KB | 5,816 | 2
20,000 KB | 4,437 | 0.22

The schema is simplified for this scenario because the receive adapter cannot support multidimensional arrays (deeply nested schemas). By default, the maximum file size that IIS accepts for upload is 4 MB, which restricts possible denial-of-service attacks. For information about increasing and modifying this setting, go to http://go.microsoft.com/fwlink/?linkid=27097.

BizTalk Message Queuing (MSMQT) Adapter

For this scenario, inbound traffic flows through the File adapter while outbound traffic flows through the BizTalk Message Queuing (also known as MSMQT) adapter. Because it is difficult to scale a single queue, a total of eight inbound queues and four outbound queues were used to demonstrate the full capacity of the BizTalk Message Queuing adapter. Sending messages to multiple queues requires modification to the application design. The following figure shows the data flow diagram for the performance testing of the BizTalk Message Queuing adapter in BizTalk Server 2004.

Figure 22 Data flow diagram for BizTalk Message Queuing adapter


As with the File adapter, the throughput of the BizTalk Message Queuing adapter increases with file size. As file sizes increase, the receiving host becomes the bottleneck. While the testing used file sizes up to 1 GB, there is no maximum file size limitation. File size is primarily limited by the amount of physical memory on the system as the Message Queuing (also known as MSMQ) client loads the message into memory. The following table shows the resource utilization on the BizTalk Server and SQL Server for various file sizes.

Table 15 Resource utilization for various file sizes

File Size | Average Percent CPU Time (receive and send) | Average Percent CPU Time (orchestration) | Average Percent CPU Time (SQL Server)
2 KB | 37% | 10% | 12%
20 KB | 42% | 12% | 14%
200 KB | 49% | 7% | 9%
2,000 KB | 53% | 1% | 2%
20,000 KB | 61% | 1% | 2%
200,000 KB | 47% | 1% | 2%

The following table shows the throughput for the BizTalk Message Queuing adapter.

Table 16 Throughput for the BizTalk Message Queuing adapter

File Size | KB/sec | Files/sec
2 KB | 80 | 40
20 KB | 766 | 38
200 KB | 2,730 | 22
2,000 KB | 4,466 | 1.3
20,000 KB | 6,144 | 0.3
200,000 KB | 9,557 | 0.05

To have multiple BizTalk Servers process BizTalk Message Queuing requests, configure multiple receive locations (multiple BizTalk Message Queuing queues) so that each thread can process requests in parallel (concurrently).

SQL Adapter

This section provides an overview of the scenario for testing the SQL adapter, and explains the test results for both inbound and outbound adapters. For this series of tests, the SQL adapter retrieves data from a dedicated input/output server, and not the message box. Instead of using the SQL adapter for both inbound and outbound traffic, the tests isolate the adapter for each scenario for measurement accuracy and reliability.

Inbound Adapter

For this scenario, the inbound port uses the SQL adapter, and the Continuous Polling feature is enabled. (There is more information about continuous polling at the end of this topic). For outbound traffic, the File adapter drops outbound messages to a designated folder.

The BizTalk Server receives a single message from each polling of the SQL adapter. The single inbound message can be separated into multiple messages by using an XML disassembler pipeline component and an envelope schema. The following table shows the inbound resource utilization on the BizTalk Server and SQL Server for various file sizes.

Table 17 Inbound resource utilization

File Size | Average Percent CPU Time (receive) | Average Percent CPU Time (orchestration) | Average Percent CPU Time (send) | Average Percent CPU Time (SQL Server)
2 KB | 19% | 19% | 15% | 16%
20 KB | 21% | 20% | 15% | 15%
200 KB | 25% | 21% | 8% | 9%
2,000 KB | 25% | 27% | 2% | 3%
20,000 KB | 24% | 30% | 1% | 2%

The following table shows the inbound throughput for the SQL adapter.

Table 18 Inbound throughput for the SQL adapter

File Size | KB/sec | Files/sec
2 KB | 93 | 46
20 KB | 817 | 40
200 KB | 3,192 | 15
2,000 KB | 5,495 | 2
20,000 KB | 5,597 | 0.27

Depending on the number of files that BizTalk Server processes, the maximum size of data that the SQL adapter can support is typically 50 MB. Add memory to support larger file sizes. The configuration of stored procedures, tables, and schemas has a significant impact on adapter throughput. When writing stored procedures, estimate the size of the result set: if a stored procedure retrieves more rows or data than needed, resources (such as CPU cycles) are wasted and performance degrades. In scenarios with high SQL Server lock contention and concurrency, stored procedures should explicitly set the transaction isolation level.

To optimize performance of the SQL adapter, enable the Continuous Polling feature by setting the poll while data found property to True. This setting allows the adapter to retrieve data continuously (with no wait statements) until the stored procedure returns no rows—at which point, the adapter switches to sleep mode for the duration of the polling interval.

When the polling interval is too small, the SQL adapter tries to load all of the messages simultaneously and memory consumption increases, depending on the physical resources on the SQL Server.

Outbound Adapter

For this scenario, the inbound port uses the File adapter to apply incoming load on the system, and outbound traffic uses the SQL adapter. The following table shows the outbound resource utilization on the BizTalk Server and SQL Server for various file sizes.

Table 19 Outbound resource utilization for various file sizes

File Size | Average Percent CPU Time (receive) | Average Percent CPU Time (orchestration) | Average Percent CPU Time (send) | Average Percent CPU Time (SQL Server)
2 KB | 17% | 18% | 43% | 24%
20 KB | 21% | 18% | 39% | 22%
200 KB | 15% | 7% | 49% | 10%
2,000 KB | 3% | 1% | 53% | 1%
20,000 KB | 1% | < 1% | 54% | < 1%

The following table shows the outbound throughput for the SQL adapter.

Table 20 Outbound throughput for the SQL adapter

File Size | KB/sec | Files/sec
2 KB | 62 | 31
20 KB | 553 | 27
200 KB | 1,733 | 8
2,000 KB | 27 | 0.01
20,000 KB | 40 | 0.002

For the outbound scenario, the send host consumes most of the CPU processing cycles and becomes the bottleneck because it must convert the XML file into SQL Server data, and then insert that data into the database. The largest message supported by the SQL adapter for outbound traffic is 50 MB.

FTP Adapter

For this scenario, the inbound batch size was set to 20 through the Batch: Maximum Files property. The File adapter performs approximately 10 times faster than the FTP adapter because the FTP adapter reads messages sequentially, instead of concurrently, for each receive location. By default, the incoming batch size is set to 0, leading the adapter to enumerate and retrieve all of the files at once. When there are large numbers of files in the FTP site, the default configuration results in poor performance. Plan for your volume of traffic and set the batch size appropriately; the recommendation is 20. The following table shows the throughput for the FTP adapter.

Table 21 Throughput for the FTP adapter

File Size | KB/sec | Files/sec
2 KB | 8 | 4.3
20 KB | 88 | 4.4
200 KB | 836 | 4.1
2,000 KB | 3,884 | 1.9
20,000 KB | 4,710 | 0.23

The FTP adapter supports multiple locations in multiple threads, but reads files from a single location sequentially.

SMTP Adapter

The most common scenario for SMTP involves using the SMTP adapter for outbound traffic. For this scenario, the File adapter is used for inbound traffic and the SMTP adapter is used for outbound traffic.

For outbound traffic, the SMTP adapter consumes more memory than the File adapter and provides approximately 50 percent less throughput. There is no fixed file-size limitation for the SMTP adapter; file size is limited by the physical resources on the system (especially memory) and by the number of SMTP messages loaded concurrently.

As file sizes increase (greater than 2 MB), the system might encounter out-of-memory errors under high load. Therefore, adjust the watermark threshold (decrease the high-low watermark threshold) for the messaging agent when using the SMTP adapter for outbound traffic. The following table shows the resource utilization on the BizTalk Server and SQL Server for various file sizes.

Table 22 Resource utilization for various file sizes

File Size | Average Percent CPU Time (receive) | Average Percent CPU Time (orchestration) | Average Percent CPU Time (send) | Average Percent CPU Time (SQL Server)
2 KB | 6% | 12% | 32% | 12%
20 KB | 7% | 12% | 32% | 11%
200 KB | 2% | 4% | 8% | 4%
2,000 KB | 8% | 1% | 18% | 1%

The following table shows the throughput for the SMTP adapter.

Table 23 Throughput for the SMTP adapter

File Size | KB/sec | Files/sec
2 KB | 46 | 23
20 KB | 413 | 20
200 KB | 1,066 | 5
2,000 KB | 3,413 | 2

Messaging Scenario for Transport Adapters

This section provides performance information about the throughput associated with different types of transport adapters in a typical messaging scenario. The testing involves one BizTalk Server and one SQL Server. The deployment topology used for testing the transport adapters in a messaging scenario is the same as the topology used for testing the transport adapters, which is described in the earlier topic, "Transport Adapters."

Scenario Description

To measure the throughput of different transport adapters in a typical messaging scenario, a series of tests was run to simulate retail customers who want to manage their inventory. These customers check their inventory through requests (XML messages typically less than 10 KB in size) that BizTalk Server 2004 processes.

The tests measured the following transport adapter configurations:

  • File to File
  • HTTP to HTTP
  • BizTalk Message Queuing (MSMQT) to MSMQ
  • SOAP to File

For inbound traffic, the BizTalk Server uses an XML disassembler pipeline component and a pass-through pipeline component. When it receives a message, BizTalk Server reads the message header for routing information, and then routes it to two different locations (out of four total locations). For outbound messages, BizTalk Server uses maps to transform each message. The data flow diagram used for the performance testing of the transport adapters in a typical messaging scenario is the same as the data flow diagram used for the scalability testing of the message box, which is described in the earlier topic, "Scaling Up SQL Server Scenario Description."

Test Results

The four transport adapter configurations tested had approximately the same performance, with outbound map transformation causing the bottleneck. The following chart shows the throughput for the different transport adapters tested in the messaging scenario.

Figure 23 Throughput for transport adapters


For the BizTalk Message Queuing adapter, the testing involved three incoming queues (BizTalk Server receive locations) to optimize thread utilization.

Web Application Scenario for Transport

This section presents a sample case study based on a basic real-world scenario. It provides performance information about Secure Sockets Layer (SSL) overhead in a typical messaging scenario.

The testing consisted of the following computers:

  • One BizTalk Server for receive
  • One BizTalk Server for orchestration
  • One BizTalk Server for send
  • One SQL Server for all databases
  • One client computer for generating requests
  • One client computer for receiving responses

The following figure shows the deployment topology for the testing of SSL throughput in a messaging scenario.

Figure 24 Deployment topology for SSL testing

Scenario Description

To measure the performance impact of using SSL, tests were run with the following transport adapter configurations:

  • HTTP to HTTP
  • SOAP to File

A basic orchestration was used for testing the HTTP adapter. The following figure shows the data flow diagram for the HTTP scenario.

Figure 25 Data flow diagram for HTTP


For the inbound port in the SOAP-to-File scenario, the same orchestration used for testing the HTTP adapter was exposed through SOAP as a Web service. For outbound traffic, the File adapter drops outbound messages to a designated folder. The following figure shows the data flow diagram for the SOAP-to-File scenario.

Figure 26 Data flow diagram for SOAP-to-File

Test Results

For the SOAP-to-File scenario, using SSL added approximately 13 percent performance overhead compared with not using SSL. For the HTTP-to-HTTP scenario, using SSL added approximately 25 percent performance overhead. The following table shows the throughput for the adapter configurations tested with and without SSL.

Table 24 Throughput for adapter configurations tested with and without SSL

Adapter Configuration | Documents Processed/sec (without SSL) | Documents Processed/sec (with SSL)
HTTP to HTTP | 13 | 10
SOAP to File | 15 | 12

Pipeline Components

This section describes the performance of BizTalk Server 2004 with various pipeline components (including XML Assembler/Disassembler, SMIME/MIME, and flat-file parser) and schema complexities. A pipeline is a piece of software infrastructure that contains a set of .NET or COM components that process messages in a predefined sequence.

The Microsoft Visual Studio Toolbox is populated with several standard BizTalk Server components that you can use to create a pipeline. This section describes the performance characteristics of these pipeline components in BizTalk Server.

SMIME/MIME, Party Resolution, and XML Pipeline

BizTalk Server 2004 pipeline components provide a set of security-related features such as SMIME, MIME, and Party Resolution. This section describes the performance impact of various receive and send pipeline components using SMIME/MIME and Party Resolution, relative to default XML and pass-through pipeline components.

The testing of SMIME/MIME and Party Resolution in BizTalk Server 2004 consisted of one client computer, one BizTalk Server, and one SQL Server. The deployment topology used for testing SMIME/MIME and Party Resolution in BizTalk Server 2004 was the same as the deployment topology used for testing the transport adapters, which is described in the earlier topic, "Transport Adapters."

SMIME/MIME, Party Resolution, and XML Pipeline Scenario Description

This scenario involves a simple messaging implementation with no orchestrations. Separate hosts are created for send, receive, and tracking functions.

SMIME/MIME, Party Resolution, and XML Pipeline Test Results

Each component was tested for its individual effect on performance. To measure this difference, a baseline performance was determined by using the default pass-through pipelines; no orchestrations or maps were included. Two sets of tests were conducted. The following tables show the permutations tested for both the receive/inbound side and the send/outbound side.

Receive/Inbound

Table 25 Receive/Inbound permutations tested

Receive Pipeline: Decode Stage | Receive Pipeline: Disassemble Stage | Receive Pipeline: Validate Stage | Receive Pipeline: Resolve Party Stage | Send Pipeline
MIME/SMIME Regular | -- | -- | -- | Pass-through
MIME/SMIME BlobSigned | -- | -- | -- | Pass-through
MIME/SMIME BlobSigned + Encrypted | -- | -- | -- | Pass-through
-- | XML Disassembler (w/o validation) | -- | -- | Pass-through
-- | XML Disassembler (with validation) | -- | -- | Pass-through
-- | -- | XML Validator | -- | Pass-through
-- | -- | -- | Party Resolution (Resolve by Cert) | Pass-through

Send/Outbound

Table 26 Send/Outbound permutations tested

Receive Pipeline | Send Pipeline: Pre-assemble Stage | Send Pipeline: Assemble Stage | Send Pipeline: Encode Stage
Pass-through | -- | -- | MIME/SMIME Regular
Pass-through | -- | -- | MIME/SMIME BlobSigned
Pass-through | -- | -- | MIME/SMIME BlobSigned + Encrypted
Pass-through | -- | XML Assembler | --
Pass-through | -- | -- | XML Validator

Receive/Inbound

The following table shows the results for all permutations of the receive/inbound tests.

Table 27 Permutations of receive/inbound tests

Pipeline Configuration | Metric | 2 KB | 20 KB | 200 KB | 2 MB | 20 MB
Baseline (pass-through) | Files/sec | 200.86 | 128.65 | 32.54 | 3.75 | 0.40
Baseline (pass-through) | KB/sec | 401.72 | 2573.00 | 6507.33 | 7506.67 | 8000.00
MIME Decoder | Files/sec | 141.04 | 87.58 | 12.24 | 1.19 | 0.12
MIME Decoder | KB/sec | 282.08 | 1757.00 | 2448.00 | 2380.00 | 2400.00
SMIME Decoder (signed) | Files/sec | 137.89 | 64.60 | 7.00 | 0.62 | 0.10
SMIME Decoder (signed) | KB/sec | 275.78 | 1292.00 | 1400.00 | 1246.67 | 1933.33
SMIME Decoder (signed and encrypted) | Files/sec | 77.81 | 41.89 | 5.15 | 0.48 | 0.06
SMIME Decoder (signed and encrypted) | KB/sec | 155.63 | 837.87 | 1029.33 | 953.33 | 1200.00
Party Resolution (plus SMIME Decoder signed) | Files/sec | 132.80 | 63.54 | 6.90 | 0.62 | 0.07
Party Resolution (plus SMIME Decoder signed) | KB/sec | 265.61 | 1270.87 | 1379.33 | 1246.67 | 1466.67
XML Disassembler (w/o validation) | Files/sec | 154.51 | 111.94 | 27.03 | 3.07 | 0.34
XML Disassembler (w/o validation) | KB/sec | 309.02 | 2238.80 | 5406.00 | 6146.67 | 6733.33
XML Disassembler (with validation) | Files/sec | 141.19 | 90.92 | 18.87 | 1.93 | 0.27
XML Disassembler (with validation) | KB/sec | 282.37 | 1818.47 | 3774.00 | 3866.67 | 5466.67
XML Validator | Files/sec | 160.36 | 107.34 | 23.09 | 2.57 | 0.28
XML Validator | KB/sec | 320.72 | 2146.73 | 4618.67 | 5133.33 | 5666.67

Analysis of the receive pipeline MIME and XML components shows the following trends:

  • SMIME (signed and encrypted) is the most expensive operation.
  • Performance degradation was minimal between the MIME Decoder and SMIME Decoder (signed) scenarios; signing only adds a signature to the file.
  • While a larger number of smaller files could be processed, actual throughput in KB/second was higher for larger files.
  • Party Resolution with certificates has a minimal additional impact when used with the SMIME Decoder.
  • The pass-through pipeline component performs 20 to 30 percent faster than the XML Assembler/Disassembler pipeline component.
Send/Outbound

The following table shows the results for all permutations of the send/outbound tests.

Table 28 Permutations of send/outbound tests

Pipeline Configuration | Metric | 2 KB | 20 KB | 200 KB | 2 MB | 20 MB
Baseline (pass-through) | Files/sec | 200.86 | 128.65 | 32.54 | 3.75 | 0.40
Baseline (pass-through) | KB/sec | 401.72 | 2573.00 | 6507.33 | 7506.07 | 8000.00
MIME Encoder | Files/sec | 121.50 | 93.94 | 17.33 | 1.57 | 0.24
MIME Encoder | KB/sec | 243.00 | 1878.87 | 3466.67 | 3146.67 | 4800.00
SMIME Encoder (signed) | Files/sec | 83.73 | 62.64 | 14.02 | 1.49 | 0.21
SMIME Encoder (signed) | KB/sec | 167.47 | 1252.73 | 2804.00 | 2986.67 | 4133.33
SMIME Encoder (signed and encrypted) | Files/sec | 73.64 | 54.00 | 6.61 | 0.97 | 0.12
SMIME Encoder (signed and encrypted) | KB/sec | 147.29 | 1080.00 | 1322.00 | 1946.67 | 2400.00
XML Assembler | Files/sec | 130.68 | 85.72 | 14.55 | 1.67 | 0.20
XML Assembler | KB/sec | 261.37 | 1714.33 | 2910.67 | 3333.33 | 3933.33
XML Validator | Files/sec | 148.52 | 97.73 | 18.41 | 2.18 | 0.24
XML Validator | KB/sec | 297.04 | 1954.60 | 3681.33 | 4366.67 | 4800.00

Analysis of the send pipeline MIME and XML components shows the following trends:

  • As with the Receive pipeline, smaller files allowed a higher number of files to be processed, but the larger files resulted in higher kilobyte throughput.
  • The XML Validator and MIME Encoder have comparable performance, as do the XML Assembler and SMIME Encoder (signed).

Schema Complexity

Schema complexity has a significant impact on performance in BizTalk Server 2004. This section describes the performance impact that schema complexity has on the XML Assembler/Disassembler and flat file parsing pipeline components.

XML Pipeline Components and Schema Complexity

The XML Assembler pipeline component combines XML serializing and assembling in one component. It can transfer (promote and demote) properties from the message context back into envelopes and documents.

The XML Disassembler pipeline component combines XML parsing and disassembling into one component. It removes envelopes, disassembles the interchange, and promotes the content properties from interchange and individual document levels on to the message context.

The testing involved one client computer, one BizTalk Server, and one SQL Server. The deployment topology used for testing pipelines in BizTalk Server 2004 is the same as the deployment topology used for the testing of the transport adapters, which is described in the earlier topic, "Transport Adapters."

XML Pipeline Components and Schema Complexity Scenario Description

This series of tests was run against a simple orchestration with a receive shape and a send shape using the File transport adapter. This scenario includes the following BizTalk Server 2004 host configurations:

  • One BizTalk Server for receive
  • One BizTalk Server for send
  • One BizTalk Server for orchestrations

The data flow diagram used for the performance testing of pipelines in BizTalk Server 2004 was the same as the data flow diagram used for the scalability testing of the message box, which is described in the earlier topic, "Scaling Up SQL Server Scenario Description."

XML Pipeline Components and Schema Complexity Test Results

The purpose of the tests in this section is to demonstrate the performance impact due to different levels of complexity of the schema. To maximize the workload on the XML Assembler/Disassembler, files of relatively small sizes were used. Large-sized files would not stress the XML Assembler/Disassembler efficiently because it uses more resources for compressing and persisting the data to the message box than for assembling and disassembling the XML data. Therefore, it would be difficult to measure and observe how schema complexity impacts performance.

Both the breadth and depth of XML fields in a schema have an impact on performance. As the number of fields in the XML file grows, the throughput decreases. Similarly, as the depth of the schema increases, the throughput also decreases.

For the following tests, the inputs were adjusted to maximize the CPU utilization of the BizTalk Server (to obtain the maximum throughput). To measure the impact that the breadth and depth of XML schemas have on performance, two test permutations were used. The following lists show the inputs used for each permutation; a short sketch after the lists illustrates the two document shapes.

XML schema complexity in breadth

Each document for this test is 20 KB; the variable is the number of fields in each document. The following inputs were used:

  • 1 field
  • 10 fields
  • 100 fields
  • 1000 fields

XML schema complexity in depth

  • 2 levels
  • 20 levels
  • 200 levels
  • 2000 levels
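
The following minimal sketch shows the two shapes being varied: a "wide" document with many sibling fields (breadth) and a "deep" document with many nested levels (depth). The element names are invented for illustration; the actual test schemas are not reproduced in this document.

```python
# Illustrative sketch: generating a "wide" XML instance (many sibling fields)
# and a "deep" XML instance (many nested levels), the two shapes varied in the
# schema-complexity tests above. Element names are hypothetical.
import xml.etree.ElementTree as ET

def build_wide_document(field_count: int) -> ET.Element:
    """One root element with field_count sibling fields (schema breadth)."""
    root = ET.Element("Root")
    for i in range(field_count):
        ET.SubElement(root, f"Field{i}").text = "value"
    return root

def build_deep_document(depth: int) -> ET.Element:
    """A chain of nested elements, depth levels deep (schema depth)."""
    root = ET.Element("Level0")
    current = root
    for i in range(1, depth):
        current = ET.SubElement(current, f"Level{i}")
    current.text = "value"
    return root

if __name__ == "__main__":
    # 1000 fields and 200 levels match two of the inputs listed above; very deep
    # documents may require raising Python's recursion limit before serializing.
    ET.ElementTree(build_wide_document(1000)).write("wide.xml", encoding="utf-8", xml_declaration=True)
    ET.ElementTree(build_deep_document(200)).write("deep.xml", encoding="utf-8", xml_declaration=True)
```
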

The following table shows the performance of BizTalk Server with various XML file complexities.

Table 29 Performance of BizTalk Server with XML file complexities

Number of Fields per Schema | Files/sec
1 | 72.47
10 | 74
100 | 72.03
1000 | 58.16

Note: Schemas that contain up to one million fields and are up to 199 MB in size have been tested; no maximum limit for the breadth was found.

The following table shows the relationship between schema complexity in depth and overall throughput.

Table 30 Relationship of breadth and overall throughput

Schema Depth (levels) | Files/sec
2 | 73.03
20 | 73.03
200 | 66.67
2000 | 58.35

Flat-File Parsing

Two factors have the highest impact on the performance of flat-file parsing: file size and schema complexity. An ambiguous schema is a schema that contains many optional fields. When large file sizes are used, these schemas can degrade performance because larger files may match different branches of the schema. Schema complexity has less impact on smaller files than on larger files.

The testing consisted of one client computer, one BizTalk Server, and one SQL Server. The deployment topology used for the testing of flat-file parsing in BizTalk Server 2004 was the same as the deployment topology used for the testing of the transport adapters, which is described in the earlier topic, "Transport Adapters."

Flat-File Parsing Scenario Description

The scenario is a simple messaging scenario that has an outbound pipeline connected directly to an inbound pipeline. The hosts for the two parts of the scenario are separated so that the outbound pipeline can be stopped without affecting the inbound pipeline. The HTTP protocol is used for inbound message traffic because it allows more control of the workload.

To focus on the performance impact based on the file size, a simple schema was used for this testing. The test files were built by repeating records to fit a size pattern. The test runs consisted of sending the test messages to the inbound port and measuring the publishing time.
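
For illustration, a test file of a given size can be produced by repeating a single record, roughly as the following sketch does; the record layout shown here is invented and is not the schema used in the tests.

```python
# Illustrative sketch: building flat-file test messages by repeating a single
# record until a target size is reached, as described for this scenario. The
# record layout is a hypothetical delimited line, not the schema actually tested.
RECORD = "ORD,2004-03-01,CONTOSO,12345,10,19.99\r\n"

def build_flat_file(target_kb: float) -> str:
    """Repeat RECORD until the payload reaches at least target_kb kilobytes."""
    repeats = max(1, -(-int(target_kb * 1024) // len(RECORD)))  # ceiling division
    return RECORD * repeats

if __name__ == "__main__":
    for size_kb in (0.4, 1, 2, 4, 40, 400, 4000):  # file sizes used in the tests
        with open(f"flatfile_{size_kb}KB.txt", "w", newline="") as out:
            out.write(build_flat_file(size_kb))
```
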

Flat-File Parsing Test Results

The following table shows how the throughput and processing capacity in KB/second change with the file size.

Table 31 File size changes in throughput and processing capacity

File Size (KB) | Files/sec | KB/sec
0.4 | 159.78 | 61.16
1 | 112.94 | 113.82
2 | 77.25 | 155.11
4 | 49.63 | 202.01
40 | 5.49 | 223.22
400 | 0.54 | 220.96
4000 | 0.05 | 217.39

Throughput in KB/second levels off beyond a certain file size because the time spent in communication and database access becomes insignificant compared with the time spent parsing the messages.

Schema Complexity

The base schema used for the tests has 40 nodes. The maximum depth of the schema is six levels; all nodes are required, and some nodes repeat. All the schemas used are derived from this base. The constraints used to create the four schemas are whether all fields are optional or required, and whether the parser optimization is set for speed or complexity. The message used for measurement is 500 KB in size.

The parsing mode is an attribute on the schema Info record, with two modes: speed and complexity. In speed mode, the parser tries to fit data as it appears in the stream. In complexity mode, the flat-file parsing engine uses both top-down and bottom-up parsing, and tries to fit data more accurately. The following table shows the results of the test.

Table 32 Schema complexity test results

Schema Type | Parser Optimization | Processing Rate per Minute | Cost/Message (megacycles)
No optional fields | Speed | 44.14 | 8369
No optional fields | Complexity | 43.15 | 8559
All fields optional | Speed | 40.42 | 9227
All fields optional | Complexity | 26.21 | 14665

The cost is given in processor megacycles per message and is calculated as follows:

Cost per message (megacycles) = (F × N × k × p) / (100 × r)

This formula gives a machine-independent cost. The variables are defined as follows:

  • F = CPU frequency in MHz (megacycles/second executed by a CPU)
  • N = number of physical CPUs on the computers
  • p = CPU utilization (percent)
  • k = 1.3 coefficient recommended by Intel when hyperthreading is enabled
  • r = throughput rate in number of units (messages) processed per second.
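
As a rough worked example, the following sketch encodes this calculation in Python; the hardware numbers in the example call are invented for illustration and are not the configuration used in the tests.

```python
# Illustrative sketch of the per-message cost formula above. The example values
# are assumptions for demonstration only, not the test hardware.
def cost_per_message_megacycles(f_mhz: float, n_cpus: int, p_utilization_pct: float,
                                r_msgs_per_sec: float, hyperthreading: bool = True) -> float:
    """Machine-independent cost: megacycles consumed per processed message."""
    k = 1.3 if hyperthreading else 1.0  # coefficient recommended by Intel when hyperthreading is enabled
    consumed_megacycles_per_sec = f_mhz * n_cpus * k * (p_utilization_pct / 100.0)
    return consumed_megacycles_per_sec / r_msgs_per_sec

# Example with assumed numbers: two 3,000 MHz CPUs at 90 percent utilization
# processing 0.7 messages per second (~42 messages per minute).
print(round(cost_per_message_megacycles(3000, 2, 90, 0.7)))  # roughly 10,000 megacycles per message
```
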

From the tests, we observed that the performance decreases as the complexity of the schema grows. This result occurs because the parser retries parsing to find the best match when the schema has ambiguous definitions.

For an ambiguous schema, the complexity setting reduces overall throughput because of the best-match algorithm used during complex parsing. In these runs, throughput for the given input file dropped by almost 50 percent simply by changing the optimization from speed to complexity.

The same complex data definitions can also affect schema loading time and memory consumption. At load time, the parser grammar and parse table are generated automatically, and the number of states the parser requires increases dramatically. For certain large schemas (a few megabytes in size), you can therefore expect schema loading time to grow from minutes to hours.

Interchanges

This section describes the performance of large interchange file support in BizTalk Server 2004. An interchange is a single message that contains multiple messages. After BizTalk Server 2004 receives an interchange, it disassembles the interchange into its individual messages. This section provides details about the throughput involved with using single and multiple interchanges in BizTalk Server 2004.

To measure the performance and throughput associated with large interchange file support, the testing was conducted across two servers: one running BizTalk Server 2004 and the other running Microsoft SQL Server™ 2000.

Interchanges Scenario Description

The BizTalk Server is configured to run in three separate hosts: receive, send, and tracking. Note that the receive and send processes are running in separate hosts. All of the default settings are used except for the following transport configurations:

  • The receive port is bound to a send port (no orchestration).
  • The receive pipeline contains the Flat File Disassembler.
  • The send pipeline uses the default XML Assembler.
  • The message type is FlatFile Interchange(s).
  • Each message is a 1 KB flat file.

Latency is measured as the total time from interchange reception (when the interchange is dropped in the receive directory) to destination arrival (when the interchange is dropped in the outbound directory). The interchange size is defined as the total number of messages contained in the interchange.
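
The following minimal sketch illustrates one way to take this measurement with a simple drop-and-poll loop; the folder paths, file name, and expected output count are assumptions, and the actual test harness is not described in this document.

```python
# Illustrative sketch (not the original test harness): measuring interchange
# latency as the time from dropping the interchange into the receive folder
# until the expected output appears in the outbound folder.
import shutil
import time
from pathlib import Path

RECEIVE_DIR = Path(r"C:\BizTalkTest\Receive")    # hypothetical receive location
OUTBOUND_DIR = Path(r"C:\BizTalkTest\Outbound")  # hypothetical outbound folder

def measure_latency(interchange: Path, expected_outputs: int = 1,
                    poll_seconds: float = 5.0) -> float:
    """Return the seconds from interchange reception to destination arrival."""
    baseline = {p.name for p in OUTBOUND_DIR.glob("*")}
    start = time.monotonic()
    shutil.copy(interchange, RECEIVE_DIR / interchange.name)
    # Poll until the expected number of new output files has been written.
    while len({p.name for p in OUTBOUND_DIR.glob("*")} - baseline) < expected_outputs:
        time.sleep(poll_seconds)
    return time.monotonic() - start

if __name__ == "__main__":
    seconds = measure_latency(Path(r"C:\BizTalkTest\interchange_400000.txt"))
    print(f"Latency: {seconds / 3600:.2f} hours")
```
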

Interchanges Test Results

For BizTalk Server 2004, latency increases dramatically when the interchange size exceeds 400,000 messages (with 2.0 GB of RAM). Adding more memory allows you to support interchange sizes larger than 400,000 messages. The performance degradation beyond this interchange size is typically attributed to SQL Server rather than to BizTalk Server because SQL Server encounters lock contentions on the MessageBox database. The following figure shows the latency of single interchanges.

Figure 27 Latency of single interchanges


To reduce the latency for processing interchanges, reduce the size of each interchange (that is, divide large interchanges into multiple interchanges containing fewer messages). The following table shows the latency involved when a single interchange containing 400,000 messages is divided into multiple smaller interchanges.

Table 33 Latency relationship between interchange number and size

Number of Interchanges | Interchange Size (messages) | Latency (h:mm:ss)
1 | 400,000 | 2:40:19
4 | 100,000 | 2:34:50
8 | 50,000 | 2:03:22
20 | 20,000 | 1:41:26
40 | 10,000 | 1:35:49
80 | 5,000 | 1:31:02

While the number of interchanges might increase to accommodate the reduction in interchange size, latency is reduced and overall performance increases. The following figure shows the latency of multiple interchanges.

Figure 28 Latency of multiple interchanges


When supporting large interchanges in BizTalk Server 2004, multiple smaller interchanges utilize the CPU processor more efficiently than fewer large interchanges. As a general guideline, use the following formula to determine the maximum size of an interchange for any given deployment (number of CPU processors):

Maximum number of messages per interchange <= 200,000 / (Number of CPUs * BatchSize * MessagingThreadPoolSize)

The fixed value of 200,000 comes from dividing the typical maximum interchange size (400,000 messages) by 2 to account for the SQL Server locks held by each loaded thread, in addition to the locks held for each message in the interchange.

You can modify the following variables to enhance thread utilization and performance. The following sections provide procedures for setting the Batch Size value and specifying the messaging thread pool size:

  • Number of CPUs specifies the number of CPU processors in the BizTalk Server.
  • BatchSize specifies the number of messages per thread that should be received before BizTalk Server processes the interchange. The recommended value for BatchSize is 1.
  • MessagingThreadPoolSize is a registry key that defines the number of threads per processor on the BizTalk Server. The recommended value for MessagingThreadPoolSize is 1.
Note: You should modify these variables only on hosts that process large interchanges (when dropping multiple large interchanges to the receive location). Modifying these variables for other scenarios might result in performance degradation.

The example below shows how the formula might apply for a deployment that has the following characteristics:

  • Number of processors on the BizTalk Server is four.
  • BatchSize value is set to 1.
  • MessagingThreadPoolSize is set to 1.
Maximum number of messages per interchange = 200,000 / (4 × 1 × 1) = 50,000

Therefore, in the example deployment, you can drop any number of interchanges to the receive location as long as each interchange contains no more than 50,000 messages (200,000 divided by 4).
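
The following minimal sketch simply encodes the sizing guideline above; it is illustrative only and not part of BizTalk Server.

```python
# Illustrative sketch of the interchange-sizing guideline above.
def max_messages_per_interchange(cpu_count: int, batch_size: int = 1,
                                 messaging_thread_pool_size: int = 1) -> int:
    """Upper bound on the number of messages per interchange for a receive host."""
    return 200_000 // (cpu_count * batch_size * messaging_thread_pool_size)

# The example deployment above: four CPUs, BatchSize = 1, MessagingThreadPoolSize = 1.
print(max_messages_per_interchange(4))  # 50000
```
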

Setting the Batch Size Value

Use the following procedure to set the BatchSize value. This value is the desired maximum number of messages per thread that should be received before BizTalk Server processes the interchange.

To set the BatchSize value

  1. Click Start, point to All Programs, point to Microsoft Visual Studio .NET 2003, and then click Microsoft Visual Studio .NET 2003.
  2. In Visual Studio .NET 2003, open an existing BizTalk Server project.
  3. In the BizTalk Server project, click View, and then click BizTalk Explorer.
  4. In BizTalk Explorer, expand BizTalkMgmtDb.dbo, expand Receive Ports, expand the specific receive port for the project, expand Receive Location, expand the specific receive location for the project, right-click the receive location, and then click Edit.
  5. In the Receive Location Properties - Configuration - General dialog box, for Address URI, click the ellipsis (...) button.
  6. In the FILE Transport Properties dialog box, set the BatchSize value to 1, and then click OK. By default, this value is set to 20 for the FILE adapter.
Specifying the Messaging Thread Pool Size

Use the following procedure to create a registry key that specifies the messaging thread pool size. This value specifies a limit to the number of threads that a single processor can spawn.

To specify the messaging thread pool size

  1. Click Start, and then click Run.
  2. In the Run dialog box, in the Open box, type regedit, and then click OK.
  3. In Registry Editor, expand HKEY_LOCAL_MACHINE, expand SYSTEM, expand CurrentControlSet, expand Services, right-click BTSSvc*, point to New, click DWORD Value, type MessagingThreadPoolSize, and then press ENTER.
  4. In Registry Editor, double-click MessagingThreadPoolSize.
  5. In the Edit DWORD Value dialog box, in the Value data box, type 1, and then click OK.

Tracking

This section describes the performance impact of tracking in BizTalk Server 2004. The core BizTalk Server 2004 features (messaging, pipelines, orchestration, and business rules) can generate tracking information while processing messages. This tracking information allows you to monitor specific aspects of your application and system.

BizTalk Server 2004 stores all tracking data in the MessageBox database. The BizTalk Server tracking service is a dedicated Windows NT service application that moves tracking data from the MessageBox database to a separate Tracking database, and deletes the data from the message box. The BizTalk Server 2004 Health and Activity Tracking (HAT) tool and Business Activity Monitoring (BAM) use this Tracking database.

Performance Impact of Tracking with a Single Message Box

This section describes the deployment configuration and scenario descriptions for this series of tests. To measure the performance impact of tracking in BizTalk Server 2004, the testing involved one client computer and the following six servers:

  • One BizTalk Server for accepting messages (receive)
  • One BizTalk Server for processing messages (orchestration)
  • One BizTalk Server for sending messages (transmit)
  • One SQL Server for hosting the MessageBox database
  • One SQL Server for running the BizTalk Server tracking service
  • One SQL Server for hosting the Tracking databases

The following figure shows the deployment topology for the performance testing of tracking in BizTalk Server 2004.

Figure 29 Deployment topology for tracking performance testing

Tracking Configuration

To configure tracking, BizTalk Server 2004 provides various settings for the tracking service, tracking interceptors, the Health and Activity Tracking (HAT) tool, and Business Activity Monitoring (BAM). The following table shows the various tracking configurations that were set for all of the tests in the tracking section.

Table 34 Tracking configurations set for the tracking tests

Tracking Configuration | Tracking Service/Interceptors | HAT Pipeline Components/Message Properties | BAM Configuration
Default | On/On | Receive: Inbound (Off), Outbound (Off); Send: Inbound (Off), Outbound (Off) | N/A
Tracking service off | Off/On | N/A | N/A
Interceptors off | Off/Off | N/A | N/A
HAT 1 | On/On | Receive: Inbound (Off), Outbound (On); Send: Inbound (On), Outbound (Off); one promoted property | N/A
HAT 2 | On/On | Receive: Inbound (Off), Outbound (Off); Send: Inbound (Off), Outbound (Off); one promoted property | N/A
HAT 3 | On/On | Receive: Inbound (On), Outbound (On); Send: Inbound (On), Outbound (On); one promoted property | N/A
BAM (no HAT) | On/On | N/A | 4 activities, 4 data items
BAM (with HAT 3) | On/On | Receive: Inbound (On), Outbound (On); Send: Inbound (On), Outbound (On); one promoted property | 4 activities, 4 data items

By default, BizTalk Server 2004 enables the tracking service and tracking interceptors. You can disable the tracking service and/or interceptors through the BizTalk Server Administration tool, or the BizTalk Server Windows Management Instrumentation (WMI) interfaces.

When the tracking service is disabled, BizTalk Server stops moving tracking data from the MessageBox database to the Tracking database. With the tracking interceptors disabled, BizTalk Server stops collecting tracking data in the message box.

To enable or disable the tracking of pipeline components and message properties, use the HAT configuration menu. BizTalk Server automatically enables BAM tracking after you define business activities and associate them with orchestrations.

Single Message Box Scenario Description

To test the performance impact of tracking in BizTalk Server 2004 with a single message box, two scenarios were used: messaging and orchestration. This section describes each scenario.

Messaging Scenario with One Message Box

In this messaging scenario, BizTalk Server 2004 receives a document from a file folder and routes it to other folders. To perform the routing, BizTalk Server uses a filter based on a promoted property of the message content.

After receiving the document, BizTalk Server processes it with filters, transformation maps, and XML disassembler pipeline components. BizTalk Server uses the XML pass-through pipeline component for sending the document. This scenario does not involve orchestration; the next section describes an orchestration scenario. The data flow diagram used for testing BizTalk Server 2004 tracking in a messaging scenario with one message box is the same as the data flow diagram used for the scalability testing of the message box, which is described in the earlier topic, "Scaling Up SQL Server Scenario Description."

Orchestration Scenario with One Message Box

In this orchestration scenario, BizTalk Server 2004 uses a basic orchestration to determine whether inbound requests should be accepted or rejected. The determination is made based on one field of the request, and a response is sent accordingly. The orchestration scenario uses the same pipeline components as the previous messaging scenario. For each document that it receives, BizTalk Server processes one document and sends one document. The following figure shows the data flow diagram for the performance testing of BizTalk Server 2004 tracking in an orchestration scenario with one message box.

Figure 30 Data flow diagram for orchestration with one message box

Test Results for Tracking with One Message Box

This section shows the test results for the messaging and orchestration scenarios. The following table shows the test results for the messaging scenario.

Table 35 Test results for the messaging scenario

Tracking Configuration | Documents Received per Second | Documents Processed per Second
Default | 118.17 | 216.7
Tracking service off | 118.13 | 236.8
Interceptors off | 122.16 | 242.2
HAT 1 | 112.17 | 218.7
HAT 2 | 78.2 | 138.4
HAT 3 | 79.1 | 155.5

The bottleneck for this scenario and deployment configuration was the CPU processor utilization on the orchestration server. The throughput statistics were obtained by maximizing the CPU processor utilization on the orchestration server. You can alleviate this bottleneck by adding servers to distribute the workload. The following table shows the test results for the orchestration scenario.

Table 36 Test results for the orchestration scenario

Tracking Configuration | Documents Processed per Second | Percent Change from Default Configuration
Default | 144.59 | 0.00%
Tracking service off | 169.64 | 17.32%
Interceptors off | 169.06 | 16.92%
HAT 1 | 141.62 | -2.06%
HAT 2 | 132.87 | -8.11%
HAT 3 | 126.70 | -12.37%
BAM (no HAT) | 138.00 | -4.56%
BAM (with HAT 3) | 111.12 | -23.15%

For the scenarios above, all tracking configurations had similar performance except the configuration where tracking was disabled. The orchestration scenario provided better performance than the messaging scenario because the messaging scenario involved transformation maps and processed two documents for each document received (the orchestration scenario sent one document for each document received). The orchestration scenario also included an additional server dedicated to processing orchestrations, thereby increasing the capacity of BizTalk Server to receive, process, and send documents more efficiently.

In the messaging scenario, the default tracking configuration had minimal impact on throughput. Throughput increased by 6 percent when the tracking service was off, and by 13 percent when the interceptors were off as well. Message-body tracking in the HAT 1 configuration (where tracking is on for the receive outbound pipeline and the send inbound pipeline) degraded performance by less than 2 percent compared with the default tracking configuration. The default tracking configuration provided 36 percent and 31 percent more throughput than the HAT 2 and HAT 3 configurations, respectively.

In the orchestration scenario, throughput increased by approximately 17 percent when the tracking service and the interceptors were turned off. The default tracking configuration provided up to 15 percent more throughput than message-body tracking in the HAT configurations: relative to the default, the HAT 1, 2, and 3 configurations reduced throughput by 2 percent, 8 percent, and 12 percent, respectively (for example, for HAT 3: (126.70 - 144.59) / 144.59 = -12.4 percent). BAM tracking alone (no HAT) reduced throughput by about 5 percent relative to the default, and BAM tracking combined with HAT 3 message-body tracking reduced it by about 23 percent.

Performance Impact of Tracking with Multiple Message Boxes

BizTalk Server 2004 supports one Tracking database per application, regardless of the number of message boxes. In deployments with multiple message boxes, the amount of data that can be tracked multiplies. Therefore, tracking has a significant impact on performance when deployments involve multiple message boxes. To measure this performance impact, a series of tests was run with the following servers:

  • Two BizTalk Server computers for receiving messages (also running the tracking service)
  • Six BizTalk Server computers for sending messages (transmit)
  • Four SQL Server computers for hosting the MessageBox database

This series of tests used the same scenario as the scale-out messaging scenario, which is described in the earlier topic, "Scaling Out SQL Server." It involved the following three tests:

  • Disabled tracking, to measure throughput under a high sustained workload
  • Used the default tracking configuration (tracking enabled) and archived the Tracking database
  • Used the default tracking configuration (tracking enabled) without archiving the Tracking database

Test Results for Tracking with Multiple Message Boxes

When tracking is off, BizTalk Server 2004 can process 516 documents per second at an incoming rate of 258 received documents per second (test duration was over 27 hours). When the incoming rate is increased, processing speed decreases but publishing rate increases. Improving overall performance requires finding the right balance between processing and publishing rates, and minimizing message accumulation in the application queue.

In the second series of tests, where tracking was on and the Tracking database was archived, the file receive rate was decreased to 181 to 184 documents per second to achieve a sustainable workload, and the test ran for over 17 hours. If a consistent load persists for longer periods of time, BizTalk Server encounters performance degradation until the load is alleviated. The degradation is most noticeable when the processing and publishing rates far exceed the rate at which the tracking service moves tracking data. For this configuration, the tracking service can move 441 messages per second.

In the third series of tests, where tracking was on and the Tracking database was not archived, the test ran for 10 hours at an incoming rate of 181 to 184 documents received per second. For this configuration, the tracking service can move 119 messages per second, and the total average throughput over the test run was 260 documents processed per second.

Business Rules

This section describes the throughput of policy (rule set) execution in the Business Rule Engine. The testing involved one client computer, one BizTalk Server, and one SQL Server hosting the message box. The deployment topology used for the testing of the Business Rule Engine in BizTalk Server 2004 was the same as the topology used for the testing of the transport adapters, which is described in the earlier topic, "Transport Adapters."

Business Rules Scenario Description

The Policy object model was used in all test scenarios in this section. The policy execution was initiated from multiple threads, each thread using a unique Policy object instance; for example:

// Thread n
Policy polObj = new Policy("SamplePolicy");   // "SamplePolicy" is a placeholder policy name
for (int i = 0; i < NIterations; i++)
{
    polObj.Execute(REFacts);                  // REFacts: the array of facts asserted on each execution
}
polObj.Dispose();
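
To put the fragment in context, the following sketch shows one way the test harness could spin up multiple threads, each with its own Policy instance, execute the policy repeatedly, and then dispose of it. The policy name ("SamplePolicy"), the thread and iteration counts, and the CreateFacts helper are illustrative assumptions rather than the exact values or code used in the tests.

using System.Threading;
using Microsoft.RuleEngine;

class PolicyLoadTest
{
    const int NThreads = 4;          // the tests in this section use four execution threads
    const int NIterations = 10000;   // arbitrary iteration count for the load loop

    static void Main()
    {
        Thread[] threads = new Thread[NThreads];
        for (int t = 0; t < NThreads; t++)
        {
            threads[t] = new Thread(RunPolicy);
            threads[t].Start();
        }
        foreach (Thread worker in threads)
        {
            worker.Join();
        }
    }

    static void RunPolicy()
    {
        object[] reFacts = CreateFacts();            // hypothetical helper that builds the fact array
        Policy polObj = new Policy("SamplePolicy");  // each thread uses its own Policy instance
        for (int i = 0; i < NIterations; i++)
        {
            polObj.Execute(reFacts);
        }
        polObj.Dispose();
    }

    static object[] CreateFacts()
    {
        // Placeholder: return whatever fact objects the deployed policy's rules expect.
        return new object[0];
    }
}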

Business Rules Test Results

Facts are pieces of information about the world, and they can originate from many sources, such as event systems, objects in business applications, and database tables. The Business Rule Engine accepts and operates on the following types of facts (the sketch after this list shows how each fact type might be passed to a policy):

  • .NET objects (methods, properties, and fields)
  • XML documents (elements, attributes, and document subsections)
  • Database row sets (values from table columns)
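
As a concrete illustration of these fact types, the following sketch passes a .NET object, a TypedXmlDocument, and a DataConnection to a single policy execution. The class name SampleOrder, the document type "Schemas.Order", the input file, connection string, dataset, table, and policy names are all hypothetical, and the constructor signatures shown should be verified against the Microsoft.RuleEngine documentation.

using System.Data.SqlClient;
using System.Xml;
using Microsoft.RuleEngine;

class FactTypeSample
{
    static void Main()
    {
        // 1. A .NET object fact: any instance whose fields, properties, or methods the rules reference.
        SampleOrder order = new SampleOrder();                      // SampleOrder is a hypothetical business class
        order.Quantity = 10;

        // 2. An XML fact: wrap the document in a TypedXmlDocument together with its document type.
        XmlDocument doc = new XmlDocument();
        doc.Load("order.xml");                                      // hypothetical input document
        TypedXmlDocument xmlFact = new TypedXmlDocument("Schemas.Order", doc);

        // 3. A database fact: expose a table through a DataConnection over an open connection.
        SqlConnection conn = new SqlConnection("Integrated Security=SSPI;Database=SampleDb;Server=.");
        conn.Open();
        DataConnection dbFact = new DataConnection("SampleDataSet", "CustomerTable", conn);

        // Assert all three facts in a single policy execution.
        Policy policy = new Policy("FactTypesPolicy");              // placeholder policy name
        policy.Execute(new object[] { order, xmlFact, dbFact });
        policy.Dispose();

        conn.Close();
    }
}

class SampleOrder
{
    public int Quantity;   // example field that a rule condition might test
}
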
.NET Classes

The policy used in this test contains rules like the following:

If ( ClassN.Attribute >= constant ) 
Then {Do something}
Note: Different .NET classes were used for the rules in the policy.

The application used to run the test executes the policy from four different threads. The average CPU utilization on the BizTalk Server hosting the Business Rule Engine is approximately 80 percent. The following graph shows the results of the test.

Figure 31 Business rules test results for .NET classes

XML Documents

The policy used in this test contains rules like the following:

If (XmlDocumentMemberN.Field >= constant ) 
Then { Do something}

The application used to run the test executes the policy from four different threads. The average CPU utilization on the BizTalk Server hosting the Business Rule Engine is approximately 90 percent. The following graph shows the results of the test.

Figure 32 Business rules test results for XML documents

Data Connections and Tables

The Business Rule Engine supports the following three database-related types (a short sketch after the list shows how each binding is constructed):

  • TypedDataRow. Constructed by using a reference to an ADO.NET DataRow instance. The TypedDataRow is an obvious choice for rules that deal only with data from one or a small number of rows from a particular table.
  • TypedDataTable. Literally a collection of TypedDataRow objects. Each row in the database table is wrapped as a TypedDataRow and asserted into the working memory by the rule engine.
    A TypedDataTable requires an in-memory ADO.NET DataTable, which can be a performance overhead if that DataTable contains a very large number of rows. If only a small number of rows in the database table are relevant and you can determine those rows before calling the rules, use TypedDataRow objects instead; the underlying assumption of TypedDataTable is that a high number of rows in the DataTable are relevant to the rules.
  • DataConnection. Represents a table in a database accessed through a database connection.
    The difference between DataConnection and TypedDataTable is that in addition to the dataset name and table name, DataConnection requires a usable database connection and optionally a database transaction context.
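
To make the comparison concrete, the following sketch builds all three bindings from the same hypothetical table, Table1, in a dataset named SampleDataSet. In practice a policy is written against only one of these bindings; the names, connection string, and constructor overloads here are assumptions for illustration only.

using System.Data;
using System.Data.SqlClient;
using Microsoft.RuleEngine;

class DatabaseBindingSample
{
    static void Main()
    {
        SqlConnection conn = new SqlConnection("Integrated Security=SSPI;Database=SampleDb;Server=.");
        conn.Open();

        // Load Table1 into an in-memory DataSet/DataTable (needed for the row and table bindings).
        DataSet ds = new DataSet("SampleDataSet");
        DataTable table = ds.Tables.Add("Table1");
        new SqlDataAdapter("SELECT * FROM Table1", conn).Fill(table);

        // TypedDataRow: wrap a single, pre-selected row.
        TypedDataRow rowFact = new TypedDataRow(table.Rows[0]);

        // TypedDataTable: wrap the whole in-memory table; each row is asserted as a TypedDataRow.
        TypedDataTable tableFact = new TypedDataTable(table);

        // DataConnection: let the engine query Table1 through the live connection instead.
        DataConnection connFact = new DataConnection("SampleDataSet", "Table1", conn);

        // Execute the policy against one binding; the DataConnection is used here, and
        // rowFact and tableFact are shown only to illustrate how they are constructed.
        Policy policy = new Policy("DatabaseBindingPolicy");   // placeholder policy name
        policy.Execute(connFact);
        policy.Dispose();

        conn.Close();
    }
}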

The test scenarios included in this section capture the performance differences between data connection and data table bindings and show which of these bindings is more efficient depending on the scenario.

Increasing the Number of Data Rows

This test used the following rule:

If (Table1.Column1 = 1) 
Then Class1.Property1 = Table1.Column1

The test started with a few rows in the test table and gradually increased the number of rows. The purpose of this test was to show how the data connection binding becomes more efficient than data table binding when the data set becomes larger. The following graph shows the results of the test.

Figure 33 Business rules test results for data connections and tables

Increasing the Number of Conditions

This test used a policy that contained 100 rules, and changed the number of unique conditions in the rules.


Rule N
If (Table1.Column1 = (N Mod NumberofUniqueConditions))
Then Class1.Property1 = Table1.Column1

The purpose of the test was to show how the data connection binding becomes much more efficient when the rule set uses shared conditions/queries. The following graph shows the results of the test.

Figure 34 Business rules test results for increasing the number of conditions

Increasing Query Selectivity

For this test, multiple policies were used, and each policy contained only one rule:

If (Table1.Column1 <= query_value) 
Then Class1.Property1 = Table1.Column1

The purpose of this test was to show that data connection bindings are more efficient when more restrictive queries/conditions are used. However, the performance of the data table and data connection bindings converges as the queries/conditions become less restrictive. The following graph shows the results of the test.

Figure 35 Business rules test results for increasing query selectivity

Optimization

Consider the following guidelines when optimizing DataConnection bindings:

  • Use primary keys. When there is a primary key, the equality of two rows is determined by whether the rows have the same primary key, rather than by object comparison. If the rows are determined to be the same, only one copy is retained in memory, and the other is released (resulting in less memory consumption).
    When a DataConnection is asserted into the rule engine for the first time, the engine always tries to locate its primary key information from its schema. If a primary key exists, primary key information is retrieved and used in all subsequent evaluations.
  • Provide running transactions. Without a transaction, each query and update on the DataConnection will initiate its own local transaction, and different queries might return different results in different parts of rule evaluations. Users may experience inconsistent behavior if there are changes in the underlying database table.
    Although you can use a DataConnection without providing a transaction when the table does not change over time, it is recommended that you use a transaction even when the DataConnection is used only for read operations. A minimal sketch after this list shows a DataConnection created with an explicit transaction.
  • Use OR conditions with caution. If a rule uses only conjunctive (AND) conditions, tests and queries are executed as early as possible, so the number of object instances that pass through is reduced. As a result, the number of queries against the subsequent DataConnection is reduced proportionally. If disjunctive (OR) conditions and a DataConnection are used together in a rule, all condition evaluations are pushed to the final query. If more than one DataConnection is used in the rule, all queries except the last one effectively become SELECT-ALL query statements.
    In general, it is better to split any rule with an OR condition into two or more discrete rules, because the use of OR conditions decreases performance compared to the definition of more atomic rules. This observation is true whether or not DataConnections are used.
    You might also consider using separate rules that consist only of conjunctive conditions instead of one rule with OR conditions. With OR conditions, the number of queries grows with the product of the number of instances of all the joined objects, as shown in the following example:
    IF (A.x == 7 OR A.x == 8) AND DC.y == A.y
    THEN DC.z = 10
    
    In this example, A represents an ObjectBinding, DC represents a DataConnection, and x, y, and z represent attributes of A and DC. If A has 100 instances, and x is 1 in the first object, 2 in the second object, and so on through 100 in the 100th object, then 100 queries have to run against the DataConnection.
    It is better to rewrite the preceding rule by splitting it into two rules:
    
    Rule 1
    IF A.x == 7 AND DC.y == A.y
    THEN DC.z = 10

    Rule 2
    IF A.x == 8 AND DC.y == A.y
    THEN DC.z = 10
    
  • Use DataConnection instead of TypedDataTable. In many scenarios, using DataConnection provides better performance and consumes less memory than using TypedDataTable. However, TypedDataTable may be required in some cases because of certain restrictions on using DataConnection. In some other cases, using TypedDataTable may yield better performance than using DataConnection.
    Use TypedDataTable instead of DataConnection under the following conditions:
    • Data changes need to be made but the table does not have a primary key. To make data changes by using DataConnection, a primary key is required. Therefore, if there is no primary key, TypedDataTable is the only viable approach.
    • Selectivity is high, which means that a large percentage of rows in the table will pass the tests specified as rule conditions. In this case, DataConnection does not provide much benefit and it may perform worse than TypedDataTable.
    • The table is small—typically, a table that contains fewer than 500 rows. Note that this number could be larger or smaller depending on the rule shape and on the memory available to the rule engine.
    • Rule-chaining behavior is expected in the policy. Calling the Update function on a DataConnection is not supported directly (although you could invoke DataConnection.Update from a rule through a helper method), so when rule chaining is required, TypedDataTable is a better choice.
    • One or more columns in the table hold a very large amount of data that is not required by the rules. An example is an image database, where the columns hold the image (large amount of data), name, date, and so on. If the image is not required, it may be better to select only the columns needed by the rules. For example, issuing a query such as "SELECT Name, Date from TABLE" can be more efficient than using DataConnection.
    • Many rules need to read or update the same database row. With a TypedDataTable, the row is shared among all the rules, and if the condition is the same (for example, Table.Column == 5), the condition evaluation can be optimized. With a DataConnection, a query is generally generated for each rule that uses the DataConnection; although the rows are reused (if the table has a primary key), multiple queries may be generated to retrieve the same data.
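
The following sketch ties the primary-key and transaction guidelines together: it opens a connection, starts an explicit transaction, wraps a keyed table (Table1) in a DataConnection, executes a policy, and then commits. The connection string, names, and policy are placeholders, and the constructor overload that accepts a SqlTransaction is shown as an assumption to be verified against the Microsoft.RuleEngine documentation.

using System.Data.SqlClient;
using Microsoft.RuleEngine;

class DataConnectionWithTransaction
{
    static void Main()
    {
        // Table1 is assumed to have a primary key so the engine can de-duplicate the rows it reads.
        SqlConnection conn = new SqlConnection("Integrated Security=SSPI;Database=SampleDb;Server=.");
        conn.Open();

        // A running transaction gives every query the engine issues a consistent view of Table1.
        SqlTransaction txn = conn.BeginTransaction();
        DataConnection dc = new DataConnection("SampleDataSet", "Table1", conn, txn);

        Policy policy = new Policy("SamplePolicy");   // placeholder policy name
        try
        {
            policy.Execute(dc);
            txn.Commit();                             // commit once rule evaluation completes
        }
        catch
        {
            txn.Rollback();
            throw;
        }
        finally
        {
            policy.Dispose();
            conn.Close();
        }
    }
}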