Generate reports and charts with performance counters with the Performance Analysis of Logs (PAL) Tool for BizTalk Server

The PAL (Performance Analysis of Logs) tool reads in a performance monitor counter log (any known format) and analyzes it using complex but known thresholds (provided). The tool generates an HTML-based report that graphically charts important performance counters and throws alerts when thresholds are exceeded. The thresholds are originally based on thresholds defined by the Microsoft product teams, including BizTalk Server, and members of Microsoft support. This tool is not a replacement for traditional performance analysis, but it automates the analysis of performance counter logs enough to help save you time. The PAL tool:

  • Analyzes performance counter logs for thresholds

  • Is helpful for large Perfmon logs

  • Identifies BizTalk Server and operating system performance counter bottlenecks by analyzing for thresholds

  • Is extensible to do analysis on any performance counters

  • Can be used to help write your own counter

    PAL is available as a free download at GitHub. It requires Microsoft Log Parser. Log Parser is a powerful, versatile tool that provides universal query access to text-based data such as log files, XML files, and CSV files, as well as key data sources on the Windows operating system such as the event log, the registry, the file system, and Active Directory® directory service. You may want to use this tool to query a significant amount of logging information. You can download the Log Parser tool.

Using PAL with Performance Counter Logs in Different Languages

The PAL tool analyzes performance counter logs only in English. To use the PAL tool with performance counter logs in other languages, you must first translate the logs to English. You can use the Perfmon Log Translator to translate the original performance counter log files to English.

Understanding the PAL Tool Report for Microsoft BizTalk Server 2010

The PAL tool provides Perfmon log analysis of thresholds for the Windows operating system, Microsoft SQL Server, and BizTalk Server. This section describes most of the analyses in the BizTalk Server threshold report in the PAL tool.

Note

This topic is long so that comprehensive information about the PAL tool can be contained in one place for easy reference.

The following analysis and thresholds are reported by the PAL tool.

Analysis Type and Name Analysis Description
Disk: Disk Free Space for a Kernel Dump This analysis checks to make sure there is enough free disk space for the operating system to dump all memory to disk. If insufficient disk space is available, then the operating system will fail to create a memory.dmp file, which is necessary to analyze the root cause of a kernel dump.
Disk: Logical/Physical Disk Interface Analysis This analysis looks at the idle time of each of the physical disks. The more idle the disk is, the less the disk is being used. This counter is best used when one disk is used in the logical disk. “% Idle Time” reports the percentage of time during the sample interval that the disk was idle.

Reference: Ruling Out Disk-Bound Problems
Disk: Logical/Physical Disk Read/Write Latency Analysis The most reliable way for Windows to detect a disk performance bottleneck is by measuring its response times. If the response times are greater than 0.025 seconds (25 milliseconds), which is a conservative threshold, then noticeable slow-downs and performance issues affecting users may be occurring. For more information, refer to Logical/Physical Disk Read/Write Latency Analysis in this topic.
Disk: Logical Disk Transfers/sec “Disk Transfers/sec” is the rate of read and write operations on the disk. The thresholds for this analysis check to see whether any of the logical disks are showing poor response times (greater than 25 ms response times for I/O operations). If this is true, then the disk transfers per second should be at or above 80. If not, then the disk architecture needs to be investigated. The most common cause of poor disk I/O is LUN overloading on the SAN. For more information, refer to Logical Disk Transfers/sec in this topic.
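The rule described in the two disk rows above can be sketched in a few lines of Python. This is only an illustration of the logic, not PAL's own implementation; the function name and parameter names are hypothetical, and the inputs are assumed to be averages already extracted from the counter log (response time in seconds, transfers per second).

```python
# Thresholds quoted in the text above.
LATENCY_THRESHOLD_SEC = 0.025   # 25 ms response time
MIN_TRANSFERS_PER_SEC = 80      # expected floor when response times are poor


def disk_needs_investigation(avg_sec_per_transfer: float,
                             transfers_per_sec: float) -> bool:
    """Return True when a logical disk shows poor response times
    while handling fewer than 80 transfers per second.

    Both arguments are hypothetical inputs: averages taken from the
    counter log for one logical disk.
    """
    slow = avg_sec_per_transfer > LATENCY_THRESHOLD_SEC
    return slow and transfers_per_sec < MIN_TRANSFERS_PER_SEC
```

A disk that is slow but sustaining 80 or more transfers per second is simply busy; the sketch only flags the case where latency is poor at low throughput, which the text associates with problems in the disk architecture.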
Disk: LogicalDisk % Free Space “% Free Space” is the percentage of total usable space that was free on the selected logical disk drive. Performance should not be affected until the available disk drive space is less than 30 percent. When 70 percent of the disk drive is used, the remaining free space is located closer to the disk's spindle at the center of the disk drive, which operates at a lower performance level. Lack of free disk space can cause severe disk performance problems.

Disk: Process IO Data Operations/sec and Process IO Other Operations/sec Analysis These counters count all I/O activity generated by the process to include file, network and device I/Os. These analyses check when processes are doing more than 1,000 I/O’s per second and flag it as a warning. These analyses are best used in correlation with other analyses such as disk analysis to determine which processes might be involved in the I/O activity.
Memory: Available Memory This analysis checks whether the total available memory is low – Warning at 10 percent available and Critical at 5 percent available. A warning is also alerted when a decreasing trend of 10 MB’s per hour is detected, which can indicate a potential upcoming memory condition. Low physical memory can cause increased privileged mode CPU and system delays. For more information, refer to Available Memory Analysis in this topic.
Memory: Free System Page Table Entries Free System Page Table Entries (PTE’s) are the number of page table entries not currently used by the system. This analysis determines whether the system is running out of PTE’s by checking whether there are fewer than 5,000 free PTE’s, and it raises a Warning if there are fewer than 10,000 free PTE’s. Lack of enough PTE’s can result in systemwide hangs. Also note that the /3GB switch will lower the number of free PTE’s significantly. For more information, refer to Free System Page Table Entries Analysis in this topic.
Memory: Memory Leak Detection This analysis determines whether any of the processes are consuming a large amount of the system's memory and whether the process is increasing in memory consumption over time. A process consuming large portions of memory is fine as long as the process returns the memory back to the system. Look for increasing trends in the chart. An increasing trend over a long period of time could indicate a memory leak. “Private Bytes” is the current size, in bytes, of memory that this process has allocated that cannot be shared with other processes. This analysis checks for 10 MB’s per hour and 5 MB’s per hour increasing trends. Use this analysis in correlation with the Available Memory analysis in PAL. For more information, refer to Memory Leak Detection Analysis in this topic.
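To make the idea of an "increasing trend" in Private Bytes concrete, the following Python sketch estimates a growth rate in MB per hour and compares it with the 10 MB and 5 MB per hour figures mentioned above. It uses a plain least-squares slope, which is an assumption chosen for illustration; the text does not say how PAL computes its trends. The function names, the sample format, and the use of "greater than or equal to" for the comparison are likewise hypothetical.

```python
from typing import Optional, Sequence, Tuple

MB = 1024 * 1024
TREND_THRESHOLDS_MB_PER_HOUR = (10, 5)  # checked from highest to lowest


def trend_mb_per_hour(samples: Sequence[Tuple[float, float]]) -> float:
    """Least-squares slope of (elapsed_seconds, private_bytes) samples,
    converted to MB per hour. Returns 0.0 when no slope can be computed."""
    n = len(samples)
    if n < 2:
        return 0.0
    mean_t = sum(t for t, _ in samples) / n
    mean_b = sum(b for _, b in samples) / n
    num = sum((t - mean_t) * (b - mean_b) for t, b in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        return 0.0
    bytes_per_sec = num / den
    return bytes_per_sec * 3600 / MB


def highest_trend_exceeded(samples: Sequence[Tuple[float, float]]) -> Optional[int]:
    """Return the highest threshold (10 or 5 MB/hour) that the trend
    meets or exceeds, or None when growth is below both."""
    slope = trend_mb_per_hour(samples)
    for threshold in TREND_THRESHOLDS_MB_PER_HOUR:
        if slope >= threshold:
            return threshold
    return None
```

A steady slope over a long log is the interesting case; a process that grows and later returns memory to the system produces a slope near zero and is not flagged by this sketch.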
Memory: Handle Leak Detection This analysis checks all of the processes to determine how many handles each has open and to determine whether a handle leak is suspected. A process with a large number of handles and/or an aggressive upward trend could indicate a handle leak, which typically results in a memory leak. The total number of handles currently open by this process is equal to the sum of the handles currently open by each thread in this process.

Reference: Debug Diagnostic Tool
Memory: Memory Pages Input/sec “Pages Input/sec” is the rate at which pages are read from disk to resolve hard page faults. Hard page faults occur when a process refers to a page in virtual memory that is not in its working set or elsewhere in physical memory, and must be retrieved from disk. This analysis checks whether there are more than 10 page file reads per second.
Memory: Memory Pages/sec This analysis checks to see whether the “Pages/sec” is high (above 1,000). If it is high, then the system is likely running out of memory by trying to page the memory to the disk. “Pages/sec” is the rate at which pages are read from or written to disk to resolve hard page faults. This counter is a primary indicator of the kinds of faults that cause systemwide delays. Use this analysis in correlation with Available Memory analysis and Memory Leak Detection analysis in PAL. If all of these analyses are throwing alerts at the same time, then the system may be running out of memory. In that case, identify the suspected processes involved and follow the analysis steps mentioned in the Memory Leak Detection analysis in PAL.

For more information, refer to Memory Pages/sec Analysis in this topic.
Memory: Memory System Cache Resident Bytes “System Cache Resident Bytes” is the size, in bytes, of the pageable operating system code in the file system cache. This value includes only current physical pages and does not include any virtual memory pages not currently resident. This value is a component of “Memory\System Code Resident Bytes” which represents all pageable operating system code that is currently in physical memory. This counter displays the last observed value only; it is not an average. This analysis checks for an increasing trend of 10 MB’s per hour. Under load, a server might use the system cache in order to cache I/O activity such as disk. Use in correlation with Process IO Data Operations/sec and Process IO Other Operations/sec analyses in PAL.

Reference: File Cache Performance and Tuning
Memory: Pool Non Paged Bytes “Pool Nonpaged Bytes” is the size, in bytes, of the non-paged pool, an area of system memory for objects that cannot be written to disk, but must remain in physical memory as long as they are allocated. This analysis checks to see whether the system is coming close to the maximum pool non paged memory size. It does this by estimating the pool sizes taking into consideration /3GB, physical memory size, and 32-bit/64-bit, then determining whether the value is higher than 60 percent of the estimated pool size. If the system becomes close to the maximum size, then the system could experience system wide hangs.

The /3GB switch option in the boot.ini file significantly reduces the size of this memory pool.

For more information, refer to Pool Non-Paged Bytes Analysis in this topic.
Memory: Pool Paged Bytes This analysis checks to see whether the system is coming close to the maximum pool paged memory size. It does this by estimating the pool sizes taking into consideration /3GB, physical memory size, and 32-bit/64-bit, then determining whether the value is higher than 60 percent of the estimated pool size. If the system becomes close to the maximum size, then the system could experience system wide hangs.

The /3GB switch option in the boot.ini file significantly reduces the size of this memory pool.

For more information, refer to Pool Paged Bytes Analysis in this topic.
Memory: Process Thread Count This analysis checks all of the processes to determine whether a process has more than 500 threads and if the number of threads is increasing by 50 threads per hour. A process with a large number of threads and/or an aggressive upward trend could indicate a thread leak which typically results in either a memory leak or high context switching. High context switching will result in high privileged mode CPU.
Memory: Process Working Set “Working Set” is the current size, in bytes, of the working set of this process. The working set is the set of memory pages touched recently by the threads in the process. If free memory in the computer is above a threshold, pages are left in the working set of a process even if they are not in use. When free memory falls below a threshold, pages are trimmed from working sets. If they are needed they will then be soft-faulted back into the working set before leaving main memory. This analysis checks for an increasing trend of 10 MB’s or more in each of the processes. Use in correlation with Available Memory analysis in PAL.

Reference: Finding and eliminating bottlenecks
Network: Network Output Queue Length Analysis This analysis checks to see how many threads are waiting on the network adapter. If a lot of threads are waiting on the network adapter, then the system is probably saturating the network I/O due to network latency or network bandwidth. “Output Queue Length” is the length of the output packet queue (in packets). Delays are indicated if this is longer than two, and the bottleneck should be found and eliminated, if possible. Typical causes of network output queuing include high numbers of small network requests and network latency.
Network: Network Utilization Analysis “Bytes Total/sec” is the rate at which bytes are sent and received over each network adapter, including framing characters. “Network Interface\Bytes Total/sec” is a sum of “Network Interface\Bytes Received/sec” and “Network Interface\Bytes Sent/sec”. This counter helps you know whether the traffic at your network adapter is saturated and whether you need to add another network adapter. How quickly you can identify a problem depends on the type of network you have as well as whether you share bandwidth with other applications. This analysis converts “Bytes Total/sec” to bits and compares it to the current bandwidth of the network adapter to calculate network utilization. Next, it checks for utilization above 50 percent.

Reference: Measure performance using EventCounters in .NET Core
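The utilization calculation described in the Network Utilization Analysis row can be written out as a short Python sketch. It assumes the adapter's bandwidth is given in bits per second, as reported by the “Network Interface\Current Bandwidth” counter; the function names are hypothetical and the code is an illustration of the arithmetic, not PAL's implementation.

```python
UTILIZATION_THRESHOLD_PERCENT = 50.0  # threshold quoted in the text above


def network_utilization_percent(bytes_total_per_sec: float,
                                current_bandwidth_bits_per_sec: float) -> float:
    """Convert Bytes Total/sec to bits and express it as a percentage
    of the adapter's current bandwidth (bits per second)."""
    if current_bandwidth_bits_per_sec <= 0:
        raise ValueError("bandwidth must be a positive number of bits per second")
    bits_per_sec = bytes_total_per_sec * 8
    return bits_per_sec / current_bandwidth_bits_per_sec * 100


def exceeds_utilization_threshold(bytes_total_per_sec: float,
                                  current_bandwidth_bits_per_sec: float) -> bool:
    """Return True when utilization is above 50 percent."""
    utilization = network_utilization_percent(
        bytes_total_per_sec, current_bandwidth_bits_per_sec)
    return utilization > UTILIZATION_THRESHOLD_PERCENT
```

For example, an adapter reporting 1,000,000,000 bits per second that carries 75,000,000 bytes per second is moving 600,000,000 bits per second, or about 60 percent of its bandwidth, and would be flagged.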
Paging File: Paging File % Usage and % Usage Peak The amount of the page file instance in use in percent. See also “Process\Page File Bytes”. This analysis checks whether the percentage of usage is greater than 70 percent.
Processor: Processor Utilization Analysis and Excessive Processor Use by Processes This counter is the primary indicator of processor activity and displays the average percentage of busy time observed during the sample interval. It is calculated by monitoring the time that the service is inactive and subtracting that value from 100 percent. This analysis checks for utilization greater than 60 percent on each processor. If so, determine whether it is high user mode CPU or high privileged mode. If high privileged mode CPU is suspected, then see the Privileged Mode CPU analysis in PAL. If a user-mode processor bottleneck is suspected, then consider using a process profiler to analyze the functions causing the high CPU consumption.
Processor: Processor Queue Length This analysis determines whether the average processor queue length exceeds the number of processors. If so, then this could indicate a processor bottleneck. Use this analysis in correlation with Privileged Mode CPU analysis and Excessive Processor Use by Process analysis in PAL. For detailed information, refer to Processor Queue Length Analysis in this topic.
Processor: Privileged Mode CPU Analysis This counter indicates the percentage of time a thread runs in privileged mode. When your application calls operating system functions (for example to perform file or network I/O or to allocate memory), these operating system functions are executed in privileged mode. This analysis checks to see whether privileged mode CPU is consuming more than 30 percent of total CPU. If so, then the CPU consumption is likely caused by another bottleneck other than the processor such as network, memory, or disk I/O. Use in correlation with Processor: % Interrupt Time and Processor: High Context Switching analyses in PAL. For more information, refer to Privileged Mode CPU Analysis in this topic.
Processor: Interrupt Time “% Interrupt Time” is the time the processor spends receiving and servicing hardware interrupts during sample intervals. This value is an indirect indicator of the activity of devices that generate interrupts, such as the system clock, the mouse, disk drivers, data communication lines, network interface cards and other peripheral devices. These devices normally interrupt the processor when they have completed a task or require attention. Normal thread execution is suspended during interrupts. Most system clocks interrupt the processor every 10 milliseconds, creating a background of interrupt activity. A dramatic increase in this counter indicates potential hardware problems. This analysis checks for “% Interrupt Time” greater than 30 percent. If this occurs, then consider updating device drivers for hardware that correlates to this alert.

Reference: Measure performance using EventCounters in .NET Core
Processor: High Context Switching A context switch happens when a higher priority thread preempts a lower priority thread that is currently running or when a high priority thread blocks. High levels of context switching can occur when many threads share the same priority level. This often indicates that too many threads are competing for the processors on the system. As a general rule, context switching rates of less than 5,000 per second per processor are not worth worrying about. If context switching rates exceed 15,000 per second per processor, then there is a constraint. For more information, refer to High Context Switching Analysis in this topic.
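The per-processor rule of thumb above can be expressed as a small Python sketch. The input is assumed to be the system-wide context switch rate divided evenly across processors. The label for the range between 5,000 and 15,000 is an assumption, because the text only characterizes the two ends; the function name and labels are hypothetical.

```python
LOW_PER_CPU = 5_000     # below this, not worth worrying about
HIGH_PER_CPU = 15_000   # above this, there is a constraint


def context_switch_level(context_switches_per_sec: float,
                         processor_count: int) -> str:
    """Classify the context switching rate per processor.

    Returns "ok" below 5,000 per second per processor, "constraint"
    above 15,000, and "watch" (an assumed label) in between.
    """
    if processor_count < 1:
        raise ValueError("processor_count must be at least 1")
    per_cpu = context_switches_per_sec / processor_count
    if per_cpu > HIGH_PER_CPU:
        return "constraint"
    if per_cpu < LOW_PER_CPU:
        return "ok"
    return "watch"
```

Dividing by the processor count matters: 40,000 context switches per second is a constraint on a 2-processor computer but falls in the middle range on a 4-processor computer.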
Microsoft BizTalk Server: BizTalk Dehydrating Orchestrations When many long-running business processes are running at the same time, memory and performance issues are possible. The orchestration engine addresses these issues by "dehydrating" and "rehydrating" orchestration instances. Dehydration is the process of serializing the state of an orchestration into a SQL Server database. Rehydration is the reverse of this process: deserializing the last running state of an orchestration from the database. Dehydration is used to minimize the use of system resources by reducing the number of orchestrations that have to be instantiated in memory at one time. Therefore, dehydrations reduce memory consumption, but are relatively expensive operations to perform. This analysis checks for dehydrations of 10 or more. If so, BizTalk Server may be running out of memory (either virtual or physical), a high number of orchestrations are waiting on messages, or the dehydration settings are not set properly.

Reference: Orchestration Dehydration and Rehydration
Microsoft BizTalk Server: BizTalk High Database Sessions This counter has two possible values: normal (0) or exceeded (1). This analysis checks for a value of 1. If so, BizTalk has exceeded the threshold of the number of database sessions permitted. This value is controlled by the “Database connection per CPU” value in the BizTalk Host Throttling settings. “Database connection per CPU” is the maximum number of concurrent database sessions (per CPU) allowed before throttling begins. You can monitor the number of active Database connections by using the Database session performance counter under the BizTalk:Message Agent performance object category. This parameter only affects outbound message throttling. For more information, refer to BizTalk High Database Sessions Analysis in this topic.
Microsoft BizTalk Server: BizTalk High Database Size This counter will be set to a value of 1 if either of the conditions listed for the message count in database threshold occurs. By default the host message count in database throttling threshold is set to a value of 50,000, which will trigger a throttling condition under the following circumstances:

- The total number of messages published by the host instance to the work, state, and suspended queues of the subscribing hosts exceeds 50,000.
- The number of messages in the spool table or the tracking table exceeds 500,000 messages.

If this occurs, then consider a course of action that will reduce the number of messages in the database. For example, ensure the SQL Server jobs in BizTalk Server are running without error and use the Group Hub page in the BizTalk Server Administration console to determine whether message build up is caused by large numbers of suspended messages. For more information, refer to BizTalk High Database Size Analysis in this topic.
Microsoft BizTalk Server: BizTalk High In-Process Message Count This analysis checks the High In-Process Message Count counter to determine whether this kind of throttling is occurring. If so, consider adjusting the “In-Process messages per CPU” setting. This parameter only affects outbound message throttling. Enter a value of 0 in the “In-Process messages per CPU” setting to disable throttling based on the number of in-process messages per CPU. The default value for the “In-Process messages per CPU” setting is 1,000. Note that modifying this value can also have an impact on low latency of messages and/or the efficiency of BizTalk resources. For more information, refer to BizTalk High In-Process Message Count Analysis in this topic.
Microsoft BizTalk Server: BizTalk High Message Delivery Rate This analysis checks for a value of 1 in the High Message Delivery Rate counter. High message delivery rates can be caused by high processing complexity, slow outbound adapters, or a momentary shortage of system resources. For more information, refer to BizTalk High Message Delivery Rate Analysis in this topic.
Microsoft BizTalk Server: BizTalk High Message Publishing Rate Inbound host throttling, also known as message publishing throttling in BizTalk Server, is applied to host instances that contain receive adapters or orchestrations that publish messages to the MessageBox database. This analysis checks for a value of 1 in the High Message Publishing Rate counter. If this occurs, then the database cannot keep up with the publishing rate of messages to the BizTalk MessageBox database.

References:

- Host Throttling Performance Counters
- How BizTalk Server Implements Host Throttling
- Modify Rate Based Throttling Settings
- What is Host Throttling?
Microsoft BizTalk Server: BizTalk High Process Memory The BizTalk Process Memory usage throttling threshold setting is the percentage of memory used compared to the sum of the working set size and total available virtual memory for the process if a value from 1 through 100 is entered. When a percentage value is specified the process memory threshold is recalculated at regular intervals. This analysis checks for a value of 1 in the High Process Memory counter. For more information, refer to BizTalk High Process Memory Analysis in this topic.
Microsoft BizTalk Server: BizTalk High System Memory The BizTalk Physical Memory usage throttling threshold setting is the percentage of memory consumption compared to the total amount of available physical memory if a value from 1 through 100 is entered. This setting can also be the total amount of available physical memory in megabytes if a value greater than 100 is entered. Enter a value of 0 to disable throttling based on physical memory usage. The default value is 0. For more information, refer to BizTalk High System Memory Analysis in this topic.
Microsoft BizTalk Server: BizTalk High Thread Count “Threads Per CPU” is the total number of threads in the host process including threads used by adapters. If this threshold is exceeded, BizTalk Server will try to reduce the size of the EPM thread pool and message agent thread pool. Thread based throttling should be enabled in scenarios where high load can lead to the creation of a large number of threads. This parameter affects both inbound and outbound throttling. Thread based throttling is disabled by default. For more information, refer to BizTalk High Thread Count Analysis in this topic.
Microsoft BizTalk Server: BizTalk Host Queue Length The BizTalk Host Queue Length tracks the total number of messages in the particular host queue. You can use length size, for example, BizTalk:MessageBox:HostCounters:Host Queue – Length, to give a more detailed view of the number of messages being queued up internally by showing the queue depth for an individual host. This counter can be useful in determining if a specific host is bottlenecked. Assuming unique hosts are used for each transport, this can be helpful in determining potential transport bottlenecks. This analysis checks for average queue lengths greater than 1.

The Host Queue Length is a weighted Queue length by aggregating the record count of all the Queues (Work Q, State Q, Suspended Q) of the target host.

Reference: BizTalk Server 2010: BizTalk Server Performance Testing Methodology
Microsoft BizTalk Server: BizTalk Host Suspended Messages Queue Length This counter tracks the total number of suspended messages for the particular host. A suspended message is an instance of a message or orchestration that BizTalk Server has stopped processing due to an error in the system or the message. Generally, suspended instances caused by system errors are resumable upon resolution of the system issue. Often, suspended instances due to a message problem are not resumable, and the message itself must be fixed and resubmitted to the BizTalk Server system.

The suspended message queue is a queue that contains work items for which an error or failure was encountered during processing. A suspended queue stores the messages until they can be corrected and reprocessed, or deleted. This analysis checks for any occurrence of suspended messages. An increasing trend could indicate severe processing errors.

References:

- Monitoring BizTalk Server Health and Performance

- Troubleshooting Microsoft BizTalk Server
BizTalk Server: BizTalk Idle Orchestrations Number of idle orchestration instances currently hosted by the host instance. This counter refers to orchestrations that are not making progress but are not dehydratable. This situation can occur when the orchestration is blocked, waiting for a receive, listen, or delay in an atomic transaction. If a large number of non-dehydratable orchestrations accumulate, then BizTalk may run out of memory.

Dehydration is the process of serializing the state of an orchestration into a SQL Server database. Rehydration is the reverse of this process: deserializing the last running state of an orchestration from the database. Dehydration is used to minimize the use of system resources by reducing the number of orchestrations that have to be instantiated in memory at one time. The engine dehydrates the instance by saving the state, and frees up the memory required by the instance. By dehydrating dormant orchestration instances, the engine makes it possible for a large number of long-running business processes to run concurrently on the same computer. This analysis checks for an increasing trend of one idle orchestration per hour.

Reference: Orchestration Dehydration and Rehydration.
BizTalk Server: BizTalk Inbound Latency Average latency in milliseconds from when the messaging engine receives a document from the adapter until the time it is published to the MessageBox. Reducing latency is important to some users of BizTalk, therefore tracking how much time documents spend in the inbound adapter is important. For more information, refer to BizTalk Inbound Latency Analysis in this topic.
BizTalk Server: BizTalk Message Delivery Delay This is the current delay in milliseconds (ms) imposed on each message delivery batch (applicable if the message delivery is being throttled). In regards to throttling, a delay is applied in the publishing or processing of the message, depending on whether the message is inbound or outbound. The delay period is proportional to the severity of the throttling condition. Higher severity throttling conditions will initiate a longer throttling period than lower severity throttling conditions. This delay period is adjusted up and down within certain ranges by the throttling mechanism as conditions change. The current delay period is exposed through the message delivery delay (ms) and the message publishing delay (ms) performance counters associated with the BizTalk:Message Agent performance object category. This analysis checks for a message delivery delay of greater than 5 seconds. Long message delivery delays may indicate heavy throttling due to high load.

Reference: How BizTalk Server Implements Host Throttling.
BizTalk Server: BizTalk Message Delivery Throttling State The BizTalk message delivery throttling state is one of the primary indicators of throttling. It is a flag indicating whether the system is throttling message delivery (affecting XLANG message processing and outbound transports). The throttling condition is indicated by the numeric value of the counter. For more information, see BizTalk Message Delivery Throttling State Analysis in this topic.
BizTalk Server: BizTalk Message Publishing Delay The delay injected in each qualifying batch for throttling the publishing of messages. In regards to throttling, a delay is applied in the publishing or processing of the message, depending on whether the message is inbound or outbound. The delay period is proportional to the severity of the throttling condition. Higher severity throttling conditions will initiate a longer throttling period than lower severity throttling conditions. This delay period is adjusted up and down within certain ranges by the throttling mechanism as conditions change. The current delay period is exposed through the message delivery delay (ms) and the message publishing delay (ms) performance counters associated with the BizTalk:Message Agent performance object category. This analysis checks for a message publishing delay of greater than 5 seconds. Long message publishing delays may indicate heavy throttling due to high load.

Reference: How BizTalk Server Implements Host Throttling.
BizTalk Server: BizTalk MessageBox Database Connection Failures This performance counter is the number of attempted database connections that failed since the host instance started. If the SQL Server service hosting the BizTalk databases becomes unavailable for any reason, the database cluster transfers resources from the active computer to the passive computer. During this failover process, the BizTalk Server service instances experience database connection failures and automatically restart to reconnect to the databases. The functioning database computer (previously the passive computer) begins processing the database connections after assuming the resources during failover. For more information, refer to BizTalk MessageBox Database Connection Failures Analysis in this topic.
BizTalk Server: BizTalk Messaging Latency: Request Response Latency Average latency in milliseconds from when the Messaging Engine receives a request document from the adapter until the time a response document is given back to the adapter. Refer to the chart showing how latency is measured in BizTalk Inbound Latency Analysis in this topic. Assuming a low latency environment, this analysis checks for a Request-Response Latency greater than 5 seconds. This may indicate a processing delay between the inbound adapter and the outbound adapter.

References:

- Request/Response Messaging
- Scaling Your Solutions
BizTalk Server: BizTalk Messaging Publishing Throttling State The BizTalk message publishing throttling state is one of the primary indicators of throttling. It is a flag indicating whether the system is throttling message publishing (affecting XLANG message processing and inbound transports). For more information, refer to BizTalk Messaging Publishing Throttling State Analysis in this topic.
BizTalk Server: BizTalk Orchestration Suspended/second A suspended message is an instance of a message or orchestration that BizTalk Server has stopped processing due to an error in the system or the message. Generally, suspended instances caused by system errors are resumable upon resolution of the system issue. Often, suspended instances due to a message problem are not resumable, and the message itself must be fixed and resubmitted to the BizTalk Server system. This analysis checks for any suspended messages/orchestrations.

References:

- Monitoring BizTalk Server Health and Performance

- Troubleshooting Microsoft BizTalk Server
BizTalk Server: BizTalk Orchestrations Completed/second This is the number of BizTalk orchestrations that have completed per second. This is a good indicator as to how much throughput BizTalk is processing. This analysis provides statistics only.

Reference: Scaling Your Solutions
BizTalk Server: BizTalk Orchestrations Discarded Number of orchestration instances discarded from memory since the host instance started. An orchestration can be discarded if the engine fails to persist its state. This analysis checks for any discarded orchestrations.

Reference:
BizTalk Core Engine's WebLog
BizTalk Server: BizTalk Orchestrations Resident in Memory Number of orchestration instances currently hosted by the host instance. This analysis checks for an increasing trend in orchestrations resident in memory and whether more than 50 percent of the orchestrations resident in memory are not dehydratable. For more information, refer to BizTalk Orchestrations Resident in Memory Analysis.
BizTalk Server: BizTalk Outbound Adapter Latency This is the average latency in seconds from when the adapter gets a document from the messaging engine until the time it is sent by the adapter. Refer to the chart showing how latency is measured in BizTalk Inbound Latency Analysis in this topic. Assuming a low latency environment, this analysis checks for latency in the outbound adapter of greater than 5 seconds on average. This may indicate a processing delay in the transport of messages through outbound adapters in this host instance. If multiple outbound adapters exist in this host instance, then consider separating them into their own hosts to determine which outbound adapter has high latency.

References:

- Request/Response Messaging
- BizTalk Server 2006: Scalability Case Study Using the SOAP Adapter in BizTalk Server 2006
- Identifying Bottlenecks in the BizTalk Tier
- Low-Latency Scenario Optimizations for BizTalk Server
BizTalk Server: BizTalk Pending Messages The number of received messages that have not been acknowledged as received to the MessageBox. Pending messages are messages that have been pulled into memory and delivered to the XLANG orchestration, but have not yet been processed. This circumstance has nothing to do with data loss. Delivering a message to an orchestration is a multi-step process and is simply an instance of the message residing in the spool table in the database. These pending messages count as in-process messages; therefore, having a large number of them in memory could cause memory throttling on the system. Adjusting the Internal Message Queue Size setting could help control the number of pending messages. The In-Process Messages Per CPU setting affects when throttling is invoked in response to a high number of pending messages. These settings are found in the Host properties in the BizTalk Administration Console. This analysis only shows statistics for this counter.

Reference: Orchestration Engine Performance Counters.
BizTalk Server: BizTalk Persistence Points/second Average number of orchestration instances persisted per second. The orchestration engine saves the state of a running orchestration instance at various points. If it needs to rehydrate the orchestration instance, start up from a controlled shutdown, or recover from an unexpected shutdown, it will run the orchestration instance from the last persistence point. In order to persist an orchestration instance, all object instances that your orchestration refers to directly or indirectly (as through other objects) must be serializable. As message-persistence frequency (the number of times that data needs to be persisted) increases, overall performance decreases. In effect, each persistence point is a round trip to the database, so whenever possible reduce the frequency of persistence points by avoiding or consolidating persistence points. See the references below for more information regarding when persistence points occur. This analysis checks for more than 10 persistence points per second on average. This is a general starting point.

References:

- Persistence in Orchestrations
- Persistence and the Orchestration Engine
BizTalk Server: BizTalk Private Bytes This is the private memory, in megabytes, allocated for the host instance and is comparable to the “\Process(*)\Private Bytes” performance counter. This analysis determines whether any of the host instances are consuming a large amount of the system's memory and whether the host instance is increasing in memory consumption over time. Refer to BizTalk Private Bytes Analysis in this topic for more information.
BizTalk Server: BizTalk Spool Table Size The MessageBox spool table contains a record for each message in the system (active or waiting to be "garbage collected"). Monitoring the number of rows in this table and the number of messages received per second while increasing system load provides an easy way to find the maximum sustainable throughput. Simply increase the input load until either 1) the spool table starts to grow indefinitely or 2) the number of messages received per second plateaus, whichever comes first; that load is your maximum sustainable throughput. In summary, regardless of other indicators, this measure gives you a quick and easy way to assess whether your system is being overdriven. When the BizTalk spool table size is on an increasing trend, throttling due to an imbalanced message delivery rate (input rate exceeds output rate) or throttling due to database size may occur. This analysis checks for an increasing trend in the BizTalk Spool Table Size.

References:

- Understanding BizTalk Server 2004 SP1 Throughput and Capacity
- Sustainable Load Test
- Recommendations When Testing Engine Performance
BizTalk Server: BizTalk Tracking Data Size As BizTalk Server processes more and more data on your system, the BizTalk Tracking database (BizTalkDTADb) continues to grow in size. Unchecked growth decreases system performance and may generate errors in the Tracking Data Delivery Service (TDDS). In addition to general tracking data, tracked messages can also accumulate in the MessageBox database, causing poor disk performance. This analysis checks for an increasing trend of more than 5 MB per hour in the tracking data size.

Reference:

Archiving and Purging the BizTalk Tracking Database
BizTalk Server: BizTalk Transactional Scopes Aborted This is the number of long-running or atomic scopes that have been aborted since the host instance started. A transactional scope abort is a failure that occurs in a transaction scope within an orchestration. It is important to understand that the compensation handler of a scope is invoked only if the scope completed successfully, but then is required to be undone because a surrounding scope has decided to abort (due to failures that may occur later in the process). Also, no "auto" rollback of state occurs in case of a transaction abort. You can achieve this outcome programmatically through the exception and compensation handlers. Transactional scope aborts should not normally occur in a production environment; therefore, this analysis checks for the occurrence of any transactional scopes aborted.

Reference:

Transactions
BizTalk Server: BizTalk Transactional Scopes Compensated Compensation can be thought of as a logical undo of the work that has been successfully committed in response to some error condition. It is important to understand that the compensation handler of a scope is invoked only if the scope completed successfully, but then is required to be undone because a surrounding scope has decided to abort (due to failures that may occur later in the process). Also, no "auto" rollback of state occurs in case of a transaction abort. You can achieve this programmatically through the exception and compensation handlers. Transactional scope compensations should not normally occur in a production environment; therefore, this analysis checks for the occurrence of any transactional scopes compensated.

Reference: Transactions
BizTalk Server: BizTalk Virtual Bytes This is the megabytes reserved for virtual memory for the host instance. This analysis determines whether any of the host instances are consuming a large amount of the system's memory and whether the host instance is increasing in memory consumption over time. For more information, refer to BizTalk Virtual Bytes Analysis in this topic.
BizTalk Server: BizTalk Message Agent Database Session Throttling This is the number of open database connections to the MessageBox compared to its respective BizTalk throttling setting. “Database connection per CPU” is the maximum number of concurrent database sessions (per CPU) allowed before throttling begins. For more information, refer to BizTalk Message Agent Database Session Throttling Analysis in this topic.
BizTalk Server: BizTalk Message Agent Database Session Throttling Threshold This is the current threshold for the number of open database connections to the MessageBox. “Database connection per CPU” is the maximum number of concurrent database sessions (per CPU) allowed before throttling begins. For more information, refer to BizTalk Message Agent Database Session Throttling Threshold Analysis in this topic.
BizTalk Server: BizTalk Message Agent In-process Message Count Throttling This is the number of concurrent messages that the service class is processing. The “In-process messages per CPU” setting in the Host Throttling Settings is the maximum number of messages delivered to the End Point Manager (EPM) or XLANG that have not been processed. For more information, refer to BizTalk Message Agent In-process Message Count Throttling Analysis in this topic.
BizTalk Server: BizTalk Message Agent In-process Message Count Throttling Threshold This is the current threshold for the number of concurrent messages that the service class is processing. The “In-process messages per CPU” setting in the Host Throttling Settings is the maximum number of messages delivered to the End Point Manager (EPM) or XLANG that have not been processed. For more information, refer to BizTalk Message Agent In-process Message Count Throttling Threshold Analysis in this topic.
BizTalk Server: BizTalk Message Agent Process Memory Usage (MB) Throttling This is the memory usage of current process (MB). BizTalk process memory throttling can occur if the batch to be published has steep memory requirements, or if too many threads are processing messages. For more information, refer to BizTalk Message Agent Process Memory Usage (MB) Throttling Analysis in this topic.
BizTalk Server: BizTalk Message Agent Process Memory Usage (MB) Throttling Threshold This is the current threshold for the memory usage of the current process (MB). The threshold may be dynamically adjusted depending on the actual amount of memory available to this process and its memory consumption pattern. BizTalk process memory throttling can occur if the batch to be published has steep memory requirements, or if too many threads are processing messages. For more information, refer to BizTalk Message Agent Process Memory Usage (MB) Throttling Threshold Analysis in this topic.
BizTalk Server: BizTalk Message Agent Thread Count Throttling The total number of threads in the BizTalk process. “Threads Per CPU” is the total number of threads in the host process including threads used by adapters. If this threshold is exceeded, BizTalk Server will try to reduce the size of EPM thread pool and message agent thread pool. Thread based throttling should be enabled in scenarios where high load can lead to the creation of a large number of threads. This parameter affects both inbound and outbound throttling. Thread based throttling is disabled by default. This analysis checks whether the BizTalk Thread count is greater than 80 percent of the throttling threshold value indicating a throttling condition is likely.

References:

- Host Throttling Performance Counters
- How BizTalk Server Implements Host Throttling
- How to Modify the Default Host Throttling Settings
- Configuration Parameters that Affect Adapter Performance
- Threads, DB sessions, and throttling
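
The 80 percent check described for the thread count counter can be sketched as follows. This is a minimal illustration, not PAL's actual implementation; the function name is hypothetical, and treating a threshold of 0 as "thread-based throttling disabled" is an assumption of this sketch.

```python
def thread_throttling_likely(thread_count, threshold, warn_fraction=0.80):
    """Return True when thread_count is above warn_fraction of the threshold.

    Assumption of this sketch: a threshold of 0 (or less) means thread-based
    throttling is disabled, so no alert is raised.
    """
    if threshold <= 0:
        return False
    return thread_count > warn_fraction * threshold
```

For example, with a hypothetical threshold of 100 threads, 85 threads would raise the alert and 79 would not.
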
BizTalk Server: BizTalk Message Agent Thread Count Throttling Threshold This is the current threshold for the total number of threads in the process. “Threads Per CPU” is the total number of threads in the host process including threads used by adapters. If this threshold is exceeded, BizTalk Server will try to reduce the size of EPM thread pool and message agent thread pool. Thread-based throttling should be enabled in scenarios where high load can lead to the creation of a large number of threads. This parameter affects both inbound and outbound throttling.

This analysis checks whether this throttling setting is set to a non-default value. Thread based throttling is disabled by default.

References:

- Host Throttling Performance Counters
- How BizTalk Server Implements Host Throttling
- How to Modify the Default Host Throttling Settings
- Configuration Parameters that Affect Adapter Performance
- Threads, DB sessions, and throttling

Logical/Physical Disk Read/Write Latency Analysis

The most reliable way for Windows to detect a disk performance bottleneck is by measuring its response times. If the response times are greater than 0.025 seconds (25 milliseconds), which is a conservative threshold, then noticeable slow-downs and performance issues affecting users may be occurring.

Common causes of poor disk latency are disk fragmentation, performance cache, an oversaturated SAN, and too much load on the disk. Use the SPA tool to help identify the top files and processes using the disk. Also check the “Process IO Data Operations/sec” and “Process IO Other Operations/sec” counters to see which processes are consuming the most disk I/O. Keep in mind that performance monitor counters are unable to specify which files are involved.

Logical Disk Transfers/sec

“Disk Transfers/sec” is the rate of read and write operations on the disk. While disk transfers do not correlate directly to disk I/O, they do tell us how many disk operations are occurring. If you average out sequential and random I/O, you end up with about 80 I/Os per second as a general rule of thumb. Therefore, we should expect a SAN drive to perform more than 80 I/Os per second when under load. The thresholds for this analysis check whether any of the logical disks are showing poor response times (greater than 25 ms for I/O operations). If so, we should expect the disk transfers per second to be at or above 80. If not, the disk architecture needs to be investigated. The most common cause of poor disk I/O is logical unit number (LUN) overloading on the SAN, meaning the condition where more than one LUN is using the same physical disk array.
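
The combined latency and transfer-rate reasoning above can be expressed as a small decision function. This is a sketch under stated assumptions: the function name and the result labels are hypothetical, and only the 25 ms and 80 transfers-per-second figures come from the text.

```python
LATENCY_THRESHOLD_SEC = 0.025    # 25 ms response-time threshold from the text
EXPECTED_TRANSFERS_PER_SEC = 80  # rule-of-thumb throughput from the text


def classify_disk(avg_sec_per_transfer, transfers_per_sec):
    """Classify one logical disk from its average latency and transfer rate."""
    if avg_sec_per_transfer <= LATENCY_THRESHOLD_SEC:
        return "ok"
    if transfers_per_sec >= EXPECTED_TRANSFERS_PER_SEC:
        # Slow responses while the disk is doing the expected amount of work.
        return "slow under load"
    # Slow responses although the disk is doing little work.
    return "slow without load: investigate disk architecture"
```

A disk averaging 40 ms per transfer while handling only 30 transfers per second would fall into the last category, which is the case the analysis flags for investigation.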

Available Memory Analysis

“Available Mbytes” is the amount of physical memory available to processes running on the computer, in megabytes. The Virtual Memory Manager continually adjusts the space used in physical memory and on disk to maintain a minimum number of available bytes for the operating system and processes. When available bytes are plentiful, the Virtual Memory Manager lets the working sets of processes grow, or keeps them stable by removing an old page for each new page added. When available bytes are few, the Virtual Memory Manager must trim the working sets of processes to maintain the minimum required.

This analysis checks whether the total available memory is low: Warning at 10 percent available and Critical at 5 percent available. A warning is also raised when a decreasing trend of 10 MB per hour is detected, indicating a potential upcoming low-memory condition. Low physical memory can cause increased privileged mode CPU and system delays.
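
The percentage thresholds above could be evaluated as in the following sketch. It assumes the total physical memory of the computer is known separately (the “Available Mbytes” counter does not report it), and it treats "at 10 percent" as "at or below 10 percent"; the function name is hypothetical.

```python
def available_memory_alert(available_mb, total_physical_mb):
    """Return "critical", "warning", or "ok" from the available-memory percentage."""
    percent_available = 100.0 * available_mb / total_physical_mb
    if percent_available <= 5:
        return "critical"
    if percent_available <= 10:
        return "warning"
    return "ok"
```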

Memory Leak Detection Analysis

This analysis determines whether any of the processes are consuming a large amount of the system's memory and whether the process is increasing in memory consumption over time. A process consuming large portions of memory is acceptable as long as the process returns the memory to the system. Look for increasing trends in the chart. An increasing trend over a long period of time could indicate a memory leak. Private Bytes is the current size, in bytes, of memory that this process has allocated that cannot be shared with other processes. This analysis checks for increasing trends of 5 MB per hour and 10 MB per hour. Use this analysis in correlation with the Available Memory analysis.

Also, keep in mind that newly started processes will initially appear as a memory leak when it is simply normal startup behavior. A memory leak occurs when a process continues to consume memory and does not release memory over a long period of time.
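
One way to quantify an "increasing trend" in MB per hour is a least-squares slope over the sampled values. The sketch below uses that technique as an assumption; how PAL itself computes trends is not described here, and the function names are hypothetical.

```python
def trend_mb_per_hour(hours, megabytes):
    """Least-squares slope of the samples, in MB per hour."""
    n = len(hours)
    mean_x = sum(hours) / n
    mean_y = sum(megabytes) / n
    sxx = sum((x - mean_x) ** 2 for x in hours)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, megabytes))
    return sxy / sxx


def crossed_trend_threshold(hours, megabytes):
    """Return the highest trend threshold reached (10 or 5 MB/hour), else None."""
    slope = trend_mb_per_hour(hours, megabytes)
    for threshold in (10, 5):
        if slope >= threshold:
            return threshold
    return None
```

Because a newly started process also shows a rising slope, a result from a short sample window should be read with the startup caveat above in mind.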

If you suspect a memory leak condition, then install and use the Debug Diag tool. For more information on the Debug Diag Tool, see the references section.

Reference

Debug Diagnostic Tool

Memory Pages/sec Analysis

This analysis checks to see whether the “Pages/sec” is high. If it is high, then the system is likely running out of memory by trying to page the memory to the disk. “Pages/sec” is the rate at which pages are read from or written to disk to resolve hard page faults. This counter is a primary indicator of the kinds of faults that cause system-wide delays. It is the sum of “Memory\Pages Input/sec” and “Memory\Pages Output/sec”. It is counted in numbers of pages, so it can be compared to other counts of pages, such as “Memory\Page Faults/sec”.

This counter should always be below 1,000. This analysis checks for values above 1,000. Use this analysis in correlation with Available Memory Analysis and Memory Leak Detection analysis. If all analyses are throwing alerts at the same time, then this may indicate the system is running out of memory. Follow analysis steps mentioned in Additional Information Regarding Memory Leak Detection analysis in this topic.
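
As a small illustration of the relationship described above, the sketch below derives “Pages/sec” from its two component counters and applies the 1,000 threshold. The function names are hypothetical.

```python
PAGES_PER_SEC_THRESHOLD = 1000  # threshold from the text


def pages_per_sec(pages_input_per_sec, pages_output_per_sec):
    """Pages/sec is the sum of Pages Input/sec and Pages Output/sec."""
    return pages_input_per_sec + pages_output_per_sec


def paging_alert(pages_input_per_sec, pages_output_per_sec):
    """Return True when the combined paging rate is above the threshold."""
    total = pages_per_sec(pages_input_per_sec, pages_output_per_sec)
    return total > PAGES_PER_SEC_THRESHOLD
```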

Reference

Ruling Out Memory-Bound Problems

Memory System Cache Resident Bytes Analysis

“System Cache Resident Bytes” is the size, in bytes, of the pageable operating system code in the file system cache. This value includes only current physical pages and does not include any virtual memory pages not currently resident. It does equal the System Cache value shown in Task Manager. As a result, this value may be smaller than the actual amount of virtual memory in use by the file system cache. This value is a component of “Memory\System Code Resident Bytes”, which represents all pageable operating system code that is currently in physical memory. This counter displays the last observed value only; it is not an average.

This analysis checks for an increasing trend of 10 MB per hour. Under load, a server might use the system cache to cache I/O activity such as disk I/O. Use in correlation with the Process IO Data Operations/sec and Process IO Other Operations/sec analyses.

Processor Utilization Analysis and Excessive Processor Use by Processes

“% Processor Time” is the percentage of elapsed time that the processor spends to execute a non-idle thread. It is calculated by measuring the duration the idle thread is active in the sample interval, and then subtracting that time from the interval duration. (Each processor has an idle thread that consumes cycles when no other threads are ready to run.) This counter is the primary indicator of processor activity and displays the average percentage of busy time observed during the sample interval. It is calculated by monitoring the time that the service is inactive, and subtracting that value from 100 percent.

This analysis checks for utilization greater than 60 percent on each individual processor. If so, determine whether it is high user mode CPU or high privileged mode. If high privileged mode CPU is suspected, then see the “Privileged Mode CPU Analysis”. If a user-mode processor bottleneck is suspected, then consider using a process profiler to analyze the functions causing the high CPU consumption. See “How To: Identify Functions causing a High User-mode CPU Bottleneck for Server Applications in a Production Environment” article in the references section for more information.

Processor Queue Length Analysis

“Processor Queue Length” is the number of threads in the processor queue. Unlike the disk counters, this counter shows ready threads only, not threads that are running. There is a single queue for processor time even on computers with multiple processors. Therefore, if a computer has multiple processors, you need to divide this value by the number of processors servicing the workload. A sustained processor queue of less than 10 threads per processor is normally acceptable, depending on the workload.

This analysis determines whether the average processor queue length exceeds the number of processors. If so, then this could indicate a processor bottleneck. Use this analysis in correlation with Privileged Mode CPU Analysis and Excessive Processor Use by Process. The processor queue is the collection of threads that are ready but not able to be executed by the processor because another active thread is currently executing. A sustained or recurring queue of more threads than number of processors is a good indication of a processor bottleneck.

You can use this counter in conjunction with the “Processor\% Processor Time” counter to determine whether your application can benefit from more CPUs. There is a single queue for processor time, even on multiprocessor computers. Therefore, in a multiprocessor computer, divide the “Processor Queue Length” (PQL) value by the number of processors servicing the workload.

If the CPU is very busy (90 percent and higher utilization) and the PQL average is consistently higher than the number of processors, then you may have a processor bottleneck that could benefit from additional CPUs. Alternatively, you could reduce the number of threads and queue more work at the application level. This causes less context switching, which helps reduce CPU load. The common reason for a high PQL with low CPU utilization is that requests for processor time arrive randomly, and threads demand irregular amounts of time from the processor. This means that the processor is not a bottleneck. Instead, it is your threading logic that needs to be improved.
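
The reasoning in the preceding paragraphs can be summarized in a short sketch. The function name and result labels are hypothetical, and treating any utilization below 90 percent as "low CPU" is a simplifying assumption of this example.

```python
def classify_processor_queue(cpu_percent, queue_length, processor_count,
                             busy_cpu_percent=90):
    """Interpret Processor Queue Length together with % Processor Time."""
    # There is a single queue, so normalize it by the number of processors.
    queue_per_processor = queue_length / processor_count
    if queue_per_processor <= 1:
        return "ok"
    if cpu_percent >= busy_cpu_percent:
        return "possible processor bottleneck"
    return "high queue with low CPU: review threading logic"
```

For instance, an average queue length of 12 on a four-processor computer is 3 threads per processor; at 95 percent CPU that suggests a processor bottleneck, while at 40 percent CPU it points to the threading logic instead.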

If a user-mode processor bottleneck is suspected, then consider using a process profiler to analyze the functions causing the high CPU consumption. See the “How To: Identify Functions causing a High User-mode CPU Bottleneck for Server Applications in a Production Environment” article in the references section for more information.

Privileged Mode CPU Analysis

This counter indicates the percentage of time a thread runs in privileged mode. When your application calls operating system functions (for example to perform file or network I/O or to allocate memory), these operating system functions are executed in privileged mode.

High privileged mode CPU indicates that the computer is spending too much time in system I/O versus real (user mode) work. “% Privileged Time” is the percentage of elapsed time that the process threads spent executing code in privileged mode. When a Windows system service is called, the service will often run in privileged mode to gain access to system-private data. Such data is protected from access by threads executing in user mode. Calls to the system can be explicit or implicit, such as page faults or interrupts. Unlike some early operating systems, Windows uses process boundaries for subsystem protection in addition to the traditional protection of user and privileged modes. Some work done by Windows on behalf of the application might appear in other subsystem processes in addition to the privileged time in the process.

This analysis checks to see whether privileged mode CPU is consuming more than 30 percent of total CPU. If so, then the CPU consumption is likely caused by another bottleneck other than the processor such as network, memory, or disk I/O. Use in correlation with % Interrupt Time and High Context Switching analyses.

High Context Switching Analysis

A context switch happens when a higher priority thread preempts a lower priority thread that is currently running or when a high priority thread blocks. High levels of context switching can occur when many threads share the same priority level. This often indicates that too many threads are competing for the processors on the system. If you do not see much processor utilization and you see very low levels of context switching, it could indicate that threads are blocked.

High context switching should only be investigated when privileged mode CPU and overall CPU is high. As a general rule, context switching rates of less than 5,000 per second per processor are not worth worrying about. If context switching rates exceed 15,000 per second per processor, then there is a constraint.

This analysis checks for high CPU, high privileged mode CPU, and high (greater than 5,000 per processor) system context switches per second all occurring at the same time. If high context switching is occurring, then reduce the number of threads and processes running on the system.
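
The three simultaneous conditions can be sketched as follows. Only the 5,000 switches per second per processor figure is stated for this analysis; the default CPU (60 percent) and privileged mode (30 percent) thresholds are borrowed from the other analyses in this topic and are assumptions of this sketch, as is the function name.

```python
def context_switch_alert(cpu_percent, privileged_percent,
                         context_switches_per_sec, processor_count,
                         cpu_threshold=60, privileged_threshold=30,
                         switches_per_processor_threshold=5000):
    """Return True only when all three conditions hold at the same time."""
    switches_per_processor = context_switches_per_sec / processor_count
    return (cpu_percent > cpu_threshold
            and privileged_percent > privileged_threshold
            and switches_per_processor > switches_per_processor_threshold)
```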

BizTalk High Database Sessions Analysis

This counter has two possible values: normal (0) or exceeded (1). This analysis checks for a value of 1. If so, BizTalk has exceeded the threshold of the number of database sessions permitted. This value is controlled by the “Database connection per CPU” value in the BizTalk Host Throttling settings.

“Database connection per CPU” is the maximum number of concurrent database sessions (per CPU) allowed before throttling begins. The idle database sessions in the common per-host session pool do not add to this count, and this check is made strictly on the number of sessions actually being used by the host instance. This option is disabled by default; typically, this setting should only be enabled if the database server is a bottleneck or for low-end database servers in the BizTalk Server system. You can monitor the number of active database connections by using the database session performance counter under the BizTalk:Message Agent performance object category. This parameter only affects outbound message throttling. Enter a value of 0 to disable throttling that is based on the number of database sessions. The default value is 0.

Note

The “MaxWorkerThreads” registry key influences the number of threads available to BizTalk and may help if most of the BizTalk threads are busy with database connections.

BizTalk High Database Size Analysis

This counter refers to the number of messages in the database queues that this process has published. This value is measured by the number of items in the queue tables for all hosts and the number of items in the spool and tracking tables. Queue includes the work queue, the state queue and the suspended queue. If a process is publishing to multiple queues, this counter reflects the weighted average of all the queues.

If the host is restarted, statistics held in memory are lost. Since some overhead is involved, BizTalk Server will resume gathering statistics only when there are at least 100 publishes, with 5 percent of the total publishes within the restarted host process.

This counter will be set to a value of 1 if either of the conditions listed for the message count in database threshold occurs. This threshold is documented in the topic How to Modify the Default Host Throttling Settings referenced below. By default the host message count in database throttling threshold is set to a value of 50,000, which will trigger a throttling condition under the following circumstances:

  • The total number of messages published by the host instance to the work, state, and suspended queues of the subscribing hosts exceeds 50,000.

  • The number of messages in the spool table or the tracking table exceeds 500,000 messages.

    Since suspended messages are included in the message count in database calculation, throttling of message publishing can occur even if the BizTalk server is experiencing low or no load.

    This analysis checks for a value of 1. If this occurs, then consider a course of action that will reduce the number of messages in the database. For example, ensure the BizTalk SQL Server jobs are running without error and use the Group Hub in the BizTalk Administration console to determine whether message build up is caused by large numbers of suspended messages.
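
The two default conditions listed above can be sketched as a function that mirrors the counter's 0/1 value. This is an illustration only; the function and parameter names are hypothetical, and the limits are the default values quoted in the text.

```python
def database_size_throttling_state(queue_message_count, spool_row_count,
                                   tracking_row_count,
                                   queue_limit=50_000,
                                   spool_tracking_limit=500_000):
    """Return 1 when either default throttling condition is met, else 0."""
    queues_exceeded = queue_message_count > queue_limit
    tables_exceeded = (spool_row_count > spool_tracking_limit
                       or tracking_row_count > spool_tracking_limit)
    return 1 if (queues_exceeded or tables_exceeded) else 0
```

Note that `queue_message_count` includes suspended messages, which is why the state can be 1 even when the server is under little or no load.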

References

BizTalk High In-Process Message Count Analysis

The “In-process messages per CPU” setting in the Host Throttling Settings is the maximum number of messages delivered to the End Point Manager (EPM) or XLANG that have not been processed. This number does not include the messages retrieved from database but still waiting for delivery in the in-memory queue. You can monitor the number of in-Process Messages by using the In-process message count performance counter under the BizTalk:Message Agent performance object category. This parameter provides a hint to the throttling mechanism when considering throttling conditions. The actual threshold is subject to self-tuning. You can verify the actual threshold by monitoring the in-process message count performance counter.

This parameter can be set to a smaller value for large message scenarios, where either the average message size is high, or the processing of messages may require a large number of messages. This would be evident if a scenario experiences memory-based throttling too often and if the memory threshold gets auto-adjusted to a substantially low value. Such behavior would indicate that the outbound transport should process fewer messages concurrently to avoid excessive memory usage. Also, for scenarios where the adapter is more efficient when processing a few messages at a time (for example, when sending to a server that limits concurrent connections), this parameter may be tuned to a lower value than the default.

This analysis checks the High In-Process Message Count counter to determine whether this kind of throttling is occurring. If so, consider adjusting the “In-Process messages per CPU” setting. This parameter only affects outbound message throttling. Enter a value of 0 in the “In-Process messages per CPU” setting to disable throttling based on the number of in-process messages per CPU. The default value for the “In-Process messages per CPU” setting is 1,000. Note that modifying this value can also have an impact on low latency of messages and/or the efficiency of BizTalk resources.

BizTalk High Message Delivery Rate Analysis

For outbound (delivered) messages, BizTalk Server throttles delivery of messages if the message delivery incoming rate for the host instance exceeds the message delivery outgoing rate * the specified rate overdrive factor (percent) value. The rate overdrive factor (percent) parameter is configurable on the Message Processing Throttling Settings dialog box. Rate-based throttling for outbound messages is accomplished primarily by inducing a delay before removing the messages from the in-memory queue and delivering the messages to the End Point Manager (EPM) or orchestration engine for processing. No other action is taken to accomplish rate-based throttling for outbound messages.

Outbound throttling can cause delayed message delivery and messages may build up in the in-memory queue and cause de-queue threads to be blocked until the throttling condition is mitigated. When de-queue threads are blocked, no additional messages are pulled from the MessageBox into the in-memory queue for outbound delivery.
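
The rate comparison described at the start of this analysis can be written out as follows. This is a sketch; the function name is hypothetical, and the overdrive factor is passed in explicitly rather than assumed.

```python
def delivery_rate_throttled(incoming_rate, outgoing_rate, overdrive_factor_percent):
    """Return True when the incoming delivery rate exceeds the outgoing rate
    multiplied by the rate overdrive factor (given as a percentage)."""
    allowed_incoming_rate = outgoing_rate * overdrive_factor_percent / 100.0
    return incoming_rate > allowed_incoming_rate
```

With a hypothetical overdrive factor of 125 percent and an outgoing rate of 100 messages per second, throttling would begin once the incoming rate rises above 125 messages per second.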

This analysis checks for a value of 1 in the High Message Delivery Rate counter. High message delivery rates can be caused by high processing complexity, slow outbound adapters, or a momentary shortage of system resources.

BizTalk High Process Memory Analysis

The BizTalk Process Memory usage throttling threshold setting is the percentage of memory used compared to the sum of the working set size and total available virtual memory for the process if a value from 1 through 100 is entered. By default, the BizTalk Process Memory Usage throttling setting is 25. When a percentage value is specified the process memory threshold is recalculated at regular intervals. If the user specifies a percentage value, it is computed based on the available memory to commit and the current Process Memory usage.

This analysis checks for a value of 1 in the High Process Memory counter. If this occurs, try to determine the cause of the memory increase by using Debug Diag (see references in Memory Leak Detection analysis). Note that it is normal for processes to consume a large portion of memory during startup, and this may initially appear as a memory leak. A true memory leak occurs when a process fails to release memory that it no longer needs, thereby reducing the amount of available memory over time. See the “How to Capture a Memory Dump of a Process that is Leaking Memory” reference below and/or the “Memory Leak Detection” analysis in PAL for more information on how to generically analyze process memory leaks in BizTalk.

High process memory throttling can occur if the batch to be published has steep memory requirements, or too many threads are processing messages. If the system appears to be over-throttling, consider increasing the value associated with the process memory usage threshold for the host and verify that the host instance does not generate an "out of memory" error. If an "out of memory" error is raised by increasing the process memory usage threshold, then consider reducing the values for the internal message queue size and In-process messages per CPU thresholds. This strategy is particularly relevant in large message processing scenarios. In addition, this value should be set to a low value for scenarios having large memory requirement per message. Setting a low value will kick in throttling early on and prevent a memory explosion within the process.

If your BizTalk server regularly runs out of virtual memory, then consider BizTalk Server 64-bit. Each process on 64-bit servers can address up to 4 TB of virtual memory versus 2 GB in 32-bit. In general, 64-bit BizTalk and 64-bit SQL Server are highly recommended. See the “BizTalk Server 64-bit Support” reference for more information.

References

BizTalk High System Memory Analysis

The BizTalk Physical Memory usage throttling threshold setting is the percentage of memory consumption compared to the total amount of available physical memory if a value from 1 through 100 is entered. This setting can also be the total amount of available physical memory in megabytes if a value greater than 100 is entered. Enter a value of 0 to disable throttling based on physical memory usage. The default value is 0.
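The three ways of interpreting this setting (0, 1 through 100, and greater than 100) can be sketched as a small helper. The function and parameter names are hypothetical, and the sketch assumes the percentage is applied to the total physical memory of the server; it only illustrates the rules stated above and does not reproduce BizTalk Server's own calculation.

```python
from typing import Optional


def physical_memory_threshold_mb(setting: int,
                                 total_physical_mb: float) -> Optional[float]:
    """Translate the throttling setting into a threshold in megabytes.

    Returns None when throttling on physical memory is disabled.
    """
    if setting < 0:
        raise ValueError("setting must not be negative")
    if setting == 0:
        # 0 disables throttling based on physical memory usage.
        return None
    if setting <= 100:
        # 1 through 100 is treated as a percentage of physical memory.
        return total_physical_mb * setting / 100.0
    # Values greater than 100 are treated as an amount in megabytes.
    return float(setting)


# Hypothetical server with 8192 MB of physical memory.
for value in (0, 50, 2048):
    print(value, physical_memory_threshold_mb(value, 8192.0))
```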

This analysis checks for a value of 1 in the High System Memory counter. Since this measures total system memory, a throttling condition may be triggered if non-BizTalk Server processes are consuming an extensive amount of system memory. Therefore if this occurs, the best approach is to identify which processes are consuming the most physical memory and/or add additional physical memory to the server. Also, consider reducing load by reducing the default size of the EPM thread pool, and/or the size of adapter batches. For more information, see the Memory Leak Detection analysis in PAL.

References

BizTalk High Thread Count Analysis

“Threads Per CPU” is the total number of threads in the host process including threads used by adapters. If this threshold is exceeded, BizTalk Server will try to reduce the size of the EPM thread pool and message agent thread pool. Thread based throttling should be enabled in scenarios where high load can lead to the creation of a large number of threads. This parameter affects both inbound and outbound throttling. Thread based throttling is disabled by default.

Note

The user-specified value is used as a guideline, and the host may dynamically self-tune this threshold value based on the memory usage patterns and thread requirements of the process.

This analysis checks for a value of 1 in the High Thread Count counter. Consider adjusting the different thread pool sizes to ensure that the system does not create a large number of threads. This analysis can be correlated with Context Switches per Second analysis to determine whether the operating system is saturated with too many threads, but in most cases high thread counts cause more contention on the backend database than on the BizTalk server. For more information about modifying the thread pool sizes see How to Modify the Default Host Throttling Settings in references.

References

Step Description
1 Adapter receives the message and submits it to the engine. Work done in the adapter before the message is given to the engine is not captured in these performance counters.
2 Engine receives the message from the adapter, executes the receive pipeline and map, evaluates subscriptions, and persists the message in the database.
3 Orchestration or Solicit-Response port runs and generates a response message.
4 Response message is dequeued in the messaging engine, which executes the send pipeline and map.
5 Messaging engine gives the response message to the adapter.
6 Adapter informs the engine that the message is done.
I
RR RR RR
O O O
OA OA

I = Inbound Latency

RR = Request Response Latency

O = Outbound Latency

OA = Outbound Adapter Latency

Assuming a low latency environment, this analysis checks whether the document spent more than 5 seconds in the inbound adapter. This may indicate a processing delay in the transport of messages through inbound adapters in this host instance. If multiple inbound adapters exist in this host instance, then consider separating them into their own hosts to determine which inbound adapter has high latency.

References

BizTalk Message Delivery Throttling State Analysis

The BizTalk message delivery throttling state is one of the primary indicators of throttling. It is a flag indicating whether the system is throttling message delivery (affecting XLANG message processing and outbound transports). The throttling condition is indicated by the numeric value of the counter. Here is a list of the values and their respective meanings:

Throttling condition Description
0 Not throttling
1 Throttling due to imbalanced message delivery rate (input rate exceeds output rate)
3 Throttling due to high in-process message count
4 Throttling due to process memory pressure
5 Throttling due to system memory pressure
9 Throttling due to high thread count
10 Throttling due to user override on delivery

This analysis checks for each of these values and has a specific alert for each of them.
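When post-processing a counter log, the values in the table above can be decoded with a small lookup. The sketch below is illustrative only: the class, member, and function names are assumptions, while the numeric values and their meanings come from the table.

```python
from enum import IntEnum


class DeliveryThrottlingState(IntEnum):
    """Message delivery throttling states, as listed in the table above."""
    NOT_THROTTLING = 0
    RATE_IMBALANCE = 1
    HIGH_IN_PROCESS_MESSAGE_COUNT = 3
    PROCESS_MEMORY_PRESSURE = 4
    SYSTEM_MEMORY_PRESSURE = 5
    HIGH_THREAD_COUNT = 9
    USER_OVERRIDE = 10


def describe_delivery_state(value: float) -> str:
    """Return a readable label for a sampled counter value."""
    try:
        state = DeliveryThrottlingState(int(value))
    except ValueError:
        # Values not listed in the table are reported as unknown.
        return f"Unknown state {value}"
    return state.name.replace("_", " ").capitalize()


for sample in (0, 4, 2):
    print(sample, describe_delivery_state(sample))
```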

References

BizTalk MessageBox Database Connection Failures Analysis

This performance counter is the number of attempted database connections that failed since the host instance started. If the SQL Server service hosting the BizTalk databases becomes unavailable for any reason, the database cluster transfers resources from the active computer to the passive computer. During this failover process, the BizTalk Server service instances experience database connection failures and automatically restart to reconnect to the databases. The functioning database computer (previously the passive computer) begins processing the database connections after assuming the resources during failover.

DBNetLib (Database Network Library) errors occur when the BizTalk Server runtime is unable to communicate with either the MessageBox or management databases. When this occurs, the BizTalk Server runtime instance that catches the exception shuts down and then cycles every minute to check to see whether the database is available. See the references section for more information on this topic.

When a client initiates a TCP/IP socket connection to a server, the client typically connects to a specific port on the server and requests that the server respond to the client over an ephemeral, or short lived, TCP or UDP port. On Windows Server 2003 and Windows XP, the default range of ephemeral ports used by client applications is from 1025 through 5000. Under certain conditions it is possible that the available ports in the default range will be exhausted. See the references section for more information on this topic.

This analysis checks for any occurrence of database connection failures. Database connection failures are critical because BizTalk cannot function without the database. If the cause of the database connection failure is unknown, then consider the references listed below and/or contact Microsoft Support to determine the nature of the connectivity failure.
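Because the counter is cumulative since the host instance started, it can help to separate failures that happened during the logged interval from failures that were already present at the first sample. The sketch below is one way to do that under stated assumptions: the function name is hypothetical, and a drop in the counter value is assumed to mean that the host instance restarted and the count began again from zero.

```python
from typing import Sequence


def failures_during_log(samples: Sequence[float]) -> int:
    """Count connection failures that occurred between the first and
    last sample of a cumulative failure counter."""
    if not samples:
        return 0
    total = 0
    previous = samples[0]
    for current in samples[1:]:
        if current >= previous:
            total += int(current - previous)
        else:
            # Assumption: a lower value means the host instance restarted,
            # so the current value counts failures since that restart.
            total += int(current)
        previous = current
    return total


# Hypothetical samples from one host instance.
log = [0, 0, 2, 2, 5]
print(failures_during_log(log))
# Any nonzero sample means at least one failure since the host started.
print(any(value > 0 for value in log))
```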

References

BizTalk Messaging Publishing Throttling State Analysis

The BizTalk message publishing throttling state is one of the primary indicators of throttling. It is a flag indicating whether the system is throttling message publishing (affecting XLANG message processing and inbound transports). The throttling condition is indicated by the numeric value of the counter. Here is a list of the values and their respective meanings:

Throttling condition Description
0 Not throttling
2 Throttling due to imbalanced message publishing rate (input rate exceeds output rate)
4 Throttling due to process memory pressure
5 Throttling due to system memory pressure
6 Throttling due to database growth
8 Throttling due to high session count
9 Throttling due to high thread count
11 Throttling due to user override on publishing

This analysis checks for each of these values and has a specific alert for each of them.
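To see which publishing throttling conditions dominate a log, the sampled counter values can be tallied by state. This is a hedged sketch: the dictionary and function names are assumptions, the labels are shortened from the table above, and the sample values are made up.

```python
from collections import Counter
from typing import Dict, Iterable

# Numeric values and meanings taken from the table above (labels shortened).
PUBLISHING_STATES = {
    0: "Not throttling",
    2: "Imbalanced publishing rate",
    4: "Process memory pressure",
    5: "System memory pressure",
    6: "Database growth",
    8: "High session count",
    9: "High thread count",
    11: "User override on publishing",
}


def summarize_publishing_states(samples: Iterable[float]) -> Dict[str, int]:
    """Count how many samples were observed in each throttling state."""
    counts = Counter(int(sample) for sample in samples)
    return {
        PUBLISHING_STATES.get(code, f"Unknown ({code})"): count
        for code, count in sorted(counts.items())
    }


# Hypothetical counter samples from a log.
print(summarize_publishing_states([0, 0, 6, 6, 6, 4]))
```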

References

BizTalk Orchestrations Resident in Memory

Number of orchestration instances currently hosted by the host instance. While spikes or bursts of orchestrations resident in memory may be considered normal, an increasing trend could indicate a “pile up” of orchestrations in memory. An increasing trend over time may occur when BizTalk is unable to dehydrate messages/orchestration instances. Try to correlate this counter with “XLANG/s Orchestrations(?)\Dehydratable orchestrations” where the question mark (?) is the same counter instance as this counter.

If a high number of orchestrations are resident in memory and if a low number of orchestrations are dehydratable, then your orchestrations are likely idle in memory and may cause a memory leak condition. Use this analysis in correlation with “\XLANG/s Orchestrations(*)\Idle orchestrations” if present. An increasing trend in BizTalk Idle Orchestrations is a better indicator of memory leaks due to the inability to dehydrate orchestration instances.

This analysis checks for an increasing trend in orchestrations resident in memory and whether more than 50% of the orchestrations resident in memory are not dehydratable.
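The two checks can be approximated as shown below. This is a sketch under assumptions, not PAL's actual logic: the function name is hypothetical, the trend test simply compares the average of the second half of the samples with the average of the first half, and the 50% test uses the most recent sample of each counter.

```python
from statistics import mean
from typing import Sequence, Tuple


def orchestration_checks(resident: Sequence[float],
                         dehydratable: Sequence[float]) -> Tuple[bool, bool]:
    """Return (increasing_trend, mostly_not_dehydratable)."""
    if len(resident) < 2 or len(resident) != len(dehydratable):
        raise ValueError("need two or more samples of equal length")
    half = len(resident) // 2
    # Assumption: a higher average in the second half counts as a trend.
    increasing = mean(resident[half:]) > mean(resident[:half])
    latest_resident = resident[-1]
    if latest_resident <= 0:
        return increasing, False
    not_dehydratable = latest_resident - dehydratable[-1]
    mostly_not_dehydratable = not_dehydratable / latest_resident > 0.5
    return increasing, mostly_not_dehydratable


# Hypothetical samples: resident count grows while few can dehydrate.
print(orchestration_checks([10, 20, 30, 40], [8, 10, 12, 10]))
```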

References

BizTalk Private Bytes Analysis

This is the allocated private memory, in megabytes, for the host instance and is comparable to the “\Process(*)\Private Bytes” performance counter. Private Bytes is the current size, in bytes, of memory that a process has allocated that cannot be shared with other processes. This analysis determines whether any of the host instances are consuming a large amount of the system's memory and whether the host instance is increasing in memory consumption over time. A host instance consuming large portions of memory is fine as long as it returns the memory to the system. Look for increasing trends in the chart. An increasing trend over a long period of time could indicate a memory leak.

This analysis checks for a 10 MB-per-hour increasing trend. Use this analysis in correlation with the Available Memory analysis and the Memory Leak analysis. Also, keep in mind that a newly started host instance may initially appear to leak memory when this is simply normal startup behavior. A memory leak occurs when a process continues to consume memory without releasing it over a long period of time. If you suspect a memory leak condition, then read the “Memory Growth in BizTalk Messaging” article referenced below. Otherwise, install and use the Debug Diag tool. For more information on the Debug Diag Tool, see the references section.
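A growth rate in MB per hour can be estimated from the logged samples with an ordinary least-squares slope. The sketch below is an assumption about how such a trend could be computed, not a description of PAL's own method: the function names are hypothetical, and the comparison treats a slope of exactly 10 MB per hour as meeting the threshold.

```python
from typing import Sequence


def trend_mb_per_hour(hours: Sequence[float],
                      private_mb: Sequence[float]) -> float:
    """Least-squares slope of memory (MB) against elapsed time (hours)."""
    n = len(hours)
    if n < 2 or n != len(private_mb):
        raise ValueError("need two or more samples of equal length")
    mean_x = sum(hours) / n
    mean_y = sum(private_mb) / n
    sxx = sum((x - mean_x) ** 2 for x in hours)
    if sxx == 0:
        raise ValueError("timestamps must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, private_mb))
    return sxy / sxx


def exceeds_growth_threshold(hours: Sequence[float],
                             private_mb: Sequence[float],
                             limit_mb_per_hour: float = 10.0) -> bool:
    """Return True when the estimated growth meets or exceeds the limit."""
    return trend_mb_per_hour(hours, private_mb) >= limit_mb_per_hour


# Hypothetical hourly samples of a host instance's private bytes in MB.
print(trend_mb_per_hour([0, 1, 2, 3], [500, 515, 530, 545]))
print(exceeds_growth_threshold([0, 1, 2, 3], [500, 515, 530, 545]))
```

A slope taken over a short log can be dominated by startup behavior, so a longer capture gives a more meaningful estimate.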

References

BizTalk Virtual Bytes Analysis

This is the virtual memory, in megabytes, reserved for the host instance. This analysis determines whether any of the host instances are consuming a large amount of the system's memory and whether the host instance is increasing in memory consumption over time. A host instance consuming large portions of memory is fine as long as it returns the memory to the system. Look for increasing trends in the chart. An increasing trend over a long period of time could indicate a memory leak.

This analysis checks for a 10 MB-per-hour increasing trend in virtual bytes. Use this analysis in correlation with the Available Memory analysis and the Memory Leak analysis. Also, keep in mind that a newly started host instance may initially appear to leak memory when this is simply normal startup behavior. A memory leak occurs when a process continues to consume memory without releasing it over a long period of time. If you suspect a memory leak condition, then read the “Memory Growth in BizTalk Messaging” article below. Otherwise, install and use the Debug Diag tool. For more information on the Debug Diag Tool, see the references section.

References

BizTalk Message Agent Database Session Throttling Analysis

This is the number of open database connections to the MessageBox compared to its respective BizTalk throttling setting. “Database connection per CPU” is the maximum number of concurrent database sessions (per CPU) allowed before throttling begins. The idle database sessions in the common per-host session pool do not add to this count, and this check is made strictly on the number of sessions actually being used by the host instance. This option is disabled by default; typically this setting should only be enabled if the database server is a bottleneck in the BizTalk Server system. You can monitor the number of active database connections by using the database session performance counter under the BizTalk:Message Agent performance object category. This parameter only affects outbound message throttling. Enter a value of 0 to disable throttling that is based on the number of database sessions. The default value is 0.

The MaxWorkerThreads registry key influences the number of threads available to BizTalk and may help in the case where most of BizTalk’s threads are busy with database connections. This analysis checks whether the number of open database connections to the MessageBox is greater than 80 percent of the Database Session Throttling setting, which indicates that a throttling condition is likely.
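The 80 percent comparison can be expressed as a short check. The names below are hypothetical, and the sketch assumes the threshold value passed in is the effective session limit for the host instance, with 0 meaning that session-based throttling is disabled.

```python
def near_session_throttling(open_sessions: float,
                            session_threshold: float,
                            warn_fraction: float = 0.8) -> bool:
    """Return True when open database sessions exceed the warning
    fraction (80 percent by default) of the throttling threshold."""
    if session_threshold <= 0:
        # A threshold of 0 means session-based throttling is disabled.
        return False
    return open_sessions > warn_fraction * session_threshold


# Hypothetical values: 45 open sessions against a threshold of 50.
print(near_session_throttling(45, 50))
```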

References

BizTalk Message Agent Database Session Throttling Threshold Analysis

This is the current threshold for the number of open database connections to the MessageBox. “Database connection per CPU” is the maximum number of concurrent database sessions (per CPU) allowed before throttling begins. The idle database sessions in the common per-host session pool do not add to this count, and this check is made strictly on the number of sessions actually being used by the host instance. This option is disabled by default; typically this setting should only be enabled if the database server is a bottleneck in the BizTalk Server system. You can monitor the number of active database connections by using the database session performance counter under the BizTalk:Message Agent performance object category. This parameter only affects outbound message throttling. Enter a value of 0 to disable throttling that is based on the number of database sessions. The default value is 0.

The MaxWorkerThreads registry key influences the number of threads available to BizTalk and may help in the case where most of BizTalk’s threads are busy with database connections. This analysis checks this value to see whether it has been modified from its default setting. By default, this setting is 0, which means throttling on database sessions is disabled.

References

BizTalk Message Agent In-process Message Count Throttling Analysis

This is the number of concurrent messages that the service class is processing. The “In-process messages per CPU” setting in the Host Throttling Settings is the maximum number of messages delivered to the End Point Manager (EPM) or XLANG that have not been processed. This does not include the messages retrieved from the database but still waiting for delivery in the in-memory queue. You can monitor the number of in-process messages by using the In-process message count performance counter under the BizTalk:Message Agent performance object category. This parameter provides a hint to the throttling mechanism for consideration of throttling conditions. The actual threshold is subject to self-tuning. You can verify the actual threshold by monitoring the In-process message count performance counter.

For large message scenarios (where either the average message size is high, or the processing of messages may require a large amount of memory), this parameter can be set to a smaller value. A large message scenario is indicated if memory-based throttling occurs too often and if the memory threshold gets auto-adjusted to a substantially low value. Such behavior would indicate that the outbound transport should process fewer messages concurrently to avoid excessive memory usage. Also, for scenarios where the adapter is more efficient when processing a few messages at a time (for example, when sending to a server that limits concurrent connections), this parameter may be tuned to a lower value than the default. This analysis checks the High In-Process Message Count counter to determine whether it is greater than 80 percent of its throttling setting under the same name, which indicates a throttling condition is likely.

References

BizTalk:Message Agent In-process Message Count Throttling Threshold Analysis

This is the current threshold for the number of concurrent messages that the service class is processing. The “In-process messages per CPU” setting in the Host Throttling Settings is the maximum number of messages delivered to the End Point Manager (EPM) or XLANG that have not been processed. This does not include the messages retrieved from the database but still waiting for delivery in the in-memory queue. You can monitor the number of in-process messages by using the In-process message count performance counter under the BizTalk:Message Agent performance object category. This parameter provides a hint to the throttling mechanism for consideration of throttling conditions. The actual threshold is subject to self-tuning. You can verify the actual threshold by monitoring the In-process message count performance counter.

For large message scenarios (where either the average message size is high, or the processing of messages may require a large amount of memory), this parameter can be set to a smaller value. A large message scenario is indicated if memory-based throttling occurs too often and if the memory threshold gets auto-adjusted to a substantially low value. Such behavior would indicate that the outbound transport should process fewer messages concurrently to avoid excessive memory usage. Also, for scenarios where the adapter is more efficient when processing a few messages at a time (for example, when sending to a server that limits concurrent connections), this parameter may be tuned to a lower value than the default. This analysis checks the High In-Process Message Count throttling threshold for a non-default value.

References

BizTalk Message Agent Process Memory Usage (MB) Throttling Analysis

This is the memory usage of the current process, in megabytes. BizTalk process memory throttling can occur if the batch to be published has steep memory requirements, or if too many threads are processing messages. If the system appears to be over-throttling, consider increasing the value associated with the process memory usage threshold for the host and verify that the host instance does not generate an "out of memory" error. If an "out of memory" error is raised by increasing the process memory usage threshold, then consider reducing the values for the internal message queue size and in-process messages per CPU thresholds. This strategy is particularly relevant in large message processing scenarios.

If your BizTalk server regularly runs out of virtual memory, then consider BizTalk Server 64-bit. Each process on 64-bit servers can address up to 4 TB of virtual memory versus 2 GB in 32-bit. In general, 64-bit BizTalk and 64-bit SQL Server are highly recommended. See the “BizTalk Server 64-bit Support” reference for more information. This analysis checks whether the process memory usage is greater than 80 percent of its respective throttling threshold of the same name. By default, the BizTalk Process Memory Usage throttling setting is 25 percent of the virtual memory available to the process. The /3GB switch has no effect on this setting.

References

BizTalk:Message Agent Process Memory Usage (MB) Throttling Threshold Analysis

This is the current threshold for the memory usage of the current process, in megabytes. The threshold may be dynamically adjusted depending on the actual amount of memory available to this process and its memory consumption pattern. BizTalk process memory throttling can occur if the batch to be published has steep memory requirements, or if too many threads are processing messages. If the system appears to be over-throttling, consider increasing the value associated with the process memory usage threshold for the host and verify that the host instance does not generate an "out of memory" error. If an "out of memory" error is raised by increasing the process memory usage threshold, then consider reducing the values for the internal message queue size and in-process messages per CPU thresholds. This strategy is particularly relevant in large message processing scenarios.

If your BizTalk server regularly runs out of virtual memory, then consider BizTalk Server 64-bit. Each process on 64-bit servers can address up to 4 TB of virtual memory versus 2 GB in 32-bit. In general, 64-bit BizTalk and 64-bit SQL Server are highly recommended. See the “BizTalk Server 64-bit Support” reference for more information. This analysis checks whether the process memory throttling setting is set to a non-default value.

References