SOA Service Configuration Files in Windows HPC Server 2008 R2

Updated: February 2011

Applies To: Windows HPC Server 2008 R2

This topic describes common settings that you can configure in the service configuration file for SOA services that you install on a Windows® HPC Server 2008 R2 cluster. The service configuration file is an XML file that registers a service on the cluster and configures service behavior. For example, broker behavior such as monitoring, message throttling, and load balancing can be defined at the individual service level.

At a minimum, the file must specify the registration information for the service, such as the path to the DLL for the service. The service configuration file must be named servicename.config, where servicename is the same service name that the client passes to the SessionStartInfo constructor.

Microsoft® HPC Pack 2008 R2 includes a sample service that is named CcpEchoSvc. CcpEchoSvc is a simple, built-in service that can be used with the diagnostics tests to verify SOA functionality on the cluster. You can also use the service configuration file for CcpEchoSvc as an example when you create or modify a configuration file.

Note
Cluster administrators must have write permissions to the service configuration files to make changes to the service registration and configuration settings. For information about deploying the configuration file to the cluster, see Deploy the SOA Service DLLs to a Windows HPC 2008 R2 Cluster.

This topic includes the following sections:

  • Service registration settings

  • Broker settings

  • SOA service versioning

Service registration settings

You provide the registration information for a service inside the <microsoft.Hpc.Session.ServiceRegistration> section, which contains the <service> element.

For example, the following XML code shows how the service registration settings are defined in the service configuration file for CcpEchoSvc (the built-in sample service in Windows HPC Server 2008 R2).

  <microsoft.Hpc.Session.ServiceRegistration>
    <service assembly="%CCP_HOME%bin\EchoSvcLib.dll"
             contract="EchoSvcLib.IEchoSvc"
             type="EchoSvcLib.EchoSvc"
             includeExceptionDetailInFaults="true"
             maxConcurrentCalls="0"
             maxMessageSize="65536"
             serviceInitializationTimeout="60000" >
      <!--Below is a sample for adding environment variables to the service-->
      <environmentVariables>
        <add name="myname1" value="myvalue1"/>
        <add name="myname2" value="myvalue2"/>
      </environmentVariables>
    </service>
  </microsoft.Hpc.Session.ServiceRegistration>
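
As a rough illustration of how the registration settings are laid out, the following Python sketch reads the attributes and environment variables from a registration fragment. It is not part of HPC Pack: the function name read_registration is hypothetical, and the fragment is a copy of the CcpEchoSvc example with the section closed so that it parses as a standalone document.

```python
import xml.etree.ElementTree as ET

# Copy of the CcpEchoSvc registration fragment, closed so that it is
# well-formed on its own.
REGISTRATION_XML = r"""
<microsoft.Hpc.Session.ServiceRegistration>
  <service assembly="%CCP_HOME%bin\EchoSvcLib.dll"
           contract="EchoSvcLib.IEchoSvc"
           type="EchoSvcLib.EchoSvc"
           includeExceptionDetailInFaults="true"
           maxConcurrentCalls="0"
           maxMessageSize="65536"
           serviceInitializationTimeout="60000">
    <environmentVariables>
      <add name="myname1" value="myvalue1"/>
      <add name="myname2" value="myvalue2"/>
    </environmentVariables>
  </service>
</microsoft.Hpc.Session.ServiceRegistration>
"""


def read_registration(xml_text):
    """Return (service attributes, environment variables) as two dicts.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(xml_text.strip())
    service = root.find("service")
    if service is None:
        raise ValueError("no <service> element found")
    if "assembly" not in service.attrib:
        # assembly is the only attribute that the topic marks as required.
        raise ValueError("the assembly attribute is required")
    env = {
        add.get("name"): add.get("value")
        for add in service.findall("environmentVariables/add")
    }
    return dict(service.attrib), env


attributes, env_vars = read_registration(REGISTRATION_XML)
print(attributes["assembly"])
print(env_vars)
```

All attribute values come back as strings, so a caller would still need to convert numeric settings such as maxMessageSize before comparing them.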

The following table describes the service registration attributes:

<service> attribute | Description | Default value

assembly

The full path to the service DLL. This can be a path to a shared folder on the cluster, or it can be a local path on each compute node.

For example:

C:\Services\yourServiceName.dll

Required.

None

contract

The interface of the service (WCF contract).

Optional if there is only one interface in the service DLL.

None

type

The class that implements the WCF contract.

Optional if there is only one interface in the service DLL.

None

architecture

The architecture on which your WCF service can run. The possible values are x86 and x64 (any other value is ignored, and the default is used).

Optional.

x64

includeExceptionDetailInFaults

If true, returns managed exception information to the client. This information can be helpful during development to troubleshoot a service. Setting this property to true is not recommended for production environments.

Warning
Returning managed exception information to clients can be a security risk because exception details expose information about the internal service implementation that could be used by unauthorized clients.

For more information, see IncludeExceptionDetailInFaults Property.

false

maxConcurrentCalls

The maximum number of messages that a service host processes concurrently.

A value of 0 indicates that the maximum should be automatically calculated based on the service capacity (number of CPU cores) of each service host.

Optional.

0

maxMessageSize

The maximum size of the message file, in bytes.

For durable sessions, the upper limit for message size is 4 MB because of an MSMQ limitation.

Optional.

65,536 bytes

serviceInitializationTimeout

The maximum length of time that the broker attempts to load the service on a service host.

If an EndpointNotFoundException is returned when connecting to the service host, the broker waits the amount of time specified by endpointNotFoundRetryPeriod (in the broker settings) and then tries again.

If the broker cannot connect successfully to the service, the broker adds the node to the excluded nodes list for the job. When the excluded nodes limit is reached, the session fails.

Optional.

60,000 milliseconds

environmentVariables

Environment variables used by your service.

Optional.

None
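
Two of the rules in the table above lend themselves to a simple pre-deployment check. The following Python sketch is only an illustration under stated assumptions: the helper names are hypothetical, 4 MB is taken to mean 4 × 1,024 × 1,024 bytes, and the architecture comparison is treated as case-insensitive, neither of which this topic specifies.

```python
# Assumption: "4 MB" is interpreted as 4 * 1024 * 1024 bytes.
DURABLE_MAX_MESSAGE_BYTES = 4 * 1024 * 1024


def effective_architecture(value):
    """Return the architecture that applies for a configured value.

    Any value other than x86 or x64 is ignored and the default (x64) is used.
    Case-insensitive matching is an assumption of this sketch.
    """
    if value is not None and value.lower() in ("x86", "x64"):
        return value.lower()
    return "x64"


def check_max_message_size(max_message_size, durable):
    """Return a warning string, or None when the size looks acceptable."""
    if durable and max_message_size > DURABLE_MAX_MESSAGE_BYTES:
        return "maxMessageSize exceeds the 4 MB limit for durable sessions"
    return None


print(effective_architecture("x86"))
print(effective_architecture("ia64"))
print(check_max_message_size(8 * 1024 * 1024, durable=True))
```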

Broker settings

You can customize how the broker interacts with a service by defining the broker monitoring settings in the service configuration file. This allows you to define broker behavior such as monitoring, message throttling, and load balancing at the individual service level.

The <microsoft.Hpc.Broker> section contains the <monitor> element and the <loadBalancing> element. The attributes in these elements determine how the broker handles sessions for the service. The broker uses the default setting for any attribute that is not specified.

A subset of the broker monitoring settings can also be defined by using the SessionStartInfo.BrokerSettingsInfo class when the client application creates a session. Values provided in the SessionStartInfo.BrokerSettingsInfo class override the values that are specified in the service configuration file.
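
The precedence described above (built-in default, then the service configuration file, then the value supplied by the client) can be pictured as a layered merge. The following Python sketch is a simplified model, not the broker's implementation: the function name effective_settings is hypothetical, the default values are the ones listed in this topic, and the sketch does not model which attributes the client is actually allowed to override.

```python
# Defaults taken from the monitoring settings listed in this topic.
MONITOR_DEFAULTS = {
    "allocationAdjustInterval": 30000,
    "clientIdleTimeout": 300000,
    "sessionIdleTimeout": 300000,
}


def effective_settings(defaults, config_file, client_overrides):
    """Merge settings so that later layers win over earlier ones."""
    merged = dict(defaults)          # lowest precedence
    merged.update(config_file)       # service configuration file
    merged.update(client_overrides)  # values supplied when creating the session
    return merged


# Hypothetical values chosen only to show the precedence.
from_config = {"sessionIdleTimeout": 600000, "clientIdleTimeout": 120000}
from_client = {"clientIdleTimeout": 60000}

settings = effective_settings(MONITOR_DEFAULTS, from_config, from_client)
print(settings)
```

In this example, allocationAdjustInterval keeps its default, sessionIdleTimeout comes from the configuration file, and clientIdleTimeout comes from the client.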

Monitoring settings

The following XML code shows how the broker monitoring attributes are defined in the service configuration file for CcpEchoSvc.

    <monitor messageThrottleStartThreshold="4096"
             messageThrottleStopThreshold="3072"
             loadSamplingInterval="1000"
             allocationAdjustInterval="30000"
             clientIdleTimeout="300000"
             sessionIdleTimeout="300000"
             statusUpdateInterval="15000"
             clientBrokerHeartbeatInterval="20000"
             clientBrokerHeartbeatRetryCount="3" />

The following table describes the broker monitoring attributes and their default values:

<monitor> attribute | Description | Default value

messageThrottleStartThreshold

messageThrottleStopThreshold

messageThrottleStartThreshold specifies the number of requests that the broker accepts before the broker stops listening and waits for space to become available in the broker queue (as requests are processed).

messageThrottleStopThreshold specifies the number of queued requests to which the queue must drop before the broker stops throttling. This number is lower than the start threshold because the broker must have adequate space in its queue before it starts listening again.

The amount of memory that the broker queue requires is (messageThrottleStartThreshold) * (sizeOfRequest). If you have throttling limits set too high and your messages are too big, then the broker node can run out of memory. For example, if you set the limit to 400,000, and you expect 64 KB messages, the broker would require 25.6 GB of memory to hold the request queue.

If the throttle settings are too low, nodes can be underutilized due to insufficient message throughput.

Throttle start: 4,096

Throttle stop: 3,072

loadSamplingInterval

The amount of time between load sampling passes. During a load sampling pass, the broker checks the request queue to update performance counters for a session, including the number of waiting requests.

You can increase this number to minimize management overhead. You might want to decrease this number if the service processes very short-running requests.

Unit: milliseconds

1,000 milliseconds

(1 second)

allocationAdjustInterval

The amount of time between allocation adjustment passes. During an adjustment pass, the broker updates job properties and evaluates the Target Resource Count for a session based on the load sampling metrics gathered within the adjust interval. The job scheduler uses the Target Resource Count property to help determine how many service hosts to allocate to the session.

Automatic allocation adjustment can help increase cluster utilization. For example, the job scheduler can shrink an idle session to its minimum resources. However, it can take time for the resources to grow when the session becomes active again.

To disable allocation adjustment, you can set this property to a value of -1.

Unit: milliseconds

30,000 milliseconds

(30 seconds)

clientIdleTimeout

After a client application connects to a session, if the client does not have any activity and there are no pending requests for the client within the specified amount of time, the broker closes the connection to the client, but it does not close the session.

Unit: milliseconds

300,000 milliseconds

(5 minutes)

sessionIdleTimeout

When all the clients for the session are idle (timed out), if no more client applications are connected within this timeout period, the broker closes the session.

You can extend this period if users want to keep a session open and send several batches of requests within the same session. This avoids the start-up time of creating a new session and receiving new resources.

With an HTTP binding, sessionIdleTimeout does not start counting until after clientIdleTimeout triggers.

Unit: milliseconds

300,000 milliseconds

(5 minutes)

statusUpdateInterval

The timer interval for the broker to publish service statistics to the job.

Unit: milliseconds

15,000 milliseconds

(15 seconds)

clientBrokerHeartbeatInterval

clientBrokerHeartbeatRetryCount

These settings apply to clients of type BrokerClient.

The instance of BrokerClient monitors the connection to the HPC Broker Service on the broker node. The broker client sends regular heartbeat probes to the broker service. The clientBrokerHeartbeatInterval specifies the number of milliseconds between probes.

If the broker service does not respond, the client retries up to the number of times specified by the clientBrokerHeartbeatRetryCount. After the retry count is reached, a SessionException is thrown.

In some cases, the client might try to create a new session (up to the job and task retry counts that are set for the HPC Job Scheduler Service).

Interval: 20,000 milliseconds (20 seconds)

Retry count: 3
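
The queue memory estimate given for the throttle thresholds, (messageThrottleStartThreshold) * (sizeOfRequest), is easy to check before you raise the limits. The following Python sketch reproduces that arithmetic. The function names are hypothetical, and the worked example treats 64 KB as 64,000 bytes and 1 GB as 10^9 bytes, which is the interpretation that yields the 25.6 GB figure.

```python
def queue_memory_bytes(start_threshold, request_size_bytes):
    """Estimate broker queue memory as start threshold times request size."""
    return start_threshold * request_size_bytes


def check_thresholds(start_threshold, stop_threshold):
    """Raise ValueError unless the stop threshold is below the start threshold."""
    if stop_threshold >= start_threshold:
        raise ValueError("messageThrottleStopThreshold must be lower than "
                         "messageThrottleStartThreshold")


# Example from the text: a limit of 400,000 requests of 64 KB each.
large = queue_memory_bytes(400_000, 64_000)
print(f"{large / 1e9:.1f} GB")

# Default start threshold with requests at the default maxMessageSize.
default = queue_memory_bytes(4096, 65536)
print(f"{default / (1024 * 1024):.0f} MiB")

check_thresholds(4096, 3072)  # the default pair passes the check
```

The estimate covers only the request payloads in the queue, so the broker node needs additional headroom beyond this number.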

Load balancing settings

The following XML code shows how the broker load balancing attributes are defined in the service configuration file for CcpEchoSvc.

    <loadBalancing messageResendLimit="3"
                   serviceRequestPrefetchCount="1"
                   serviceOperationTimeout="86400000"
                   endpointNotFoundRetryPeriod="300000"/>

The following table describes the broker load balancing attributes and their default values:

<loadBalancing> attribute | Description | Default value

messageResendLimit

The number of times that the broker will resend a message.

Sessions that are running on Azure Worker Nodes are more likely to reach the message resend limit than sessions that are running on on-premises nodes, especially if the HPC Job Scheduler Service is running in Balanced mode. If you see messages fail on your Azure Worker Nodes, increase the message resend limit.

3

serviceRequestPrefetchCount

The number of requests that can be pending for each service instance. For example, if the prefetch count is set to 1, two requests are sent to each service instance. One request starts processing immediately, and the second ‘prefetched’ request starts processing when the first request is finished.

If the count is too low, nodes will be idle as they wait for incoming requests (for example, 0 or a small number for very short running requests). If the count is too high, then some nodes might become idle while there are unprocessed requests waiting on other nodes.

1

serviceOperationTimeout

The length of time that the broker waits for the service to finish processing the message.

86,400,000 milliseconds

(24 hours)

endpointNotFoundRetryPeriod

If an EndpointNotFoundException is returned when connecting to the service host, the broker waits the amount of time specified by endpointNotFoundRetryPeriod and then tries again.

30,000 milliseconds

(30 seconds)
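
To get a feel for serviceRequestPrefetchCount, it can help to count how many requests are dispatched at once. The following Python sketch generalizes the example in the table (a prefetch count of 1 means two requests per service instance) to other counts. That generalization and the function names are assumptions for illustration.

```python
def requests_per_instance(prefetch_count):
    """One request being processed plus the prefetched requests."""
    return 1 + prefetch_count


def outstanding_requests(service_instances, prefetch_count):
    """Requests dispatched across all service instances at one time."""
    return service_instances * requests_per_instance(prefetch_count)


print(requests_per_instance(1))    # 2, matching the example in the table
print(outstanding_requests(8, 1))  # 16 requests across 8 instances
```

A higher count keeps each instance busy between requests, at the cost of more requests waiting on a specific node while other nodes might be idle.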

SOA service versioning

A SOA service can opt into versioning by appending a version number to the service configuration file name. The expected format is: servicename_major.minor.config. The version must include the major and minor portions of the version identifier and no further sub-versions. For example, the configuration file for version 1.0 of the Microsoft.Hpc.Excel.ExcelService service is named Microsoft.Hpc.Excel.ExcelService_1.0.config.
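
The naming convention can be expressed as a small pair of helpers. The following Python sketch builds and parses file names in the servicename_major.minor.config format. The function names and the regular expression are illustrative assumptions, not part of HPC Pack.

```python
import re

# servicename_major.minor.config, with exactly two numeric version parts.
_VERSIONED = re.compile(r"^(?P<name>.+)_(?P<major>\d+)\.(?P<minor>\d+)\.config$")


def config_file_name(service_name, version=None):
    """Build the configuration file name; version is a (major, minor) tuple."""
    if version is None:
        return f"{service_name}.config"
    major, minor = version
    return f"{service_name}_{major}.{minor}.config"


def parse_config_file_name(file_name):
    """Return (service name, version); version is None for unversioned files."""
    match = _VERSIONED.match(file_name)
    if match:
        return match["name"], (int(match["major"]), int(match["minor"]))
    if file_name.endswith(".config"):
        return file_name[: -len(".config")], None
    raise ValueError(f"not a service configuration file name: {file_name}")


print(config_file_name("Microsoft.Hpc.Excel.ExcelService", (1, 0)))
print(parse_config_file_name("Microsoft.Hpc.Excel.ExcelService_1.0.config"))
print(parse_config_file_name("CcpEchoSvc.config"))
```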

The SOA client application can select which version of a service to use by specifying the SessionStartInfo.ServiceVersion property when creating the session. If the client does not specify a version, the latest installed version is used and returned in the SessionBase.ServiceVersion property after the session is created. If the specified version is not found on the cluster, the cluster returns a session exception error message.
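
The selection rule in the preceding paragraph can be modeled in a few lines. The following Python sketch is a simplified model rather than the cluster's implementation: select_version is a hypothetical name, versions are represented as (major, minor) tuples, "latest" is assumed to mean the highest version number, and a LookupError stands in for the session exception.

```python
def select_version(installed, requested=None):
    """Pick the service version for a new session.

    installed: iterable of (major, minor) tuples available on the cluster.
    requested: a (major, minor) tuple, or None to take the latest version.
    """
    installed = list(installed)
    if not installed:
        raise LookupError("no versions of the service are installed")
    if requested is None:
        # Assumption: the latest version is the highest (major, minor) pair.
        return max(installed)
    if requested not in installed:
        raise LookupError(f"version {requested} is not installed")
    return requested


available = [(1, 0), (1, 10), (1, 2)]
print(select_version(available))          # no version requested
print(select_version(available, (1, 2)))  # explicit version
```

Comparing tuples of integers avoids the pitfall of comparing version strings, where "1.10" would sort before "1.2".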

SOA client applications that support multiple versions of a SOA service can use the SessionBase.GetServiceVersions method to check the versions of the service that are available on the HPC cluster before using version-specific features of the service. The client application can programmatically select a version, or present the options to the user.
