How HPC Services for Excel Work
Applies To: Windows HPC Server 2008 R2
This topic explains how HPC Services for Excel enables calculation offloading to a Windows® HPC Server 2008 R2 cluster. HPC Services for Excel uses a service-oriented architecture (SOA) infrastructure to run Excel jobs on a cluster. HPC Services for Excel includes ready-made SOA clients and services that enable developers to quickly convert workbooks to run on a cluster.
Service-oriented architecture (SOA) is an approach to building distributed, loosely coupled systems. In a SOA system, distinct computational functions are packaged as software modules called services. Services can be distributed across a network and accessed by other applications. For example, if applications perform repeated parallel calculations, the core calculations can be packaged as services and deployed to a cluster. This allows developers to solve embarrassingly parallel problems without rewriting the low-level code and to rapidly scale out applications. Applications can run faster by distributing core calculations across multiple service hosts (compute nodes). End users run the application on their computers, and cluster nodes perform the calculations.
A client application provides an interface for the end-user to access the functionality of one or more services. Developers can create cluster-SOA client applications to provide access to services that are deployed to a Windows HPC cluster. On the back end, the client application submits a job that contains a Service task to the cluster, initiates a session with the broker node, and sends service requests and receives responses (calculation results). The job scheduler on the head node allocates resources to the service job according to the job scheduling policies. An instance of the Service task runs on each allocated resource and loads the SOA service. The job scheduler tries to adjust resource allocation based on the number of service requests.
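The request and response flow described above can be sketched in a few lines. This is a conceptual illustration only, assuming a local thread pool in place of compute nodes; `Broker`, `sum_of_squares_service`, and `run_client` are hypothetical names and are not part of the Windows HPC SOA API.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares_service(n):
    """Stand-in for a SOA service: one independent calculation per request."""
    return sum(i * i for i in range(n))


class Broker:
    """Toy broker (hypothetical): forwards requests to service instances.

    The thread pool stands in for the resources that the job scheduler
    allocates to the service job.
    """

    def __init__(self, service, allocated_cores):
        self._service = service
        self._pool = ThreadPoolExecutor(max_workers=allocated_cores)

    def send_requests(self, requests):
        # map() returns the responses in the same order as the requests.
        return list(self._pool.map(self._service, requests))

    def close(self):
        self._pool.shutdown()


def run_client(requests, allocated_cores=4):
    """Client side: open a session, send requests, collect responses."""
    broker = Broker(sum_of_squares_service, allocated_cores)
    try:
        return broker.send_requests(requests)
    finally:
        broker.close()


responses = run_client([1, 2, 3, 4])
print(responses)
```

Because each request is independent, adding more workers changes only how quickly the responses arrive, not their values.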
The following diagram illustrates how a SOA job runs on the cluster:
Microsoft Excel 2010 extends the user-defined function (UDF) model to the cluster by enabling UDFs to run in a Windows HPC cluster. In the cluster, the UDF calculation is performed by one or more cluster nodes. If a workbook contains calls to long-running UDFs, multiple servers can be used to evaluate functions simultaneously. To run on the cluster, the UDFs must be registered as cluster-safe, and they must be contained in an XLL file. The cluster administrator then deploys the XLL and its dependencies to the cluster.
To support UDF offloading, HPC Services for Excel includes a built-in client (Excel Cluster Connector) and two built-in XLL container services (for 32-bit and 64-bit XLLs).
The Excel Cluster Connector is an add-in to Excel 2010 that acts as a proxy between Excel and the cluster. The Excel user can specify the cluster name and other optional job submission parameters. When the user submits the first calculation request from Excel, the Excel Cluster Connector creates a job that includes a Service task. The connector submits the job to the cluster with the Excel user’s credentials and starts a session with a broker node. The connector also specifies which XLL files to load. When the job starts running, the Excel client begins sending requests and receiving calculation results through the broker node. The session remains open until the user closes Excel, or until the session times out. For information about configuring the connector, see Using the Excel Cluster Connector to Offload UDFs.
On the cluster, the Service task runs on each allocated resource and invokes the XLL container service. The XLL container service loads the XLLs, invokes the UDFs, and returns the calculation results. If the calculation request originates from a 32-bit Excel client, the connector creates a task that uses the 32-bit container service. If it originates from a 64-bit client, the connector creates a task that uses the 64-bit container service.
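The container selection and UDF invocation described above can be pictured roughly as follows. This is a minimal sketch under stated assumptions: the container names, the dictionary-based task description, and the `make_container` helper are all hypothetical and do not reflect the actual service names or interfaces in HPC Services for Excel.

```python
# Hypothetical container service names, keyed by the bitness of the Excel client.
XLL_CONTAINERS = {32: "xll-container-32bit", 64: "xll-container-64bit"}


def select_container(client_bits):
    """Pick the container service that matches the Excel client's bitness."""
    if client_bits not in XLL_CONTAINERS:
        raise ValueError("unsupported client bitness: %r" % (client_bits,))
    return XLL_CONTAINERS[client_bits]


def build_service_task(client_bits, xll_files):
    """Describe the Service task the connector would create (illustrative only)."""
    return {
        "container": select_container(client_bits),
        "xll_files": list(xll_files),
    }


def make_container(udfs):
    """Return a request handler that looks up a UDF by name and calls it.

    `udfs` maps UDF names to Python callables and stands in for the
    functions exported by the loaded XLL files.
    """
    def handle(request):
        name, args = request
        return udfs[name](*args)
    return handle


task = build_service_task(64, ["pricing.xll"])  # "pricing.xll" is a made-up file name
handle = make_container({"ADD": lambda a, b: a + b})
print(task["container"], handle(("ADD", (2, 3))))
```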
The following diagram illustrates how UDF offloading works on a cluster:
Many long-running workbooks perform calculations iteratively (that is, the same calculation runs many times over different sets of input data). To enable a workbook to run in parallel on a cluster, an Excel developer can use the HPC macro framework to define how to partition independent calculations in a workbook and then merge the results.
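The partition, calculate, and merge pattern can be sketched in a language-neutral way. The example below is an illustration of the pattern only, assuming each row of input is an independent calculation; the function names `partition`, `execute`, and `merge` are illustrative and are not the macro names or signatures of the HPC macro framework, and a local thread pool stands in for the cluster.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows):
    """Split the workbook's input into independent work items."""
    for index, row in enumerate(rows):
        yield index, row


def execute(work_item):
    """Run one independent calculation (here, the mean of a row)."""
    index, row = work_item
    return index, sum(row) / len(row)


def merge(results, index, value):
    """Write one result back to its position in the output."""
    results[index] = value


def run_workbook(rows, workers=4):
    results = [None] * len(rows)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for index, value in pool.map(execute, partition(rows)):
            merge(results, index, value)
    return results


print(run_workbook([[1, 2, 3], [10, 20], [4]]))
```

Carrying the index through each work item lets the merge step place results correctly even when calculations finish in a different order than they were issued.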
To support workbook offloading, HPC Services for Excel includes a built-in SOA client (Excel Client) and a built-in service (Excel Service). The HPC macro framework integrates with Excel Client.
Developers provide an interface that allows the Excel user to specify the cluster name, the workbook location, and any additional job submission specifications. Developers include code in the Excel workbook to start a session with the broker node and create an instance of Excel Client. The cluster administrator then deploys the workbook and any of its dependencies to the cluster.
When the user submits the first calculation request from Excel, Excel Client submits a job that includes a Service task. When the job starts running, Excel Client begins sending requests and receiving calculation results through the broker node. The session remains open until the user closes Excel or until the session times out.
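The session lifetime described above (a job is submitted on the first request, and the session stays open until it is closed or times out) can be modeled with a small sketch. Several things here are assumptions made for illustration: the timeout is treated as an idle timeout, a new job is submitted if a request arrives after the session has expired, and the `ClientSession` class and its placeholder calculation are hypothetical rather than the Excel Client interface.

```python
class ClientSession:
    """Toy model of a client session with lazy job submission (hypothetical)."""

    def __init__(self, idle_timeout_s, clock):
        self._idle_timeout_s = idle_timeout_s
        self._clock = clock          # callable that returns the current time in seconds
        self._last_used = None       # None means no session is open
        self.jobs_submitted = 0

    def _is_open(self):
        return (self._last_used is not None
                and self._clock() - self._last_used < self._idle_timeout_s)

    def send(self, request):
        if not self._is_open():
            # Stands in for submitting a job that contains a Service task.
            self.jobs_submitted += 1
        self._last_used = self._clock()
        return request * 2           # placeholder for a calculation result

    def close(self):
        """Stands in for the user closing Excel."""
        self._last_used = None


now = [0]                            # fake clock so the example is deterministic
session = ClientSession(idle_timeout_s=60, clock=lambda: now[0])
session.send(1)                      # first request submits the job
now[0] = 30
session.send(2)                      # session still open, no new job
print(session.jobs_submitted)
```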
On the cluster, the Service task runs on each allocated resource and invokes Excel Service. Excel Service starts Excel 2010 on the allocated nodes, loads the specified workbook, runs macro calculations (from the HPC macro framework), and returns the calculation results.
The following diagram illustrates how workbook offloading works on a cluster:
Accelerating Microsoft Excel 2010 with Windows HPC Server 2008 R2: Technical Overview (http://go.microsoft.com/fwlink/?LinkID=198461)
Windows HPC and Microsoft Excel Survival Guide on TechNet Wiki (http://go.microsoft.com/fwlink/?LinkID=198462)