Telephony Call Control with Microsoft Speech Server 2004

Article
08/28/2007

Introduction
Speech Server and Telephony Processing
Consultation Calls, Supervised Transfers, and Conferences
CSTA XML Flow for Consultation Calls, Supervised Transfers, and Conferences
Sample Code for Consultation Calls, Supervised Transfers and Conferences
Conclusion
Glossary
Appendix A – SupervisedTransfer Call Control Code

Introduction

Telephony call control is a critical part of developing speech-enabled or traditional IVR applications. Beyond enforcing the business logic and correctly handling speech input and output, applications must interact with the phone system for basic actions such as answering, making, transferring, and disconnecting calls. Some applications also require more complex extended call control actions, such as setting up three-way consultation calls in order to conduct supervised transfers or conference calls.

Microsoft® Speech Server (MSS) 2004 and its development tool set, the Microsoft Speech Application Software Development Kit (SASDK), address both basic and extended call control. The SASDK includes basic call controls as well as extended call control capabilities via the Speech Application Language Tags (SALT) specification's Simple Messaging Extension (smex) element.

In this white paper we first describe the basic call controls available in the Microsoft Speech Application SDK, and then discuss more complex call scenarios to illustrate extended call control. We explain the underlying Computer Supported Telecommunications Applications (CSTA) XML-based communications used to set up consultation calls as well as the supervised transfers, conferences, and reconnections that rely on the consultation call service. Finally, we illustrate sample application code created using the SASDK to invoke these service requests and to handle the responses and events returned by the Telephony Interface Manager (TIM).

In this white paper we assume that the reader is somewhat familiar with Microsoft Speech Server 2004 and the SASDK. A glossary for many of the telephony terms used in this paper is provided at the end.

Speech Server and Telephony Processing

As shown in Figure 1 below, Speech Server comprises two main components: Speech Engine Services (SES) and Telephony Application Services (TAS). The TAS component of MSS interacts with the phone system through a middle layer called the Telephony Interface Manager (TIM).

Figure 1. MSS communicates with a telephone system through the Telephony Interface Manager

The call control interaction between Speech Server-based applications and the TIM takes place through the exchange of XML messages defined by the ECMA standard XML Protocol for Computer Supported Telecommunications Applications (CSTA) Phase III.¹ This standard, often referred to as Standard ECMA-323, specifies XML schema definitions for the CSTA Phase III Services Standard, ECMA-269, a set of services and events defined for computer-to-telephony communication. This computer-to-telephony communication is often referred to as CSTA communication. While speech application developers do not work directly with the Standard ECMA-323 CSTA messages, they need to understand how these underlying messages carry call control requests, responses, and events so they can design and implement controls within the SASDK to generate and handle them.

In this section we introduce the basic communications between Speech Server-based applications and the underlying Telephony Interface Manager (TIM). We discuss the basic speech controls that can be implemented by developers using the SASDK² to incorporate basic and extended call control. We highlight the functionality of the SmexMessage control which can be used to create extended call control functionality. All of these controls, along with their event handlers, generate and process the CSTA messages for basic and extended call control. Finally, we discuss the level of call information and call transfer support found in common signaling protocols such as analog, T1 Channel Associated Signaling (CAS), and Integrated Services Digital Network (ISDN) Primary-Rate Interface (PRI).

Call Control Using CSTA and Smex

The Telephony Application Services (TAS) component of MSS runs speech-enabled Web applications containing both application and call control logic created by developers using the SASDK. While developers work directly with call control objects within the SASDK, under the covers CSTA XML messages serve as the communication link between TAS and the TIM. The SALT interpreter, a TAS component, establishes a communication conduit to the TIM for call control purposes. The SALT smex element is used as the wrapper for call control messages that pass back and forth along this simple communication channel. As mentioned earlier, the CSTA content of the XML request, response, and event messages is defined in Standard ECMA-323. Figure 2 shows how the speech application makes service requests, and how the TIM responds with service request responses and call control events.

Figure 2. Call control communication between the application and the TIM

Basic Call Controls

The ASP.NET Call Management Controls, provided in the SASDK, simplify the authoring of telephony applications and hide the potentially complicated CSTA interaction from the application developer by:

Wrapping specific client-side CSTA service requests
Receiving and processing call control events
Handling CSTA errors

The SASDK Call Management Controls include: AnswerCall, DisconnectCall, MakeCall, TransferCall, SmexMessage.

AnswerCall Control

This control answers incoming calls from a telephony device using the CSTA AnswerCall service, and stops dialogue flow until an incoming call is answered. It also provides access to information such as the calling party’s number and the dialed number. To support applications that require Computer Telephony Integration (CTI), the Established and Delivered events may contain CTI correlator data associated with the call. If your TIM implementation supports it, the TIM places this information in the CorrelatorData property of the CallInfo object³.

DisconnectCall Control

This control disconnects a call using the CSTA ClearConnection service. In general, application authors should use the DisconnectCall control to end the phone call in a speech application. When this control completes its operation successfully, TAS terminates the dialogue and by default closes the application and resets itself without posting back to the server. To force a postback before the interpreter resets, application authors can either set the control's AutoPostback property to True, or call the SpeechCommon.Submit method in its OnClientDisconnected event handler.

The DisconnectCall control allows the application developer to disconnect only the original party on the call. This control cannot be used to drop the consulted call leg in a three-way conference call.

MakeCall Control

Using the CSTA MakeCall service, this control initiates an outbound telephone call to the specific number on the telephony device when activated by RunSpeech⁴. Further speech dialogue on the page is blocked until the call is either connected or fails to connect.

TransferCall Control

This control transfers the current active call using the CSTA SingleStepTransfer service. This provides blind transfer functionality only.

When RunSpeech runs this object, it blocks any further speech dialogue until the transfer either succeeds or fails. When the TransferCall control completes its operation successfully, TAS terminates the dialogue and by default closes the application and resets itself without posting back to the server. To force a postback before the interpreter resets, application authors can either set the control's AutoPostback property to True, or call the form.Submit method in its OnClientTransferred event handler.

The SmexMessage Control

While the basic call controls above provide the core call control functionality needed for telephony-based speech applications, the SmexMessage control is the key to advanced and extended call control capabilities. With it, a developer can create custom call management controls such as supervised transfers, conference calls, and more.

The SmexMessage control handles generic XML messages and events. By default, this control sends and receives XML messages to and from the TIM using a SALT smex element that is automatically generated by the control. The content of these messages must be CSTA XML messages. The SmexMessage control does not parse the content of messages it receives, and cannot determine when a received message indicates an error condition. When this control receives a message indicating an error condition, such as CSTA FailedEvent or CSTAErrorCode messages from the TIM, it raises the OnClientReceive event rather than the OnClientError event. The SmexMessage OnClientError event is triggered only when the client-side smex object raises its onerror event.
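
Because the control does not distinguish error messages from ordinary ones, the receive handler has to inspect each message itself. The following is a minimal sketch of that check, written in Python for illustration rather than in SASDK client-side script. The function names are hypothetical, and the two sample messages are abbreviated stand-ins for real TIM output.

```python
import xml.etree.ElementTree as ET

# Root element names that signal a failure, as named in the paragraph above.
ERROR_ROOTS = {"FailedEvent", "CSTAErrorCode"}

def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree adds to qualified names."""
    return tag.rsplit("}", 1)[-1]

def is_error_message(xml_text: str) -> bool:
    """Return True when the message root is a CSTA failure or error element."""
    return local_name(ET.fromstring(xml_text).tag) in ERROR_ROOTS

# Abbreviated sample messages (assumed shapes, for illustration only).
NS = "https://www.ecma.ch/standards/ecma-323/csta/ed2"
failed = f'<FailedEvent xmlns="{NS}"><cause>busy</cause></FailedEvent>'
held = f'<HeldEvent xmlns="{NS}"><cause>consultation</cause></HeldEvent>'

print(is_error_message(failed), is_error_message(held))
```

Matching on the local element name keeps the check independent of the namespace prefix a particular TIM emits.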

Call Control Processing in Speech Server-based Applications

While it's not necessary for developers to thoroughly understand the behind-the-scenes processing of controls in their speech applications, a high-level understanding can benefit them as they design, develop, and troubleshoot their applications.

At the point specified by the speech application (through the ClientActivationFunction property, or when the internal completion flag is set to false), the dialog manager, RunSpeech, determines which speech call control should be activated. In the example shown in Figure 3 below, upon receiving a new smex event, RunSpeech checks whether the original party has disconnected and, if not, activates the SmexMessage control:

Figure 3. RunSpeech interaction with SmexMessage control

When the SmexMessage control is activated it performs the following steps.

Calls its OnClientBeforeSend property routine, if it is specified
Communicates its request message to the TIM
Waits for events and handles each one as follows:
- If the OnClientReceive routine returns True, the control stops waiting
- If the OnClientReceive routine returns False, the control waits for another event
- If the event is an error or time-out, the control stops waiting
- If the OnClientReceive property is not specified, the control stops waiting
Sets its internal completion flag to True

Control is then returned to RunSpeech to continue application processing.
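
The waiting rules above can be modeled as a small loop. This is a simplified Python sketch, not the SASDK implementation: events are represented as plain strings, `"error"` and `"timeout"` are hypothetical markers for the error and time-out cases, and `on_receive` stands in for the OnClientReceive routine.

```python
from typing import Callable, Iterable, List, Optional

def run_smex_control(events: Iterable[str],
                     on_receive: Optional[Callable[[str], bool]] = None) -> List[str]:
    """Model the wait loop; return the events consumed before the control completes."""
    consumed: List[str] = []
    for event in events:
        consumed.append(event)
        if event in ("error", "timeout"):  # error or time-out: stop waiting
            break
        if on_receive is None:             # no handler specified: stop waiting
            break
        if on_receive(event):              # handler returned True: stop waiting
            break
        # handler returned False: keep waiting for the next event
    return consumed  # at this point the completion flag would be set to True

events = ["HeldEvent", "OriginatedEvent", "EstablishedEvent", "ConnectionClearedEvent"]
print(run_smex_control(events, lambda e: e == "EstablishedEvent"))
print(run_smex_control(events))
```

With the handler shown, the control stops at the Established event. With no handler, it stops after the first event it receives.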

Call Signaling Support

Before discussing extended call control in any detail, it is important to understand a little about the call information and call transfer capabilities available with various call signaling protocols. We’ll first discuss ANI and DNIS, two pieces of call information helpful for many extended call control scenarios. Then we’ll discuss methods of call transfer. Both application developers and administrators need to be aware of a signaling protocol’s capabilities as this could impact the desired functionality for the planned service.

Call Information and Signaling Protocols

When a call arrives at the server hosting the TAS and TIM, the signaling layer may present information about the call to the application. The most important pieces of information are:

Automatic Number Identification (ANI): This typically represents the caller’s number (often termed caller ID). In the CSTA message and in the SASDK’s CallInfo object, the ANI is represented by the CallingDevice property. Knowledge of this information would enable an application to identify the caller and potentially provide tailored or premium services to the caller.
Dialed Number Information Service (DNIS): This typically represents the number that the originating caller dialed. In CSTA and in the SASDK’s CallInfo object, the DNIS is represented by the CalledDevice property. Knowledge of this information would enable MSS to associate different application services to different dialed numbers, even when the calls with different DNISs are routed to the same MSS computer.
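
As an illustration of DNIS-based routing, the sketch below selects an application from the dialed number. It is a Python model only: `CallInfoSketch` is a hypothetical stand-in for the SASDK CallInfo object, and the phone numbers and application names are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class CallInfoSketch:
    """Hypothetical stand-in for the two CallInfo properties discussed above."""
    calling_device: str  # ANI: the caller's number
    called_device: str   # DNIS: the number the caller dialed

# Invented mapping of dialed numbers to application names.
APP_BY_DNIS = {"5550100": "billing", "5550101": "support"}

def select_application(info: CallInfoSketch, default: str = "main_menu") -> str:
    """Pick the application for a call from its DNIS, falling back to a default."""
    return APP_BY_DNIS.get(info.called_device, default)

print(select_application(CallInfoSketch("5559876", "5550101")))
```

The default matters on signaling protocols that may not deliver DNIS at all, such as analog lines.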

As the following table shows, most of the common signaling protocols provide ANI and DNIS, but some do not.

Call Information Support

Signaling protocol	ANI	DNIS
Analog	Yes, however the analog line must be provisioned with caller ID. This should be done on the PBX, or by the carrier (if direct lines come from the carrier).	No. This is not available for analog lines.
T1 CAS	This may be available. In many cases, CAS does not provide any ANI information. However, some switches/PBXs can be configured to provide this information.	This may be available. In many cases, CAS does not provide any DNIS information. However, some switches/PBXs can be configured to provide this information.
ISDN PRI	Yes	Yes

Call Transfer Support

Similar to support for ANI and DNIS, different signaling protocols support transfers in a variety of ways. Each type of transfer may also require differing numbers of physical channels to enable the transfer.

Call transfer support is summarized in the following table. Because the TIM may not support all of these possibilities, please consult your TIM documentation or TIM vendor to understand the supported transfer types for your particular TIM implementation.

Call Transfer Support

Signaling protocol	Blind Transfer	Bridged consultation (supervised) transfer	Consultation (supervised) transfer
Analog	Yes. This is also known as a hook-flash transfer.	Yes. Ties up 2 channels during and after the transfer.	No
T1 CAS	Yes	Yes. Ties up 2 channels during and after the transfer.	Maybe. This is possible, but potentially difficult to configure on the PBX.
ISDN PRI	Maybe. AT&T T (or 8) transfer for 1-800 services. This may also be available with other carriers.	Yes. Ties up 2 channels during and after the transfer.	Maybe. Supported through 2B channel transfers (TBCT). See TBCT definition.

Consultation Calls, Supervised Transfers, and Conferences

To illustrate the use of extended call control, let’s look at a common scenario found in many call centers. Many traditional or speech-enabled IVR applications deployed in call centers require the ability to transfer calls and create conferences with multiple parties. Companies typically prefer supervised transfers (transfers where the transferring party maintains control over the transfer until it is complete) over blind transfers (transfers where the transferring party has no control over whether the transfer succeeds) because supervised transfers allow the transferring party to ensure that a real person (or at least working voice mail) answers the call on the other end. Blind transfers can be frustrating to the initial caller.

In an MSS deployment, a blind transfer releases a call from the TAS and transfers the call irrespective of the state of the called party. The SALT interpreter is not involved in the call during or after the blind transfer. As previously mentioned, the basic TransferCall control provided with the SASDK provides blind transfer capability through the use of the CSTA SingleStepTransfer service. This type of transfer is commonly used within call centers where calls are transferred directly to a queue on the Automatic Call Distributor (ACD) or phone switch.

Unlike a blind transfer, a supervised transfer allows the Speech Server-based application to continue to monitor the call progress during the transfer. This enables the application to decide, for example, whether to place another call transfer to a different destination if the original transfer destination is busy or fails to answer.

The first step to supervised transfers and conferences is the consultation call. A supervised transfer is achieved through a CSTA Consultation Call service request and is completed with a Transfer Call service request. Similarly, a three-way conference call is also initiated through a Consultation Call service request and is completed with a Conference Call service request.

A consultation call is a compound service request that places the original call on hold and then (typically on a separate channel) issues an outbound call to the consulted party. Once an active connection is established with the consulted party, the Consultation Call service request completes. Generally, two physical channels are required for a consultation call: one for the held party and one for the consulted party. Figure 4 shows the connection states before a consultation call is issued and after it completes.

Figure 4. Consultation call connection states

In Figure 4 above:

D1 is the consulting device. In this example, D1 is a speech application attached to a local TIM channel resource
D2 is the original calling party (device) into the MSS system
D3 is the consulted party (device), the target party for the transfer or conference

Before the consultation call is issued, the calling party (D2) is involved in an active call (C1) with the speech application, which is running inside TAS. During the consultation call, the original caller is placed on hold at the consulting device (D1), and a new call (C2) is initiated to the consulted device (D3). After the consultation call is connected, the consultation call service request completes, and the new call (C2) is in an active connected state between the consulting device (D1) and the consulted device (D3).

Following a consultation call, as shown in Figure 5 below, the application has three completion options: transfer, conference, or reconnect.

Figure 5. Supervised transfers and conference calls with CSTA

In Figure 5 above:

TransferCall transfers the held party to the consulted party
ConferenceCall joins the held party, the consulted party, and the speech application in a three-way conference call. This service is commonly used for virtual assistant-type scenarios.
ReconnectCall is another compound service that disconnects the consulted party connection and retrieves the held party back into an active state (with the speech application)

Figure 5 also shows that in the case of a Failed Event, a ReconnectCall is also explicitly required to return the held call back to an active call state with the speech application. Some TIM implementations may automatically clear the consultation and retrieve the held call in the event of a Failed Event.
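
The decision described for Figure 5 can be summarized as a small function. This Python sketch is an illustration only: the function name and the `keep_app_on_call` flag are assumptions, and a TIM that reconnects the held call automatically would make the explicit ReconnectCall unnecessary.

```python
def next_service(event_name: str, keep_app_on_call: bool) -> str:
    """Choose the CSTA service to request after a consultation call event."""
    if event_name == "FailedEvent":
        # The consultation failed: bring the held party back to an active state.
        return "ReconnectCall"
    if event_name == "EstablishedEvent":
        # The consulted party answered: conference keeps the application on the call.
        return "ConferenceCall" if keep_app_on_call else "TransferCall"
    return "wait"  # any other event: keep waiting

print(next_service("EstablishedEvent", keep_app_on_call=False))
print(next_service("FailedEvent", keep_app_on_call=False))
```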

CSTA XML Flow for Consultation Calls, Supervised Transfers, and Conferences

This section describes generic call control using CSTA XML messages for service requests, responses, and events. Examples of CSTA XML flow are provided for consultation calls, transfer calls, conference calls, and reconnect calls.

CSTA for Consultation Calls

As previously mentioned, a consultation call is the first step for other extended call control functions like supervised transfers and conference calls. To create a consultation call, the speech application requests that the TIM put the calling party on hold and place an outbound call to the consulted party by requesting the CSTA <ConsultationCall> service.

Consultation Call Request

Speech application → TIM

Note: Some parts of the following code snippet are split across multiple lines for readability only. Enter each of them on a single line.

<ConsultationCall xmlns="https://www.ecma.ch/standards/ecma-323/csta/ed2">

    <existingCall>

        <callID>123</callID>

        <deviceID>0</deviceID>

    </existingCall>

    <consultedDevice>5551234</consultedDevice>

    <extensions>

        <privateData>

            <private xmlns:pri="https://schemas.microsoft.com/speech/

            2003/08/CSTAPrivateData">

                <pri:setOutboundCallingDevice>12345</pri:setOutboundCallingDevice>

                <pri:setCallAnalysis>true</pri:setCallAnalysis>

            </private>

        </privateData>

    </extensions>

</ConsultationCall>

Here the <existingCall> element represents the connectionID⁵ of the current call in progress (D1C1), while the <consultedDevice> element represents the device ID of the consulted party (D3). The above CSTA XML code puts the existing call on hold, then places an outbound phone call to the consulted device (that is, 555-1234).

The above <ConsultationCall> request also contains an example of optional extensions that the TIM may support⁶. In this example, through the private data extensions, the application specifies:

<setOutboundCallingDevice> sets the ANI (callingDevice) for the outbound consultation call leg. This is used when an application wants to set the caller ID to display on the consulted party’s phone, where the ANI would be set to the held party’s ANI (that is, the original callingDevice ID, D2⁷).
<setCallAnalysis> sets whether call analysis should be run by the TIM. Call analysis will attempt to determine what answers the call, whether it be a human, voicemail, a FAX machine, or another device. If this optional extension is missing, it is expected that TIMs will use whatever default (or platform configuration setting) is set.
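
For illustration, the core of the request above can be assembled programmatically. This Python sketch uses only the standard library. The function name is hypothetical, the optional private data extensions are omitted, and the namespace string is copied from the listing above.

```python
import xml.etree.ElementTree as ET

# Namespace copied from the request listing above.
CSTA_NS = "https://www.ecma.ch/standards/ecma-323/csta/ed2"

def build_consultation_call(call_id: str, device_id: str, consulted_device: str) -> str:
    """Build a ConsultationCall request without the optional extensions."""
    # Writing xmlns as a plain attribute makes it the default namespace on output.
    root = ET.Element("ConsultationCall", {"xmlns": CSTA_NS})
    existing = ET.SubElement(root, "existingCall")  # connectionID of the current call
    ET.SubElement(existing, "callID").text = call_id
    ET.SubElement(existing, "deviceID").text = device_id
    ET.SubElement(root, "consultedDevice").text = consulted_device
    return ET.tostring(root, encoding="unicode")

request_xml = build_consultation_call("123", "0", "5551234")
print(request_xml)
```

The resulting string is what a SmexMessage-style control would send as its message body.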

In response to a successful <ConsultationCall> request, the TIM automatically sends the speech application a simple response as shown below:

Consultation Call Response

TIM → Speech application

<ConsultationCallResponse xmlns="https://www.ecma.ch/standards/ecma-323/csta/ed2">

    <initiatedCall>

        <callID>127</callID>

        <deviceID>0</deviceID>

    </initiatedCall>

</ConsultationCallResponse>

The CSTA response above is a simple final acknowledgement of the success of the service request. Before this response is issued, however, other events (as discussed below) communicate the detailed progress of the request. The <ConsultationCallResponse> service response returns a new connectionID (represented by the <initiatedCall> element) for the outbound call to be placed to the consulted party (D1C2).
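
An application needs the new connectionID for later TransferCall, ConferenceCall, or ReconnectCall requests. The sketch below shows one way to extract it in Python. The function name is hypothetical, and the sample strings are compact copies of messages of the kind shown in this section.

```python
import xml.etree.ElementTree as ET

CSTA_NS = "https://www.ecma.ch/standards/ecma-323/csta/ed2"

def initiated_connection(response_xml: str):
    """Return (callID, deviceID) from <initiatedCall>, or None if it is absent."""
    root = ET.fromstring(response_xml)
    node = root.find(f"{{{CSTA_NS}}}initiatedCall")
    if node is None:  # for example, the TIM returned an error message instead
        return None
    return (node.findtext(f"{{{CSTA_NS}}}callID"),
            node.findtext(f"{{{CSTA_NS}}}deviceID"))

ok = (f'<ConsultationCallResponse xmlns="{CSTA_NS}"><initiatedCall>'
      '<callID>127</callID><deviceID>0</deviceID>'
      '</initiatedCall></ConsultationCallResponse>')
err = f'<CSTAErrorCode xmlns="{CSTA_NS}"><operation>generic</operation></CSTAErrorCode>'

print(initiated_connection(ok), initiated_connection(err))
```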

If the Consultation Call service request fails, the response can be a CSTA error. For example, a failure can occur if the TIM has not been configured with any outbound channels, or if there are no free channels available to place the outbound call, as shown in the following example:

Error Response

TIM → Speech application

<CSTAErrorCode xmlns="https://www.ecma.ch/standards/ecma-323/csta/ed2">

    <systemResourceAvailibility>resourceLimitExceeded</systemResourceAvailibility>

</CSTAErrorCode>

As part of the consultation request, before the consultation call is successfully completed, the TIM holds the existing call. Once the call is held, the TIM sends the speech application a <HeldEvent> event as shown below:

Held Event

TIM → Speech application

<HeldEvent xmlns="https://www.ecma.ch/standards/ecma-323/csta/ed2">

   <monitorCrossRefID>1234</monitorCrossRefID>

   <heldConnection>

       <callID>123</callID>

       <deviceID>0</deviceID>    

   </heldConnection>

   <holdingDevice>

       <deviceIdentifier>0</deviceIdentifier>

   </holdingDevice>

   <cause>consultation</cause>

</HeldEvent>

This event indicates a successfully held call and references the previously existing call connectionID. As shown in the XML code above, the event also recognizes that the hold is due to a consultation request.

After the held event has been sent, the TIM places an outbound call to the consultedDevice as specified in the <ConsultationCall> request discussed earlier. A series of events (identical to those generated by a MakeCall service request) are generated:

Service Initiated (optional). A telephony service has been initiated at a monitored device (D1)
Originated (required). The calling device (associated with the speech application), D1, is connected to the call
Network Reached (required). The call has reached the network (Central Office (CO) line, trunk, etc.)
Delivered (optional). The called device is alerting the called party. (This event occurs only if it is supported by the underlying signaling)
Established (required). The called party has answered the call.

The speech application is likely only interested in the <EstablishedEvent> and <FailedEvent> events. Shown below, the <EstablishedEvent> event is similar to the event received for an incoming call, with subtle differences in the event arguments:

Established Event

TIM → Speech application

<EstablishedEvent xmlns="https://www.ecma.ch/standards/ecma-323/csta/ed2">

    <monitorCrossRefID>1234</monitorCrossRefID>

    <establishedConnection>

        <callID>127</callID>

        <deviceID>5551234</deviceID>

    </establishedConnection>

    <answeringDevice>

        <deviceIdentifier>5551234</deviceIdentifier>

    </answeringDevice>

    <callingDevice>

        <deviceIdentifier>0</deviceIdentifier>

    </callingDevice>

    <calledDevice>

        <deviceIdentifier>5551234</deviceIdentifier>

    </calledDevice>

    <lastRedirectionDevice>

        <notKnown/>

    </lastRedirectionDevice>

    <localConnectionInfo>connected</localConnectionInfo>

    <cause>networkSignal</cause>

    <associatedCalledDevice>

        <deviceIdentifier>27</deviceIdentifier>

    </associatedCalledDevice>

    <extensions>

        <privateData>

            <private xmlns:ca="https://schemas.microsoft.com/

            speech/2003/08/CSTAPrivateData">

                <ca:callAnalysis>notKnown</ca:callAnalysis>

            </private>

        </privateData>

    </extensions>

</EstablishedEvent>

In the above CSTA XML code <associatedCalledDevice> describes the channel on which the outbound call was made and is mandatory for outbound calls that are leaving the switching subdomain.

For outbound calls (a MakeCall), the TIM may detect what type of device is answering the call (for example, human, voicemail, FAX). In this case, this information is provided in the <extensions> field of the <EstablishedEvent>⁸. In the example above, the TIM is unable to identify the device and returns “notKnown” inside the <callAnalysis> element⁹.
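
A handler can read the call analysis result from the private data extension. The following is a minimal Python sketch, assuming the private data namespace shown in the listing above once it is joined onto a single line. The function name is hypothetical, and the sample event is trimmed to the relevant elements.

```python
import xml.etree.ElementTree as ET

CSTA_NS = "https://www.ecma.ch/standards/ecma-323/csta/ed2"
# Private data namespace from the listing, joined onto a single line.
PRIVATE_NS = "https://schemas.microsoft.com/speech/2003/08/CSTAPrivateData"

def call_analysis(event_xml: str):
    """Return the callAnalysis text, or None when the extension is missing."""
    node = ET.fromstring(event_xml).find(f".//{{{PRIVATE_NS}}}callAnalysis")
    return node.text if node is not None else None

with_ext = (f'<EstablishedEvent xmlns="{CSTA_NS}"><extensions><privateData>'
            f'<private xmlns:ca="{PRIVATE_NS}">'
            '<ca:callAnalysis>notKnown</ca:callAnalysis>'
            '</private></privateData></extensions></EstablishedEvent>')
without_ext = f'<EstablishedEvent xmlns="{CSTA_NS}"><cause>networkSignal</cause></EstablishedEvent>'

print(call_analysis(with_ext), call_analysis(without_ext))
```

Returning None for a missing extension lets the caller treat TIMs that do not report call analysis the same way as an undetermined result.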

Another event that the TIM could send to the speech application is the <FailedEvent> event (see Figure 5 above). This event contains the cause of the failure, an example of which is shown below:

Failed Event

TIM → Speech application

<FailedEvent xmlns="https://www.ecma.ch/standards/ecma-323/csta/ed2">

    <monitorCrossRefID>1234</monitorCrossRefID>

    <failedConnection>

        <callID>130</callID>

        <deviceID>5551234</deviceID>

    </failedConnection>

    <failingDevice>

        <deviceIdentifier>5551234</deviceIdentifier>

    </failingDevice>

    <callingDevice>

        <deviceIdentifier>0</deviceIdentifier>

    </callingDevice>

    <calledDevice>

        <deviceIdentifier>5551234</deviceIdentifier>

    </calledDevice>

    <lastRedirectionDevice>

        <notKnown/>

    </lastRedirectionDevice>

    <localConnectionInfo>connected</localConnectionInfo>

    <cause>busy</cause>

    <associatedCalledDevice>

        <deviceIdentifier>27</deviceIdentifier>

    </associatedCalledDevice>

</FailedEvent>

In the above example, the failed event was caused by the called device returning a busy signal. A variety of other failures could occur: for example, the held or consulted party disconnects, the call isn’t answered, or no ports are available on which to place the outbound call¹⁰.

Note: See ECMA Technical Report TR/82 Scenarios for Computer Supported Telecommunications Applications (CSTA) Phase III for consultation call example scenarios that require application support, such as:

Consultation call service negative acknowledgement (for example, no ports available on which to place the outbound consultation call)
Consulted party is busy
Consulted party disconnects
Held party disconnects

Media operations during consultation calls

Typically, a consultation call involves two physical channels: one for the original (held) call and one for the consultation (outbound) call. In Speech Server, the speech application can listen for input (speech and Dual Tone Multi-Frequency (DTMF)) and play media on only one channel at a time.

During a consultation call, the application can listen to and play media on the original call channel until the consulted party answers. At that point, any existing media operations attached to the original call channel are terminated, and any new listens or plays will be attached to the consultation call channel.

Far-end disconnect during consultation calls

At any time during the consultation call, the original (held) party may hang up. Likewise, after the consultation call completes, either the original (held) party or the consulted party may hang up. When this happens, the TIM sends a Connection Cleared event that indicates which connection leg has been disconnected. For example:

Connection Cleared Event

TIM → Speech application

<ConnectionClearedEvent xmlns="https://www.ecma.ch/standards/ecma-323/csta/ed2">

    <monitorCrossRefID>1234</monitorCrossRefID>

    <droppedConnection>

        <callID>127</callID>

        <deviceID>0</deviceID>

    </droppedConnection>

    <releasingDevice>

        <deviceIdentifier>0</deviceIdentifier>

    </releasingDevice>

    <localConnectionInfo>null</localConnectionInfo>

    <cause>normalClearing</cause>

</ConnectionClearedEvent>

The above indicates that the local connection (D1C2 -- which TAS is monitoring) for the outbound consultation connection has been disconnected and is now in the null state, as represented in the <localConnectionInfo> element. TAS may also receive a <ConnectionClearedEvent> for the far-end disconnecting leg, D3C2.
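
To react correctly, the application has to work out which leg the event refers to. The Python sketch below compares the dropped connection's callID with the two call IDs the application already tracks. The function name and return labels are hypothetical, and the sample event is trimmed.

```python
import xml.etree.ElementTree as ET

CSTA_NS = "https://www.ecma.ch/standards/ecma-323/csta/ed2"

def cleared_leg(event_xml: str, held_call_id: str, consultation_call_id: str) -> str:
    """Classify a ConnectionClearedEvent as the held or the consultation leg."""
    root = ET.fromstring(event_xml)
    call_id = root.findtext(f"{{{CSTA_NS}}}droppedConnection/{{{CSTA_NS}}}callID")
    if call_id == held_call_id:
        return "held"
    if call_id == consultation_call_id:
        return "consultation"
    return "unknown"

cleared = (f'<ConnectionClearedEvent xmlns="{CSTA_NS}"><droppedConnection>'
           '<callID>127</callID><deviceID>0</deviceID></droppedConnection>'
           '<cause>normalClearing</cause></ConnectionClearedEvent>')

print(cleared_leg(cleared, held_call_id="123", consultation_call_id="127"))
```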

The speech application may need to use a <ReconnectCall> to bring the held party (D2) back on to an active call with the speech application, but it is expected that the TIM will do this automatically¹¹. If D1C1 (the original party) disconnects, control is returned to RunSpeech, and RunSpeech automatically handles cleanup and resets the TAS session.

Now that we’ve seen how consultation calls work, let’s take a look at two other services that extend from it: supervised transfers and conference calls.

CSTA for Supervised Transfers

A supervised transfer allows the speech application to continue to monitor the call progress during the transfer. This enables the application to decide whether to place another call transfer to a different destination if the original transfer destination is busy. Let’s look at the call progression for a supervised transfer.

Figure 6. Supervised transfer call connection states

Once the consultation call has completed successfully (the original party is held and the consulted party is connected, as shown in the Before section of Figure 6 above), the speech application can request the <TransferCall> service to connect the held call with the outbound (consulted) call, using the connectionIDs for both the held call and the active call (as shown in the After section of Figure 6).

Transfer Call Request

Speech application → TIM

<TransferCall xmlns="https://www.ecma.ch/standards/ecma-323/csta/ed2">

    <heldCall>

        <callID>123</callID>

        <deviceID>0</deviceID>

    </heldCall>

    <activeCall>

        <callID>127</callID>

        <deviceID>0</deviceID>

    </activeCall>

</TransferCall>
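
The request above combines the two connectionIDs gathered during the consultation call. A minimal Python sketch of assembling it follows. The function name is hypothetical, and each connectionID is passed as a (callID, deviceID) pair.

```python
import xml.etree.ElementTree as ET

CSTA_NS = "https://www.ecma.ch/standards/ecma-323/csta/ed2"

def build_transfer_call(held, active) -> str:
    """Build a TransferCall request from two (callID, deviceID) pairs."""
    root = ET.Element("TransferCall", {"xmlns": CSTA_NS})
    for name, (call_id, device_id) in (("heldCall", held), ("activeCall", active)):
        node = ET.SubElement(root, name)
        ET.SubElement(node, "callID").text = call_id
        ET.SubElement(node, "deviceID").text = device_id
    return ET.tostring(root, encoding="unicode")

# Held call 123 came from the original request; active call 127 from the response.
transfer_xml = build_transfer_call(held=("123", "0"), active=("127", "0"))
print(transfer_xml)
```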

The TIM acknowledges the request with a response containing the transferred-to connectionID (expressed by the combination of the <callID> and <deviceID> element values):

Transfer Call Response

TIM → Speech application

<TransferCallResponse xmlns="https://www.ecma.ch/standards/ecma-323/csta/ed2">

    <transferredCall>

        <callID>128</callID>

        <deviceID>0</deviceID>

    </transferredCall>

</TransferCallResponse>

When the call has been successfully transferred, the TIM sends a <TransferredEvent> to the speech application. As shown below, this event also indicates that the held connection and the consulted connection are now cleared:

Transferred Event

TIM ? Speech application

<TransferredEvent xmlns="https://www.ecma.ch/standards/ecma-323/csta/ed2">

    <monitorCrossRefID>1234</monitorCrossRefID>

    <primaryOldCall>

        <callID>123</callID>

        <deviceID>0</deviceID>

    </primaryOldCall>

    <secondaryOldCall>

        <callID>127</callID>

        <deviceID>0</deviceID>

    </secondaryOldCall>

    <transferringDevice>

        <deviceIdentifier>0</deviceIdentifier>

    </transferringDevice>

    <transferredToDevice>

        <deviceIdentifier>5551234</deviceIdentifier>

    </transferredToDevice>

    <transferredConnections>

        <connectionListItem>

            <oldConnection>

                <callID>123</callID>

                <deviceID>0</deviceID>

            </oldConnection>

        </connectionListItem>

    </transferredConnections>

    <localConnectionInfo>null</localConnectionInfo>

    <cause>transfer</cause>

</TransferredEvent>

Because the <TransferredEvent> event implies that the active connection has been cleared, there is no ConnectionCleared event for the active (consulted) connection. The speech application should be designed to treat the Transferred event as a ConnectionCleared event.
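One way to apply this guidance in a smex receive handler is to classify incoming event names before dispatching on them. The following is a minimal sketch rather than SASDK code: the helper name impliesConnectionCleared is hypothetical, and the event name is assumed to be the root element name of the received CSTA message (the value the sample handlers later in this paper read from docEl.baseName). Both spellings of the transfer event are accepted defensively.

```javascript
// Hypothetical helper (not part of the SASDK): returns true when a CSTA
// event name should be handled as if the application's connection had
// been cleared.
function impliesConnectionCleared(eventName) {
    switch (eventName) {
        case "ConnectionClearedEvent":
        case "TransferedEvent":   // single 'r', as used by the sample controls
        case "TransferredEvent":  // accepted defensively
            return true;
        default:
            return false;
    }
}
```

A receive handler could call this helper first and run its normal connection-cleared cleanup whenever it returns true.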

Note: If the TIM is configured to bridge (or hairpin) the supervised transfer, then the transfer ties up two physical channel resources on the TIM. These channel resources only become free when the remote parties (original and consulted parties) disconnect, or when a configurable, TIM-defined timer (for a bridged transfer) expires.

At this point, the speech application is neither attached to an active call nor is it a participant in the new connection between the original party and the consulted party, and thus it has no control over the call. To have the speech application remain on the call, the application developer should use the Conference Call service rather than the Transfer Call service.

CSTA for Conference Calls

As shown in Figure 5 previously, a conference call begins with a consultation call (just as with a supervised transfer). A three-way conference call is supported by bridging the consultation call. This section assumes that a (bridged) consultation call has completed as described above.

Figure 7. Conference call connection states

Once the consultation call completes (the original party is held and the consulted party is connected), the speech application can request the <ConferenceCall> service to initiate a three-way conference call among the held party, the outbound (consulted) party, and the speech application itself, as shown in the After section of Figure 7. The request carries the connectionIDs for both the held call and the active call. Once the conference is established, the speech application remains on the new call as a silent participant, listening to the original party.

The conference call request looks almost identical to the TransferCall request.

Conference Call Request

Speech application → TIM

<ConferenceCall xmlns="https://www.ecma.ch/standards/ecma-323/csta/ed2">

    <heldCall>

        <callID>123</callID>

        <deviceID>0</deviceID>

    </heldCall>

    <activeCall>

        <callID>127</callID>

        <deviceID>0</deviceID>

    </activeCall>

</ConferenceCall>

And once again, there is a simple response.

Conference Call Response

TIM → Speech application

<ConferenceCallResponse xmlns="https://www.ecma.ch/standards/ecma-323/csta/ed2">

    <conferenceCall>

        <callID>140</callID>

        <deviceID>0</deviceID>

    </conferenceCall>

</ConferenceCallResponse>

Once the conference has been completed, the TIM indicates this by sending the <ConferencedEvent> to the speech application. The event, shown below, contains many details about the old calls and lists the new connections on the conference call.

Conferenced Event

TIM → Speech application

<ConferencedEvent xmlns="https://www.ecma.ch/standards/ecma-323/csta/ed2">

    <monitorCrossRefID>1234</monitorCrossRefID>

    <primaryOldCall>

        <callID>123</callID>

        <deviceID>0</deviceID>

    </primaryOldCall>

    <secondaryOldCall>

        <callID>127</callID>

        <deviceID>0</deviceID>

    </secondaryOldCall>

    <conferencingDevice>

        <deviceIdentifier>0</deviceIdentifier>

    </conferencingDevice>

    <addedParty>

        <deviceIdentifier>5551234</deviceIdentifier>

    </addedParty>

    <conferenceConnections>

        <connectionListItem>

             <newConnection>

                <callID>140</callID>

                <deviceID>0</deviceID>

            </newConnection>

        </connectionListItem>

        <connectionListItem>

             <newConnection>

                <callID>140</callID>

                <deviceID>14085551212</deviceID>

            </newConnection>

        </connectionListItem>

        <connectionListItem>

             <newConnection>

                <callID>140</callID>

                <deviceID>5551234</deviceID>

            </newConnection>

        </connectionListItem>

    </conferenceConnections>

    <localConnectionInfo>connected</localConnectionInfo>

    <cause>conference</cause>

</ConferencedEvent>

A new call is created, C3 (the callID of 140 in the example above), with the three devices or parties listed. The first is the TAS “channel,” followed by the original and the consulted parties.

Media operations during conference calls

As previously mentioned, in MSS the speech application can only listen to input (speech and DTMF) and play media on one channel at a time. TAS remains a silent participant on the new conference call, listening to audio input and DTMF input from the original calling party. This allows the speech application to listen for a key (hot) phrase or a DTMF sequence that could trigger the application to disconnect the consulted party call leg and revert to a two-way call between the original party and the speech application.

Detecting dropped connections on a conference call

Once the conference call is established, either the original party or the consulted party could disconnect from the call. Either case is communicated to the speech application as a Connection Cleared event. In order for RunSpeech to automatically detect when the original caller disconnects (and take the appropriate action to clean up and call window.close() to reset the TAS session), the CallInfo object information must be updated once the ConferencedEvent is received, since the original caller connection is now D1C3, and not D1C1 (see Figure 7).

Dropping the consulted party on a conference call

The speech application can drop a remote party from the conference (thereby reverting it to a two-party call) by issuing a CSTA Clear Connection request that specifies the connection to be cleared. For example:

Clear Connection Request

Speech application → TIM

<ClearConnection xmlns="https://www.ecma.ch/standards/ecma-323/csta/ed2">

    <connectionToBeCleared>

        <callID>140</callID>

        <deviceID>14085551212</deviceID>

    </connectionToBeCleared>

</ClearConnection>

Following this request, the application would receive the usual <ClearConnectionResponse> followed by a <ConnectionClearedEvent>. Note that this relies on the availability of remote party information.

The TIM implementation may instead require that the ClearConnection request specify the channel that the call is on in order to clear it¹². For example:

Clear Connection Request

Speech application → TIM

<ClearConnection xmlns="https://www.ecma.ch/standards/ecma-323/csta/ed2">

    <connectionToBeCleared>

        <callID>140</callID>

        <deviceID typeOfNumber="deviceNumber">17</deviceID>

    </connectionToBeCleared>

</ClearConnection>
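Both request variants above differ only in the <deviceID> element, so a single helper could build either one. The following is a minimal sketch under stated assumptions: the function name buildClearConnection and its parameters are hypothetical (not part of the SASDK), the namespace string is copied from the examples above, and the arguments are assumed to be plain digits that need no XML escaping.

```javascript
// Hypothetical helper (not part of the SASDK): builds a CSTA
// ClearConnection request as a string. Pass typeOfNumber (for example,
// "deviceNumber") only if the TIM expects a channel number in <deviceID>.
function buildClearConnection(callID, deviceID, typeOfNumber) {
    var attr = typeOfNumber ? " typeOfNumber=\"" + typeOfNumber + "\"" : "";
    return (
        "<ClearConnection xmlns=\"https://www.ecma.ch/standards/ecma-323/csta/ed2\">" +
        "<connectionToBeCleared>" +
        "<callID>" + callID + "</callID>" +
        "<deviceID" + attr + ">" + deviceID + "</deviceID>" +
        "</connectionToBeCleared>" +
        "</ClearConnection>"
    );
}
```

For instance, buildClearConnection(140, "14085551212") corresponds to the first request above, and buildClearConnection(140, 17, "deviceNumber") corresponds to the second. Which form a given TIM accepts has to be confirmed with the TIM vendor.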

CSTA for Reconnect Calls

Reconnect is generally used when there was some problem in setting up the consultation call, for example, when the attempt to place the outbound call to the consulted party results in a <FailedEvent> event (see Figure 5). The entire process includes the request, response and then two events, one to signal that the call to the consulted device has been cleared and one to signal that the connection with the calling party has been reestablished.

Figure 8. Reconnect call connection states

The <ReconnectCall> service can be used anytime after receiving the <ConsultationCallResponse> to clear the active (consulted) connection and retrieve the held (original) connection, as shown in Figure 8. The sample CSTA request and response follow:

Reconnect Call Request

Speech application → TIM

<ReconnectCall xmlns="https://www.ecma.ch/standards/ecma-323/csta/ed2">

    <activeCall>

        <callID>127</callID>

        <deviceID>0</deviceID>

    </activeCall>

    <heldCall>

        <callID>123</callID>

        <deviceID>0</deviceID>

    </heldCall>

</ReconnectCall>

<activeCall> represents the active (consulted) call connection (D1C2)
<heldCall> represents the held or original call connection (D1C1)

The TIM then provides a simple response back to the application:

Reconnect Call Response

TIM → Speech application

<ReconnectCallResponse xmlns="https://www.ecma.ch/standards/ecma-323/csta/ed2"/>

Following a successful request to reconnect the call, and once the D1C2 consulted connection has been cleared, the TIM sends a <ConnectionClearedEvent>, shown below, to the speech application.

Connection Cleared Event

TIM → Speech application

<ConnectionClearedEvent xmlns="https://www.ecma.ch/standards/ecma-323/csta/ed2">

    <monitorCrossRefID>1234</monitorCrossRefID>

    <droppedConnection>

        <callID>127</callID>

        <deviceID>0</deviceID>

    </droppedConnection>

    <releasingDevice>

        <deviceIdentifier>0</deviceIdentifier>

    </releasingDevice>

    <localConnectionInfo>null</localConnectionInfo>

    <cause>normalClearing</cause>

</ConnectionClearedEvent>

In the above example:

droppedConnection contains the connection identifier of the connection that has been cleared (D1C2)
releasingDevice provides the endpoint identifier of the releasing device (D1)
localConnectionInfo indicates the connection state of “null” at the near-end connection. This is typically the connection that is associated with the monitored device for TAS

The TIM then automatically retrieves the held call into an active state and sends a <RetrievedEvent> to the speech application:

Retrieved Event

TIM → Speech application

<RetrievedEvent xmlns="https://www.ecma.ch/standards/ecma-323/csta/ed2">

    <monitorCrossRefID>1234</monitorCrossRefID>

    <retrievedConnection>

        <callID>123</callID>

        <deviceID>0</deviceID>

    </retrievedConnection>

    <retrievingDevice>

        <deviceIdentifier>0</deviceIdentifier>

    </retrievingDevice>

    <localConnectionInfo>connected</localConnectionInfo>

    <cause>normal</cause>

</RetrievedEvent>

At this point, the original connection (D1C1) is returned to the active state and the dialogue interaction, as scripted by the speech application, can resume with the original party (D2).
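The reconnect sequence described above completes only after two events have arrived: a ConnectionClearedEvent for the consulted connection and a RetrievedEvent for the held connection. The following is a minimal sketch of how an application might track that pair. The name createReconnectTracker is hypothetical, and the sketch assumes the caller has already extracted the root element name and the callID (from <droppedConnection> or <retrievedConnection>) of each received message.

```javascript
// Hypothetical tracker (not part of the SASDK) for the two events that
// follow a successful ReconnectCall request.
function createReconnectTracker(consultedCallID, heldCallID) {
    var consultedCleared = false;
    var heldRetrieved = false;
    return {
        // eventName: root element name of the CSTA message.
        // callID: callID carried in droppedConnection or retrievedConnection.
        // Returns true once both expected events have been seen.
        onEvent: function (eventName, callID) {
            if (eventName == "ConnectionClearedEvent" && callID == consultedCallID) {
                consultedCleared = true;
            } else if (eventName == "RetrievedEvent" && callID == heldCallID) {
                heldRetrieved = true;
            }
            return consultedCleared && heldRetrieved;
        }
    };
}
```

With the call IDs from the examples above, a tracker created as createReconnectTracker("127", "123") reports completion only after both the cleared event for call 127 and the retrieved event for call 123 have been passed to onEvent.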

Sample Code for Consultation Calls, Supervised Transfers and Conferences

Now that we have seen the underlying communications messages, let’s look at some sample code that might be written by a developer using the SASDK to create the CSTA requests and handle the CSTA events shown above.

Supervised Transfer Call Control

Each of the following three custom controls references another custom control called the SupervisedTransferControl (the code for which is shown in Appendix A). The SupervisedTransferControl implements a supervised transfer and returns event and error codes to the calling function. The following code segment demonstrates its use on an ASP.NET (.aspx) page:

<STC:SupervisedTransferControl id="SupervisedTransferControl1" runat="server"

    TransferToNum="5551234"

    ClientActivationFunction="SupervisedTransferControl1_ClientActivationFunction"

    OnClientFailure="SupervisedTransferControl1_OnClientFailure"

    OnClientTransfered="SupervisedTransferControl1_OnClientTransfered"/>

For more information about this custom call control, see “How do I build a custom control for more advanced call control functions?” in the Microsoft Speech Tips and Tricks at https://www.microsoft.com/speech/techinfo/tipsandtricks.

ConsultationMessage Control

The following code sample defines a new SmexMessage control and some event handlers to hook in to this control to send the ConsultationCall request and to process the CSTA events that the TIM returns.

Note: Some parts of the following code snippet have been displayed in multiple lines only for better readability. These should be entered in a single line.

<speech:SmexMessage id="ConsultationMessage" runat="server"

    OnClientBeforeSend="ConsultationMessage_OnClientBeforeSend"

    OnClientError="ConsultationMessage_OnClientError"

    OnClientReceive="ConsultationMessage_OnClientReceive"

    ClientActivationFunction="ConsultationMessage_ClientActivationFunction">

</speech:SmexMessage>



function ConsultationMessage_ClientActivationFunction() {
    return (SupervisedTransferControl_CurrentStage == "ConsultationMessage");
}



function ConsultationMessage_OnClientBeforeSend() {
    // An empty TransferToNum string compares equal to 0 in script.
    if (SupervisedTransferControl_TransferToNum == 0)
    {
        throw("TransferToNum must be specified");
    }
    // Strip everything except digits from the number to dial.
    var numberToDial = SupervisedTransferControl_TransferToNum.replace(/[^\d]/g, "");

    var callID = RunSpeech.CurrentCall().Get("CallID");
    var deviceID = RunSpeech.CurrentCall().Get("DeviceID");
    var outboundCallingDevice = SupervisedTransferControl_CallerIDNum;
    if (outboundCallingDevice == "")
        outboundCallingDevice = "123456";

    return(
    "<ConsultationCall xmlns=\"https://www.ecma.ch/standards/ecma-323/csta/ed2\">" +
    "    <existingCall>" +
    "        <callID>" + callID + "</callID>" +
    "        <deviceID>" + deviceID + "</deviceID>" +
    "    </existingCall>" +
    "    <consultedDevice>" + numberToDial + "</consultedDevice>" +
    "    <extensions>" +
    "        <privateData>" +
    "            <private xmlns:pri=\"https://schemas.microsoft.com/speech/2003/08/CSTAPrivateData\">" +
    "                <pri:setOutboundCallingDevice>" + outboundCallingDevice + "</pri:setOutboundCallingDevice>" +
    "                <pri:setCallAnalysis>false</pri:setCallAnalysis>" +
    "            </private>" +
    "        </privateData>" +
    "    </extensions>" +
    "</ConsultationCall>"
    );
}

function ConsultationMessage_OnClientError() {

    SupervisedTransferControl_FailureCause = "SmexError";

    SupervisedTransferControl_CurrentStage = "ReconnectMessage";

}



function ConsultationMessage_OnClientReceive(smexObj, docEl) {
    switch(docEl.baseName) {
        case "ConsultationCallResponse":
            // Store the new call ID for use in either the Transfer or the
            // Reconnect messages.
            SupervisedTransferControl_NewCallId = getNodeText(docEl.selectSingleNode("/csta:ConsultationCallResponse/csta:initiatedCall/csta:callID"));
            break;

        case "HeldEvent":
            // We set a flag saying that the call has been held, so that we know
            // if it makes sense to attempt a reconnect following a failure.
            SupervisedTransferControl_CallWasHeld = true;
            break;

        case "DeliveredEvent":
            break;

        case "NetworkReachedEvent":
            break;

        case "EstablishedEvent":
            // EstablishedEvent means we have successfully connected to the
            // new number and can attempt to Transfer.
            SupervisedTransferControl_CurrentStage = "TransferMessage";
            return(true);

        // A FailedEvent or CSTAErrorCode both mean we've failed for one
        // reason or another to reach the new party.
        case "FailedEvent":
            SupervisedTransferControl_FailureCause = getNodeText(docEl.selectSingleNode("/csta:FailedEvent/csta:cause"));
            SupervisedTransferControl_CurrentStage = "ReconnectMessage";
            return(true);

        case "CSTAErrorCode":
            SupervisedTransferControl_FailureCause = getNodeText(docEl.selectSingleNode("/csta:CSTAErrorCode"));
            SupervisedTransferControl_CurrentStage = "ReconnectMessage";
            return(true);

        default:
            break;
    }
    return(false);
}

The ConsultationMessage control is activated when the application initiates the supervised transfer. The ConsultationMessage control sends the CSTA ConsultationCall message and waits to see whether the consultation call is successful. The ConsultationCall places the original call on hold (now the held party), and, typically on a separate channel, places an outbound call to the consulted party.

Success: Receipt of a CSTA EstablishedEvent (the consulted party has answered)
Failure: Either a CSTAErrorCode (for example, no channel resources available for the transfer) or CSTA FailedEvent (for example, the consulted party is busy or the phone “rings no answer”)

If necessary, when the consulted party disconnects, the application can take special action (such as a post-back) by attaching a handler directly to the smex element used for call control, intercepting the events before they reach RunSpeech. (This is required because RunSpeech does not currently handle a disconnect by any party other than the original party.) The following sample shows how this might be done:

// this gets the id of the smex element, not the SmexMessage Control

var theSmexEl = document.all["_callControlSmex"];

var oldOnReceive = theSmexEl.onreceive;

theSmexEl.onreceive = myNewHandler;

function myNewHandler() {

    // write a normal smex onreceive handler 

}

The application would now have to handle all the incoming messages and pass them through to oldOnReceive if they are not handled by the new (intercepting) handler. The application would have to do this on every post-back.
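The pass-through behavior described above can be sketched as a small wrapper. This is an illustration under stated assumptions, not SASDK code: makeInterceptingHandler and tryHandle are hypothetical names, and tryHandle is assumed to return true only when it has fully handled the incoming message.

```javascript
// Hypothetical wrapper (not part of the SASDK): returns a new onreceive
// handler that first offers each message to tryHandle and, if tryHandle
// does not handle it, passes it through to the previously installed handler.
function makeInterceptingHandler(oldOnReceive, tryHandle) {
    return function () {
        if (tryHandle.apply(this, arguments)) {
            return;
        }
        if (oldOnReceive) {
            return oldOnReceive.apply(this, arguments);
        }
    };
}

// Possible use on the page (this would have to run on every post-back):
//   var theSmexEl = document.all["_callControlSmex"];
//   theSmexEl.onreceive = makeInterceptingHandler(theSmexEl.onreceive, myNewHandler);
```

Keeping the fallback inside the wrapper means the intercepting handler only has to recognize the messages it cares about; everything else still reaches RunSpeech unchanged.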

TransferMessage Control

The following code sample defines a new SmexMessage control and some event handlers to hook in to this control to send the TransferCall request and process the CSTA events that the TIM returns.

Note: Some parts of the following code snippet have been displayed in multiple lines only for better readability. These should be entered in a single line.

<speech:SmexMessage id="TransferMessage" runat="server"
    OnClientBeforeSend="TransferMessage_OnClientBeforeSend"
    OnClientError="TransferMessage_OnClientError"
    OnClientReceive="TransferMessage_OnClientReceive"
    ClientActivationFunction="TransferMessage_ClientActivationFunction">
</speech:SmexMessage>




function TransferMessage_ClientActivationFunction() {

    return (SupervisedTransferControl_CurrentStage == "TransferMessage");

}




function TransferMessage_OnClientBeforeSend() {

    var callID = RunSpeech.CurrentCall().Get("CallID");

    var deviceID = RunSpeech.CurrentCall().Get("DeviceID");




    return(
    "<TransferCall xmlns=\"https://www.ecma.ch/standards/ecma-323/csta/ed2\">" +

    "    <heldCall>" + 

    "        <callID>" + callID + "</callID>" +

    "        <deviceID>" + deviceID + "</deviceID>" + 

    "    </heldCall>" + 

    "    <activeCall>" + 

    "      <callID>" + SupervisedTransferControl_NewCallId + "</callID>" +

    "      <deviceID>" + deviceID + "</deviceID>" +

    "    </activeCall>" + 

    "</TransferCall>");

}




function TransferMessage_OnClientError() {

    SupervisedTransferControl_FailureCause = "SmexError";

    SupervisedTransferControl_CurrentStage = "ReconnectMessage";

}

function TransferMessage_OnClientReceive(smexObj, docEl) {

   switch(docEl.baseName) {

       case "TransferCallResponse":

           break;




        case "TransferedEvent":

           // The TransferedEvent (note the spelling with a single 'r') 

           // indicates the two parties are connected; upon receiving it 

           // we can call our success handler.

           SupervisedTransferControl_CurrentStage = "Complete";

           if (SupervisedTransferControl_OnClientTransfered != null) {


                SupervisedTransferControl_OnClientTransfered();

           }

           return(true);

           break;




       // A FailedEvent or CSTAErrorCode both mean we've failed for one

       // reason or another to complete the transfer and we should

       // attempt to reconnect the original (held) call.

       case "FailedEvent":

           SupervisedTransferControl_FailureCause = getNodeText

           (docEl.selectSingleNode("/csta:FailedEvent/csta:cause"));

           SupervisedTransferControl_CurrentStage = "ReconnectMessage";

           return(true);

           break;




        case "CSTAErrorCode":

           SupervisedTransferControl_FailureCause = getNodeText

           (docEl.selectSingleNode("/csta:CSTAErrorCode"));

           SupervisedTransferControl_CurrentStage = "ReconnectMessage";

           return(true);

           break;




        default:

           break;

   }

   return(false);

}

If the consultation was successful, the TransferMessage control is activated. This control joins the original (held) party to the consulted party. The control sends the CSTA TransferCall message and waits to see whether the transfer call is successful.

Success: Receipt of a CSTA TransferedEvent (notice that the CSTA specification has spelled this message with one ‘r’ in the word transferred; for consistency, this user control maintains that spelling throughout)
Failure: Either a CSTAErrorCode or CSTA FailedEvent

ReconnectMessage Control

The following code sample defines a new SmexMessage control and some event handlers to hook in to this control to send the ReconnectCall request and process the CSTA events that the TIM returns.

Note: Some parts of the following code snippet have been displayed in multiple lines only for better readability. These should be entered in a single line.

<speech:SmexMessage id="ReconnectMessage" runat="server"
    OnClientBeforeSend="ReconnectMessage_OnClientBeforeSend"
    OnClientError="ReconnectMessage_OnClientError"
    OnClientReceive="ReconnectMessage_OnClientReceive"
    ClientActivationFunction="ReconnectMessage_ClientActivationFunction">
</speech:SmexMessage>




function ReconnectMessage_ClientActivationFunction() {
    if (SupervisedTransferControl_CurrentStage == "ReconnectMessage") {
        SupervisedTransferControl_CurrentStage = "Complete";

        // If the call was never held, there is no point issuing a
        // ReconnectMessage, so we can call the failure handler
        // immediately.
        if (SupervisedTransferControl_CallWasHeld == true) {
            return(true);
        } else {
            SupervisedTransferControl_OnClientFailure(SupervisedTransferControl_FailureCause);
            return(false);
        }
    } else {
        return(false);
    }
}




function ReconnectMessage_OnClientBeforeSend() {

   var callID = RunSpeech.CurrentCall().Get("CallID");

   var deviceID = RunSpeech.CurrentCall().Get("DeviceID");




    return(
    "<ReconnectCall xmlns=\"https://www.ecma.ch/standards/ecma-323/csta/ed2\">" +

   "    <heldCall>" + 

   "        <callID>" + callID + "</callID>" + 

   "        <deviceID>" + deviceID + "</deviceID>" + 

   "    </heldCall>" + 


   "    <activeCall>" + 


   "      <callID>" + SupervisedTransferControl_NewCallId + "</callID>" +

   "      <deviceID>" + deviceID + "</deviceID>" + 

   "    </activeCall>" + 

   "</ReconnectCall>");

}




function ReconnectMessage_OnClientError() {
    SupervisedTransferControl_FailureCause = "SmexError";
    SupervisedTransferControl_OnClientFailure(SupervisedTransferControl_FailureCause);
    window.close();
}




function ReconnectMessage_OnClientReceive(smexObj, docEl) {
    switch(docEl.baseName) {
        case "ReconnectCallResponse":
            break;

        case "RetrievedEvent":
            // If we reach RetrievedEvent, we've recovered from whatever
            // went wrong and we can call the failure handler and then
            // return control to the 'calling' page.
            SupervisedTransferControl_OnClientFailure(SupervisedTransferControl_FailureCause);
            SupervisedTransferControl_CurrentStage = "Complete";
            return(true);

        // A FailedEvent or CSTAErrorCode both mean we've failed to even
        // reconnect the held call, which is essentially fatal, and we
        // might as well just do a window.close to free up resources.
        case "FailedEvent":
            SupervisedTransferControl_OnClientFailure(SupervisedTransferControl_FailureCause);
            SupervisedTransferControl_CurrentStage = "Complete";
            window.close();
            return(true);

        case "CSTAErrorCode":
            SupervisedTransferControl_OnClientFailure(SupervisedTransferControl_FailureCause);
            SupervisedTransferControl_CurrentStage = "Complete";
            window.close();
            return(true);

        default:
            break;
    }
    return(false);
}

If the consultation or transfer fails, the ReconnectMessage control is activated. This attempts to disconnect the consulted party and bring the original (held) party back into an active call state with the speech application. The control sends the CSTA ReconnectCall message and waits to see whether the reconnect is successful.

Success: Receipt of a CSTA RetrievedEvent
Failure: Either a CSTAErrorCode or FailedEvent
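Taken together, the three controls behave like a small state machine driven by SupervisedTransferControl_CurrentStage. The following sketch summarizes the transitions that the sample handlers above perform. The table STAGE_TRANSITIONS and the function nextStage are hypothetical names introduced only for illustration; the real controls assign the stage variable directly.

```javascript
// Hypothetical summary (not part of the SASDK) of the stage transitions
// performed by the ConsultationMessage, TransferMessage, and
// ReconnectMessage handlers.
var STAGE_TRANSITIONS = {
    ConsultationMessage: { success: "TransferMessage", failure: "ReconnectMessage" },
    TransferMessage:     { success: "Complete",        failure: "ReconnectMessage" },
    ReconnectMessage:    { success: "Complete",        failure: "Complete" }
};

// Returns the stage that follows currentStage for the given outcome.
function nextStage(currentStage, succeeded) {
    var row = STAGE_TRANSITIONS[currentStage];
    if (!row) {
        return currentStage; // "Complete" or an unknown stage: no transition
    }
    return succeeded ? row.success : row.failure;
}
```

Note that the ReconnectMessage stage always ends in "Complete": a successful reconnect still reports the original failure cause to the failure handler, while a failed reconnect additionally closes the window.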

Conference Call Sample Code

Developers can create conference calls in a similar fashion. A ConferenceMessage control can be built with a SmexMessage control, modeled on the TransferMessage control, and should follow the CSTA conference call semantics outlined above.

Once the conference has been established, the following piece of client code extracts the new call ID from the <ConferencedEvent> and sets the current call’s callID to this value:

myNewConferenceCallId = cstaMessageDomObject.selectSingleNode("/csta:ConferencedEvent/csta:conferenceConnections/csta:connectionListItem/csta:newConnection/csta:callID").text;

RunSpeech.CurrentCall().Set("CallID", myNewConferenceCallId);

If the consulted party (D3) disconnects, RunSpeech ignores this as well as any Connection Cleared event that does not match the current call’s connection ID. The TAS session and the original party remain on the call.
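The matching rule described above can be illustrated with a small comparison helper. This is a sketch of the idea only, not RunSpeech's actual implementation: sameConnection is a hypothetical name, and a connection ID is modeled as an object with callID and deviceID properties, following the <callID> and <deviceID> pairs in the CSTA examples earlier in this paper.

```javascript
// Hypothetical helper (not part of the SASDK): two connection IDs match
// only when both the callID and the deviceID match. Values are compared
// as strings so that 140 and "140" are treated alike.
function sameConnection(a, b) {
    return String(a.callID) == String(b.callID) &&
           String(a.deviceID) == String(b.deviceID);
}

// Example with the conference values used earlier: the consulted party's
// connection shares the conference callID but not the deviceID of the
// current call, so its Connection Cleared event would not match.
var currentCall = { callID: "140", deviceID: "0" };
var consultedDropped = { callID: "140", deviceID: "5551234" };
var consultedMatches = sameConnection(consultedDropped, currentCall); // false
```

An application that needs to react when the consulted party leaves would have to perform a comparison like this itself, since events that do not match the current call are ignored.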

Using the above snippets of sample code as a starting point, developers can build into their Speech Server-based applications a significant amount of the sophisticated extended call control functionality they are likely to need.

Conclusion

Microsoft Speech Server 2004 is a powerful speech-enabled IVR platform capable of sophisticated speech and telephony processing. By virtue of Speech Server's support for the Speech Application Language Tags (SALT) specification and the SALT Simple Messaging EXtension (smex) element, Speech Server’s telephony call control capabilities allow a developer to create sophisticated telephony-based speech applications. These applications can exploit basic call control services, using the included basic call controls, as well as extended call control services, by creating custom call controls that harness the power of CSTA.

Glossary

Term	Definition
Consultation Calls	A consultation call (in CSTA terms) places the original calling party on hold, and places a new call to the consulted party. This is used as the first step in a supervised transfer. Depending on the signaling protocol used, the switch, and the TIM capabilities, consultation calls may use one or two channels (though two are typically used).
CSTA	Computer Supported Telecommunications Applications (CSTA), as defined in ECMA-269 and ECMA-323, is used by the Microsoft® Speech Server platform and the TIM for call control purposes. CSTA examples for richer call control functions than are supported by the SDK speech call controls are provided in this white paper. For more detailed information on CSTA, please consult Standard ECMA-269 (Services for Computer Supported Telecommunications Applications Phase III) and Standard ECMA-323 (XML Protocol for Computer Supported Telecommunications Applications Phase III).
ISDN	Integrated Services Digital Network. This comes in two service types: Primary Rate Interface (PRI) and Basic Rate Interface (BRI). In ISDN voice calls, call signaling and synchronization are carried on D channels reserved for data, while voice/audio travels on bearer (B) channels. Popular ISDN PRI signaling protocols inside the US are NI2, 4ESS, and 5ESS.
SALT	Speech Application Language Tags. A markup language extension that integrates speech services into existing markup languages such as HTML and XHTML. SALT consists of a set of XML elements, with associated attributes and Document Object Model (DOM) object properties, events and methods. Enables multimodal and telephony access to information and applications from PCs, telephones, and PDAs.
SMEX	Defined by the SALT specification, the Simple Messaging Extension (SMEX) provides a simple communications channel (or pipe) between the SALT application and another component, residing locally or remotely. The Microsoft® Speech Server (MSS) uses SMEX to pass and receive CSTA call control service requests and events to and from a TIM.
T1	A digital transmission link with a total signaling speed of 1.544 Mbps; it is the standard for digital transmission in North America. (Also known as a trunk, span, or Digital Signal level 1 (DS1).)
T1 CAS	Channel Associated Signaling: Each channel from a digital T1 (or E1) trunk carries call and synchronization signaling data in-band. This is often also termed in-band or “robbed-bit” signaling. Often provided using Foreign eXchange Station (FXS) signaling.
Transfers: blind transfer	Once a blind transfer is initiated, the SALT application is no longer an active participant in the call and does not receive any call progress information once the transfer is requested.
Transfers: bridged	A bridged transfer relies on consuming two physical TIM channel resources for the call transfer for the duration of the transfer. This type of channel utilization for transfers is also known as a hairpin or trombone transfer. While this has the drawback of consuming expensive TIM resources, it does have the advantage of being signaling agnostic. The three-way conference call described in “Supervised Transfers and Conferences” can only be supported by bridging the call on the TIM.
Transfers: ISDN 2B channel transfer	A Two B-Channel Transfer (TBCT) takes two active calls (one inbound and one outbound, such as on a consultation call) and joins these calls at the switch, thus freeing up the channel resources on the TIM once the TBCT completes. This is only supported on ISDN. Administrators should check whether this feature is supported by the TIM, and by the PBX or the telecom carrier’s switch.
Transfers: supervised transfer	During a supervised transfer, the SALT application remains as an active participant in the call and receives call transfer progress information once the transfer is requested. When the transfer completes, the SALT application typically drops out of the call. This is also commonly referred to as an attended transfer. The SALT application may remain on the call as a silent participant for Microsoft® Speech Server through use of a three-way conference call (if supported by the underlying TIM), as described in “CSTA for conference calls.”

Appendix A – SupervisedTransfer Call Control Code

Sample code for a SupervisedTransfer call control as described in “How do I build a custom control for more advanced call control functions?” in the Microsoft Speech Tips and Tricks at https://www.microsoft.com/speech/techinfo/tipsandtricks:

Note: Some parts of the following code snippet have been displayed in multiple lines only for better readability. These should be entered in a single line.

<%@ Register TagPrefix="speech" Namespace="Microsoft.Speech.Web.UI" Assembly="Microsoft.Speech.Web, Version=1.0.3200.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35" %>




<script runat="server" language="C#">

// <![CDATA[




// In the server side code, we expose the properties of this user control

// that can be set when using it.




    // Provide a client-side function to call if the transfer fails. This 

    // function will be passed an argument that contains the cause

    // (e.g., busy, noAnswer) of the failure.

    public string OnClientFailure 

    {

        get { return(onClientFailure); }

        set { onClientFailure = value; }

    }




    // Provide a client-side function to call if the transfer succeeds
    public string OnClientTransfered 

    {

        get { return(onClientTransfered); }

        set { onClientTransfered = value; }

    }




    // Provide the phone number to which you want to transfer

    public string TransferToNum 

    {

        get { return(transferToNum); }

        set { transferToNum = value; }

    }




    // Provide the phone number that you want to appear as the 'caller ID'.

    // Note that this is done using a private data field supported by the
    // Intel TIM. Other TIMs might require different methods of handling
    // this.

    public string CallerIDNum 

    {

        get { return(callerIDNum); }

        set { callerIDNum = value; }

    }




    // Provide the client activation function for the transfer. This function

    // will be called once, and if true the transfer will be attempted.

    public string ClientActivationFunction 

    {

        get { return(clientActivationFunction); }

        set { clientActivationFunction = value; }

    }

    

    private string clientActivationFunction = String.Empty;

    private string onClientFailure = String.Empty;

    private string onClientTransfered = String.Empty;

    private string transferToNum = String.Empty;

    private string callerIDNum = String.Empty;




    private void Page_Load(object sender, System.EventArgs e)

    {

        if (this.ClientActivationFunction == null || 

        this.ClientActivationFunction.Length == 0) 

        {

            throw new ArgumentNullException("ClientActivationFunction", 

            "ClientActivationFunction must be specified");

        }




        if (this.OnClientFailure == null || this.OnClientFailure.Length == 0) 

        {

            throw new ArgumentNullException("OnClientFailure", "OnClientFailure must be specified");

        }




        // The properties that are expected to be dynamic (the numbers we are

        // calling from and to) are rendered to hidden text fields, to avoid

        // causing the rendered script of the page to change

        // (which would make it uncacheable).

        Page.RegisterHiddenField("hfSupervisedTransferControl_TransferToNum", 

        this.transferToNum);

        Page.RegisterHiddenField("hfSupervisedTransferControl_CallerIDNum", 

        this.callerIDNum);




        // Other properties (functions and handlers) are rendered directly

        // as variables, because this avoids having to use 'eval' to access

        //  the function pointers and because they are not expected to change

        // (within a single usage of the user control).

        StringBuilder s = new StringBuilder();

        s.Append("<script>\r\n");

        s.Append(String.Format(
            "var SupervisedTransferControl_CurrentStage = {0}() ? \"ConsultationMessage\" : \"Complete\";\r\n",
            this.clientActivationFunction));
        s.Append(String.Format(
            "var SupervisedTransferControl_OnClientFailure = {0};\r\n",
            this.onClientFailure));
        s.Append(String.Format(
            "var SupervisedTransferControl_OnClientTransfered = {0};\r\n",
            this.onClientTransfered.Length > 0 ? this.onClientTransfered : "null"));

        // Break up the tag to avoid looking like XML.
        s.Append("</sc" + "ript>\r\n");
        Page.RegisterStartupScript("SupervisedTransferControl_StartupScript", s.ToString());

        DataBind();

    }

// ]]>

</script>




<speech:SmexMessage id="ConsultationMessage" runat="server" OnClientBeforeSend="ConsultationMessage_OnClientBeforeSend"

    OnClientError="ConsultationMessage_OnClientError" OnClientReceive="ConsultationMessage_OnClientReceive"

    ClientActivationFunction="ConsultationMessage_ClientActivationFunction">

    </speech:SmexMessage>

<br>

<speech:SmexMessage id="TransferMessage" runat="server" OnClientBeforeSend="TransferMessage_OnClientBeforeSend"

    OnClientError="TransferMessage_OnClientError" OnClientReceive="TransferMessage_OnClientReceive"

    ClientActivationFunction="TransferMessage_ClientActivationFunction">

    </speech:SmexMessage>
<br>

<speech:SmexMessage id="ReconnectMessage" runat="server" OnClientBeforeSend="ReconnectMessage_OnClientBeforeSend"

    OnClientError="ReconnectMessage_OnClientError" OnClientReceive="ReconnectMessage_OnClientReceive"

    ClientActivationFunction="ReconnectMessage_ClientActivationFunction">

    </speech:SmexMessage>

<br>




<script>

// <![CDATA[




// Use global variables to hold state. Note that (for simplicity) no attempt is made 

// to avoid collisions with existing variables with the same names. 

//

// NB: this control has not been made robust to post-back (e.g., in any of the handlers)




var SupervisedTransferControl_NewCallId = null;
var SupervisedTransferControl_FailureCause = null;

var SupervisedTransferControl_CallWasHeld = false;




var SupervisedTransferControl_TransferToNum = document.all["hfSupervisedTransferControl_TransferToNum"].value;

var SupervisedTransferControl_CallerIDNum = document.all["hfSupervisedTransferControl_CallerIDNum"].value;




function getNodeText(theNode) {

  if (theNode == null) return "";

  return(theNode.text);

}




// Each of the three SmexMessage controls has four functions defined: OnClientBeforeSend,
// OnClientError, OnClientReceive, and ClientActivationFunction.




//

// ConsultationMessage

//
function ConsultationMessage_ClientActivationFunction() {

    return (SupervisedTransferControl_CurrentStage == "ConsultationMessage");

}




function ConsultationMessage_OnClientBeforeSend() {

    if (SupervisedTransferControl_TransferToNum.length == 0)

    {

        throw("TransferToNum must be specified");

    }

    // Strip everything except digits from the number to dial.
    var numberToDial = SupervisedTransferControl_TransferToNum.replace(/[^\d]/g, "");




    var callID = RunSpeech.CurrentCall().Get("CallID");

    var deviceID = RunSpeech.CurrentCall().Get("DeviceID");

    var outboundCallingDevice = SupervisedTransferControl_CallerIDNum;

    if (outboundCallingDevice == "")

        outboundCallingDevice = "123456";

 
    return(
    "<ConsultationCall xmlns=\"https://www.ecma.ch/standards/ecma-323/csta/ed2\">" +
    "    <existingCall>" +
    "        <callID>" + callID + "</callID>" +
    "        <deviceID>" + deviceID + "</deviceID>" +
    "    </existingCall>" +
    "    <consultedDevice>" + numberToDial + "</consultedDevice>" +
    "    <extensions>" +
    "        <privateData>" +
    "            <private xmlns:pri=\"https://schemas.microsoft.com/speech/2003/08/CSTAPrivateData\">" +
    "                <pri:setOutboundCallingDevice>" + outboundCallingDevice + "</pri:setOutboundCallingDevice>" +
    "                <pri:setCallAnalysis>false</pri:setCallAnalysis>" +
    "            </private>" +
    "        </privateData>" +
    "    </extensions>" +
    "</ConsultationCall>"
    );

}

function ConsultationMessage_OnClientError() {

    SupervisedTransferControl_FailureCause = "SmexError";

    SupervisedTransferControl_CurrentStage = "ReconnectMessage";

}

function ConsultationMessage_OnClientReceive(smexObj, docEl) {

    switch(docEl.baseName) {

        case "ConsultationCallResponse":

            // Store the new call ID for use in either the Transfer or the
            // Reconnect messages.
            SupervisedTransferControl_NewCallId = getNodeText(
                docEl.selectSingleNode("/csta:ConsultationCallResponse/csta:initiatedCall/csta:callID"));

            break;

        case "HeldEvent":

           // We set a flag saying that the call has been held, so that we know
           // if it makes sense to attempt a reconnect following a failure.

           SupervisedTransferControl_CallWasHeld = true;

           break;

        case "DeliveredEvent":

            break;

        case "NetworkReachedEvent":

            break;

        case "EstablishedEvent":

           // EstablishedEvent means we have successfully connected to the

           // new number and can attempt to Transfer.

           SupervisedTransferControl_CurrentStage = "TransferMessage";

           return(true);

           break;

        // A FailedEvent or CSTAErrorCode both mean we've failed for one

       // reason or another to reach the new party

       case "FailedEvent":

           SupervisedTransferControl_FailureCause = getNodeText

           (docEl.selectSingleNode("/csta:FailedEvent/csta:cause"));

           SupervisedTransferControl_CurrentStage = "ReconnectMessage";

           return(true);

           break;

        case "CSTAErrorCode":

           SupervisedTransferControl_FailureCause = getNodeText

           (docEl.selectSingleNode("/csta:CSTAErrorCode"));

           SupervisedTransferControl_CurrentStage = "ReconnectMessage";

           return(true);

           break;

        default:

           break;

   }

   return(false);

}

//

// TransferMessage

//

function TransferMessage_ClientActivationFunction() {

    return (SupervisedTransferControl_CurrentStage == "TransferMessage");

}

function TransferMessage_OnClientBeforeSend() {

    var callID = RunSpeech.CurrentCall().Get("CallID");

    var deviceID = RunSpeech.CurrentCall().Get("DeviceID");

    return(

    "<TransferCall xmlns=\"https://www.ecma.ch/standards/ecma-323/csta/ed2\">"+

    "    <heldCall>" + 

    "        <callID>" + callID + "</callID>" +

    "        <deviceID>" + deviceID + "</deviceID>" + 

    "    </heldCall>" + 

    "    <activeCall>" + 

    "        <callID>" + SupervisedTransferControl_NewCallId + "</callID>" + 

    "        <deviceID>" + deviceID + "</deviceID>" + 

    "    </activeCall>" + 

    "</TransferCall>");

}

function TransferMessage_OnClientError() {

    SupervisedTransferControl_FailureCause = "SmexError";

    SupervisedTransferControl_CurrentStage = "ReconnectMessage";

}

function TransferMessage_OnClientReceive(smexObj, docEl) {

   switch(docEl.baseName) {

       case "TransferCallResponse":

           break;

        case "TransferedEvent":

           // The TransferedEvent (note the spelling with a single 'r') indicates
           // the two parties are connected; upon receiving it we can call our
           // success handler.

           SupervisedTransferControl_CurrentStage = "Complete";

           if (SupervisedTransferControl_OnClientTransfered != null) {

               SupervisedTransferControl_OnClientTransfered();

           }

           return(true);

           break;

        // A FailedEvent or CSTAErrorCode both mean we've failed for one
        // reason or another to complete the transfer and we should attempt
        // to reconnect the original (held) call.

       case "FailedEvent":

           SupervisedTransferControl_FailureCause = getNodeText

           (docEl.selectSingleNode("/csta:FailedEvent/csta:cause"));

           SupervisedTransferControl_CurrentStage = "ReconnectMessage";

           return(true);

           break;

        case "CSTAErrorCode":

          SupervisedTransferControl_FailureCause = getNodeText

          (docEl.selectSingleNode("/csta:CSTAErrorCode"));

          SupervisedTransferControl_CurrentStage = "ReconnectMessage";

          return(true);

          break;

        default:

            break;

    }

    return(false);

}

//

// ReconnectMessage

//

function ReconnectMessage_ClientActivationFunction() {

    if (SupervisedTransferControl_CurrentStage == "ReconnectMessage") {

        SupervisedTransferControl_CurrentStage = "Complete";

        // If the call was never held, there is no point issuing a
        // ReconnectMessage, so we can call the failure handler immediately.
        if (SupervisedTransferControl_CallWasHeld == true) {
            return(true);
        } else {
            SupervisedTransferControl_OnClientFailure(SupervisedTransferControl_FailureCause);
            return(false);
        }

    } else {

        return(false);

    }

}

function ReconnectMessage_OnClientBeforeSend() {

    var callID = RunSpeech.CurrentCall().Get("CallID");

    var deviceID = RunSpeech.CurrentCall().Get("DeviceID");

    return(
    "<ReconnectCall xmlns=\"https://www.ecma.ch/standards/ecma-323/csta/ed2\">" +

    "    <heldCall>" + 

    "        <callID>" + callID + "</callID>" + 

    "        <deviceID>" + deviceID + "</deviceID>" + 

    "    </heldCall>" + 

    "    <activeCall>" + 

    "        <callID>" + SupervisedTransferControl_NewCallId + "</callID>" + 

    "        <deviceID>" + deviceID + "</deviceID>" + 

    "    </activeCall>" + 

    "</ReconnectCall>");

}

function ReconnectMessage_OnClientError() {

   SupervisedTransferControl_FailureCause = "SmexError";

   SupervisedTransferControl_OnClientFailure

   (SupervisedTransferControl_FailureCause);

   window.close();

}

function ReconnectMessage_OnClientReceive(smexObj, docEl) {

    switch(docEl.baseName) {

        case "ReconnectCallResponse":

            break;

        case "RetrievedEvent":

            // If we reach RetrievedEvent, we've recovered from whatever went wrong
            // and we can call the failure handler and then return control to the
            // 'calling' page.

            SupervisedTransferControl_OnClientFailure

            (SupervisedTransferControl_FailureCause);

            SupervisedTransferControl_CurrentStage = "Complete";

            return(true);

            break;

        // A FailedEvent or CSTAErrorCode both mean we've failed to even reconnect
        // the held call, which is essentially fatal, and we might as well just
        // do a window.close to free up resources.

        case "FailedEvent":

            SupervisedTransferControl_OnClientFailure

            (SupervisedTransferControl_FailureCause);

            SupervisedTransferControl_CurrentStage = "Complete";

            window.close();

            return(true);

            break;

        case "CSTAErrorCode":

            SupervisedTransferControl_OnClientFailure

            (SupervisedTransferControl_FailureCause);

            SupervisedTransferControl_CurrentStage = "Complete";

            window.close();

            return(true);

            break;

        default:

            break;

    }

    return(false);

}

// ]]>

</script>

<br>
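The listing above moves between its three SmexMessage controls by changing SupervisedTransferControl_CurrentStage. As a reading aid, the following is a minimal sketch of that flow written as a pure function. The stage and event names come from the listing; nextStage and runTransfer are hypothetical names introduced here and are not part of the SASDK. The sketch also simplifies the listing in two ways: it folds the CallWasHeld check into the transition instead of performing it in ReconnectMessage_ClientActivationFunction, and it treats "SmexError" (the cause set by the OnClientError handlers) as if it were an event name.

```javascript
// Hypothetical model of the stage transitions in the appendix listing.
// It does not talk to a TIM; it only computes which stage would follow.
function nextStage(stage, eventName, callWasHeld) {
    // "SmexError" is an assumption: it stands in for the OnClientError handlers.
    var failed = (eventName === "FailedEvent" ||
                  eventName === "CSTAErrorCode" ||
                  eventName === "SmexError");
    switch (stage) {
        case "ConsultationMessage":
            if (eventName === "EstablishedEvent") return "TransferMessage";
            // Only attempt a reconnect if the original call was actually held.
            if (failed) return callWasHeld ? "ReconnectMessage" : "Complete";
            return stage;
        case "TransferMessage":
            if (eventName === "TransferedEvent") return "Complete";
            if (failed) return callWasHeld ? "ReconnectMessage" : "Complete";
            return stage;
        case "ReconnectMessage":
            // Success (RetrievedEvent) and failure both end the control.
            if (eventName === "RetrievedEvent" || failed) return "Complete";
            return stage;
        default:
            return stage; // "Complete" is terminal.
    }
}

// Replay a sequence of event names and record the stage after each one.
function runTransfer(events) {
    var state = { stage: "ConsultationMessage", callWasHeld: false, trace: [] };
    for (var i = 0; i < events.length; i++) {
        if (events[i] === "HeldEvent") state.callWasHeld = true;
        state.stage = nextStage(state.stage, events[i], state.callWasHeld);
        state.trace.push(state.stage);
    }
    return state;
}

// Usage: a successful supervised transfer, then a failed consultation call.
var success = runTransfer(["ConsultationCallResponse", "HeldEvent",
                           "EstablishedEvent", "TransferCallResponse",
                           "TransferedEvent"]);
var busy = runTransfer(["ConsultationCallResponse", "HeldEvent", "FailedEvent"]);
```

In this model, the successful sequence ends in the "Complete" stage, while the failed consultation call ends in "ReconnectMessage" because the original call had already been held.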

1	ECMA International, the European Computer Manufacturers Association, is an international industry association dedicated to the standardization of information and communication technology.
2	The SASDK is a key part of Microsoft Speech Server and can be used by Visual Studio® .NET developers to create speech-enabled IVR applications.
3	Please refer to your TIM documentation or vendor for more information.
4	Speech applications built by developers using the SASDK generate SALT elements and client-side code, including the dialog manager RunSpeech. RunSpeech is client-side code that manages the order in which speech controls, speech dialog controls, and call controls are activated.
5	connectionID is a CSTA element type composed of both a callID and a deviceID. A variety of elements can be of type connectionID, including <existingCall>, <heldCall>, <initiatedCall>, and many others.
6	The full schema for these extensions can be found in the SASDK Help documentation.
7	This can also be used for the MakeCall service.
8	The private schema for these extensions can be found in SASDK Help documentation.
9	Please consult your TIM documentation or vendor for enabling and disabling call analysis.
10	A full list of causes is available in section 9.18 in the ECMA-323 specification.
11	For more information on how specific TIMs handle this, please refer to the relevant TIM vendor or documentation.
12	Please consult with your TIM vendor or documentation for more information.
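
Footnote 5 notes that a connectionID pairs a callID with a deviceID, and the Appendix A listing writes that pair out by hand in every request. One possible way to reduce the repetition is sketched below, under stated assumptions: escapeXml, connectionElement, and buildRequest are hypothetical helper names introduced here, not SASDK functions, and the namespace URI is passed in as an argument ("urn:example" in the usage is a placeholder, not a real CSTA namespace). Escaping the values is an addition that the original listing does not perform.

```javascript
// Escape the five XML special characters so a value cannot break the markup.
function escapeXml(value) {
    return String(value)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;")
        .replace(/'/g, "&apos;");
}

// Build one connectionID-typed element such as <heldCall> or <activeCall>.
function connectionElement(tagName, callID, deviceID) {
    return "<" + tagName + ">" +
           "<callID>" + escapeXml(callID) + "</callID>" +
           "<deviceID>" + escapeXml(deviceID) + "</deviceID>" +
           "</" + tagName + ">";
}

// Wrap a list of connectionID elements in a request root element.
// Each entry in connections is { tag, callID, deviceID }.
function buildRequest(rootName, namespaceUri, connections) {
    var body = "";
    for (var i = 0; i < connections.length; i++) {
        var c = connections[i];
        body += connectionElement(c.tag, c.callID, c.deviceID);
    }
    return "<" + rootName + " xmlns=\"" + escapeXml(namespaceUri) + "\">" +
           body +
           "</" + rootName + ">";
}

// Usage with placeholder values (not real call or device identifiers).
var transferRequest = buildRequest("TransferCall", "urn:example", [
    { tag: "heldCall",   callID: "1", deviceID: "d" },
    { tag: "activeCall", callID: "2", deviceID: "d" }
]);
```

A helper of this kind keeps the heldCall and activeCall fragments consistent between the transfer and reconnect requests; whether a given TIM tolerates the absence of the indentation whitespace used in the original listing is something to confirm with the TIM documentation.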
