Communications and Collaboration
How Voice Powers OCS 2007
At a Glance:
- Making calls in OCS 2007
- How calls are secured
- Multimodal conversations in OCS
- Integrating with Exchange UM for voicemail
The first part in this series on Microsoft Office Communications Server (OCS) 2007 showed how OCS was built on the strengths of Live Communications Server (LCS) 2005 to deliver enhanced enterprise-class instant messaging (IM) and presence capabilities, as well as advanced media and phone functionality (see the February 2008 issue of TechNet Magazine at technet.microsoft.com/magazine/cc194409). This article continues with a deeper dive into the Voice over IP (VoIP) aspect of the story. I will explain how simple voice calls are made in the OCS system and discuss the technology layer by layer, adding each component along with the functionality it provides.
As the previous article noted, OCS can be set up to provide telephony to users in a number of different configurations:
Enterprise Voice This is the full Unified Communications solution that uses OCS along with the Microsoft® Mediation server without requiring a PBX. Exchange Unified Messaging (UM) provides voicemail features in this system. I'll refer to users who have this service as "UC users" in the remainder of this article. I'll also use the term "UC endpoint" to refer to a client in this configuration.
Enterprise Voice with PBX Integration This configuration lets users reap the benefits of unified communications while retaining the existing PBX phones on their desktops. It allows OCS and PBX to be set up in parallel so incoming calls can ring Office Communicator and PBX endpoints simultaneously. The PBX still owns the call routing and still provides the voicemail services.
Remote Call Control This feature uses the PBX phone as the primary phone and allows the phone to be controlled via the Office Communicator client.
Basic Voice Calls
I'll focus here on the workings of Enterprise Voice users. Simple voice calls are set up in the OCS system using the SIP INVITE mechanism as described in RFC 3261. The OCS server plays the role of a proxy similar to a postmaster that relays messages between clients (or endpoints). New SIP INVITEs are originated by clients whenever a real-time session, such as a call or instant messaging session, is created. If the INVITE is acknowledged with an answer (that is, the remote endpoint sends a 200 OK response), the call is established (see Figure 1).
INVITE sip:email@example.com SIP/2.0
To: <sip:firstname.lastname@example.org>
From: <sip:email@example.com>;tag=5c5ffe5428;epid=d793aff63a
Call-ID: 3522acd5acd349b4855871e3100a5f4f
CSeq: 2 INVITE
Contact: sip:firstname.lastname@example.org
Content-Type: application/sdp
Content-Length: 156

**Note: Alice Audio SDP payload not shown**

SIP/2.0 200 OK
To: <sip:email@example.com>;tag=f5c728454a;epid=e73443245
From: <sip:firstname.lastname@example.org>;tag=5c5ffe5428;epid=d793aff63a
Call-ID: 3522acd5acd349b4855871e3100a5f4f
CSeq: 2 INVITE
Contact: sip:email@example.com
Content-Type: application/sdp
Content-Length: 160

**Note: Bob Audio SDP payload not shown**
In Figure 2, Alice calls Bob by choosing Communicator Call from Office Communicator (OC) 2007, and OC 2007 originates an INVITE to Bob's SIP Uniform Resource Identifier or URI (sip:firstname.lastname@example.org). The INVITE carries the audio session descriptor (called SDP, for Session Description Protocol) of the media endpoints where Alice can receive audio. OCS forks the INVITE to Bob's registered SIP endpoints (for example, Communicator Phone Edition and Communicator desktop). The INVITE contains a From header of sip:email@example.com; this is used by Bob's client endpoints in a Reverse Name Lookup (RNL) to find the name (Alice) to show in the incoming call notification.
Figure 2 Routing a call to multiple endpoints
The Globally Routable User Agent URI (GRUU) is shown for each endpoint. The GRUU uniquely identifies a SIP endpoint and is obtained during the registration process from the OCS server. The GRUU address helps in routing SIP messages, such that once the call is answered by an endpoint, subsequent SIP signaling for other mid-call operations can be carried between the endpoints directly using the GRUU address.
Figure 3 continues the process, in which Bob answers from the Communicator desktop. Bob's OC sends a 200 OK message telling where he can receive audio. As soon as the OCS proxy detects that one endpoint has answered, it cancels the call to the other Communicator Phone Edition endpoints. Once the 200 OK response reaches Alice's Communicator endpoint, both Communicator endpoints have sufficient information (IP ports, encryption parameters, and so forth) to start the media.
Figure 3 Answering from one endpoint
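The forking behavior in Figures 2 and 3 can be sketched in a few lines of Python. This is a toy model rather than OCS code: the Endpoint class, the fork_invite function, and the endpoint names are hypothetical, and real SIP forking involves transactions, timers, and provisional responses that are omitted here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Endpoint:
    """Hypothetical stand-in for one of Bob's registered SIP endpoints."""
    gruu: str                       # illustrative identifier, not a real GRUU
    will_answer: bool = False
    log: List[str] = field(default_factory=list)

    def receive(self, message: str) -> Optional[str]:
        self.log.append(message)
        if message == "INVITE" and self.will_answer:
            return "200 OK"
        return None


def fork_invite(endpoints: List[Endpoint]) -> Optional[Endpoint]:
    """Send the INVITE to every endpoint, then CANCEL all but the one that answered."""
    answered = None
    for ep in endpoints:
        reply = ep.receive("INVITE")
        if reply == "200 OK" and answered is None:
            answered = ep
    if answered is not None:
        for ep in endpoints:
            if ep is not answered:
                ep.receive("CANCEL")
    return answered


desktop = Endpoint("bob-desktop", will_answer=True)
phone = Endpoint("bob-phone-edition")
winner = fork_invite([phone, desktop])
print(winner.gruu, phone.log)
```

In this run the desktop endpoint answers, so the phone endpoint sees the INVITE followed by a CANCEL.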
Using Phone Numbers
So far we have seen the user clicking on Communicator Call to create the invitation. The process is a little different when the user selects or types in a phone number:
- The client first normalizes the phone number and represents it as a TEL URI, which describes resources identified by telephone numbers as indicated in RFC 3966.
- Since the OCS server only recognizes SIP URIs, clients convert the TEL URI to a corresponding SIP URI by adding the domain suffix and a user=phone tag.
- If the number corresponds to an internal user, OCS routes the invitation directly. If the number is external, the invitation is routed to the nearest SIP-PSTN gateway.
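The first two steps can be illustrated with a minimal sketch. The function names and the example.com domain are assumptions for illustration, and the normalization shown handles only numbers that the user already typed in international (+) form; location-specific rules are a separate matter.

```python
import re


def normalize_to_tel_uri(dialed: str) -> str:
    """Strip visual separators and return an RFC 3966-style TEL URI."""
    digits = re.sub(r"[^\d+]", "", dialed)
    if not digits.startswith("+"):
        raise ValueError("this sketch only handles numbers dialed in + form")
    return f"tel:{digits}"


def tel_to_sip_uri(tel_uri: str, domain: str) -> str:
    """Add the domain suffix and the user=phone parameter."""
    if not tel_uri.startswith("tel:"):
        raise ValueError("expected a TEL URI")
    number = tel_uri[len("tel:"):]
    return f"sip:{number}@{domain};user=phone"


tel = normalize_to_tel_uri("+1 (425) 555-0100")
print(tel)
print(tel_to_sip_uri(tel, "example.com"))
```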
Some calls to a public switched telephone network (PSTN) number require the media path be set up before the call is answered so that remote announcements can be played or additional digits collected to complete the call. In such scenarios, the PSTN gateway sends a 183 Session Progress indicator with an audio SDP. Communicator uses this information to set up a two-way media path with one destination endpoint before the call is answered by the remote user.
Once the early media path is all set up, Communicator enables the Dual-Tone Multi-Frequency or DTMF (the telephone signaling tones) keypad so that the user can enter any additional digits required by the remote system. Any DTMF digits entered are sent in the media path as RFC 2833 packets. The PSTN gateway takes care of generating the appropriate DTMF tone signals on the PSTN side.
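To make the RFC 2833 step concrete, here is a sketch of the four-byte telephone-event payload that carries one DTMF digit inside an RTP packet. The field layout (event, end bit, reserved bit, volume, duration) follows RFC 2833; the function name, the default volume, and the assumption of an 8000 Hz clock in the example are mine, and the RTP header itself is not shown.

```python
import struct

# RFC 2833 event codes: digits 0-9 map to 0-9, '*' is 10, '#' is 11.
DTMF_EVENTS = {**{str(d): d for d in range(10)}, "*": 10, "#": 11}


def dtmf_payload(digit: str, duration: int, end: bool = False, volume: int = 10) -> bytes:
    """Pack event (8 bits), E bit + R bit + volume (8 bits), duration (16 bits)."""
    if not 0 <= volume <= 63:
        raise ValueError("volume must fit in 6 bits")
    if not 0 <= duration <= 0xFFFF:
        raise ValueError("duration must fit in 16 bits")
    flags = (0x80 if end else 0) | volume   # reserved bit stays 0
    return struct.pack("!BBH", DTMF_EVENTS[digit], flags, duration)


# Duration is in timestamp units; at an assumed 8000 Hz clock, 800 units is 100 ms.
print(dtmf_payload("5", 800).hex())
print(dtmf_payload("#", 1600, end=True).hex())
```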
Note that if Bob has enabled Simultaneous Ring to his cellular phone, the OCS proxy would route the call to the gateway and ring Bob's other Communicator endpoints at the same time. In such scenarios, the OCS proxy indicates that forking is active in the 183 Session Progress response, and this makes Communicator establish a receive-only media channel with the PSTN gateway.
One of the key features of voice calls set up by OC is that media is encrypted by default using the Secure Real-Time Transport Protocol (SRTP), as defined in RFC 3711. SRTP provides confidentiality, message authentication, and replay protection to RTP traffic. During the call setup, clients negotiate security capabilities between themselves and exchange cryptographic keys as part of the INVITE mechanism.
The default encryption setting for OC is "optional," and this allows two OC endpoints to set up an encrypted media channel. This setting can be adjusted by the administrator to suit the organization's compliance needs. For example, it can be made tighter to force encryption on all calls, or it can be turned off altogether.
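The effect of the three settings can be pictured as a small decision function. This is a sketch under stated assumptions: the level names "required", "optional", and "off" are labels chosen here, and only the optional-with-optional outcome is described in this article. The other outcomes are what one would reasonably expect from such a policy, not confirmed behavior.

```python
LEVELS = {"required", "optional", "off"}


def negotiate_encryption(caller: str, callee: str) -> str:
    """Return the assumed outcome when two endpoints with these policies connect."""
    if caller not in LEVELS or callee not in LEVELS:
        raise ValueError("unknown encryption level")
    pair = {caller, callee}
    if pair == {"required", "off"}:
        return "call fails"        # assumption: no common ground
    if "off" in pair:
        return "unencrypted"       # assumption: optional falls back to plain RTP
    return "encrypted"             # both sides can do SRTP


for a, b in [("optional", "optional"), ("optional", "off"), ("required", "off")]:
    print(a, b, "->", negotiate_encryption(a, b))
```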
Traversing NATs and Firewalls
Clients in the OCS system use the Interactive Connectivity Establishment (ICE) technology to provide media connectivity to users behind Network Address Translation (NAT) devices and firewalls, without requiring any changes to the existing NAT components. ICE technology is currently being standardized by the Internet Engineering Task Force (IETF). Each client is aware of the audio/video (A/V) Edge server that serves it through an inband provisioning mechanism and maintains an authenticated link to the A/V Edge during sign-on.
Before making a call, the client allocates resources on the possible connectivity locations (the addresses and ports, which are also known as candidates) on the A/V Edge server (for media relay), on the NAT, or on the host client itself. When the Session Initiation Protocol (SIP) INVITE is sent out, it carries this connectivity information as part of the INVITE. The 200 OK answer carries similar candidate information about the peer. Once each endpoint has a list of peer candidates, an elaborate ranking and checking mechanism selects the optimal path between the two peers on which media is guaranteed to succeed.
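The rank-then-check idea can be sketched as follows. This is a deliberately simplified model, not the ICE algorithm as specified or as OCS implements it: the preference numbers, the tuple representation of a candidate, and the can_connect callback (which stands in for real connectivity checks) are all hypothetical.

```python
from itertools import product

# Assumed preference: direct host addresses first, NAT-mapped next, relay last.
TYPE_PREFERENCE = {"host": 3, "nat": 2, "relay": 1}


def rank_pairs(local, remote):
    """Order every local/remote candidate pair from most to least preferred."""
    def score(pair):
        return TYPE_PREFERENCE[pair[0][0]] + TYPE_PREFERENCE[pair[1][0]]
    return sorted(product(local, remote), key=score, reverse=True)


def select_path(local, remote, can_connect):
    """Return the best-ranked pair that passes the connectivity check, if any."""
    for pair in rank_pairs(local, remote):
        if can_connect(*pair):
            return pair
    return None


alice = [("host", "10.0.0.5:5000"), ("relay", "203.0.113.9:6000")]
bob = [("host", "192.168.1.7:5002"), ("relay", "198.51.100.4:6002")]


def relay_only(a, b):
    # Pretend direct host-to-host traffic is blocked by a firewall.
    return "relay" in (a[0], b[0])


print(select_path(alice, bob, relay_only))
```

Here the host-to-host pair ranks highest but fails its check, so the next pair that involves a relay is selected.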
Routing on Phone Numbers
Routing on phone numbers introduces some complexities to the basic call mechanism, due to the following factors:
- Organizations may have different dial plans that are deployed internally for short dialing.
- Numbers may be stored in non-standard formats (for example, users may have saved seven-digit strings in Microsoft Office Outlook®).
- There may be different policies mapped to different outbound numbers. For example, international number dialing can be blocked for certain users.
The OCS system requires phone numbers to be in the RFC 3966 TEL URI format before they can be routed correctly. Numbers that are not in this format are converted before the client issues an INVITE. The diagram in Figure 4 shows how this happens within the OCS system.
Figure 4 Routing telephone numbers
Numbers available to the client may be from various sources. Pre-normalized numbers are from the Address Book Service (ABS), which can have admin-defined normalization rules for converting numbers to a normalized E.164 format. Once a client issues a SIP INVITE to the normalized number, OCS applies a translation process to map the number to any internal user.
- Step 1 shows that numbers entering the system are either unique (normalized, following the E.164 numbering plan) or with a phone-context that identifies the location. These numbers are sent to the server as part of the SIP INVITE. OCS routes the INVITE to a translation process.
- The translation process (Step 2) attempts to map the phone number to a UC endpoint using server-side RNL. The translation can identify a route for that number to an outbound routing, or to a UC user, or it can fail the call with a 4xx code if the number cannot be translated correctly.
- If the number is translated to a non-UC or external number, it is sent to the outbound routing component, which redirects the INVITE to the appropriate SIP-PSTN gateway for further processing, after applying policies for the number related to the caller (Step 3A). Outbound routing takes care of balancing the call load across gateways or failing over to alternate gateways if necessary. The outbound routing process can fail or reject the call and return a SIP 403 response code if access is forbidden for a certain number. Note that outbound routing only applies when the caller is a UC user. When that is not the case, OCS tries to use the static routes to gateways set up for that URI.
- If the number is translated to a UC user's number, the INVITE is routed to the SIP URI of that user (Step 3B). Inbound routing is a function of OCS that applies only to UC users and to all calls targeted to the SIP URI. As you'll see, inbound routing applies rules for ringing timeouts, call forwarding, and voicemail forwarding for the UC user.
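The decision flow in Steps 1 through 3B can be summarized in a short sketch. All names here are hypothetical, the phone-context case from Step 1 is left out, and gateway selection is reduced to picking the first gateway, whereas the real outbound routing component balances load and fails over.

```python
def route_invite(number, caller_is_uc, uc_directory, forbidden_prefixes, gateways):
    """Return an (action, target) pair for an INVITE addressed to a phone number."""
    # Step 2: translation fails if the number is not in normalized form.
    if not number.startswith("+"):
        return ("fail", "4xx")
    # Step 3B: the number belongs to a UC user, so route to that user's SIP URI.
    if number in uc_directory:
        return ("inbound", uc_directory[number])
    # Outbound routing applies only to UC callers; others use static routes.
    if not caller_is_uc:
        return ("static_route", None)
    # Step 3A: apply the caller's policy, then hand off to a gateway.
    if any(number.startswith(prefix) for prefix in forbidden_prefixes):
        return ("fail", "403")
    return ("outbound", gateways[0])


directory = {"+14255550100": "sip:bob@example.com"}
print(route_invite("+14255550100", True, directory, ["+44"], ["gateway-1"]))
print(route_invite("+442075550100", True, directory, ["+44"], ["gateway-1"]))
print(route_invite("+12065550199", True, directory, ["+44"], ["gateway-1"]))
```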
Note that the number that results from the normalization process can vary depending on client location. Administrators can configure location profiles and assign number conversion rules specific to a certain location (such as how four-digit dialing will work in that location). Each UC user is assigned a location profile, and all clients in the system download the rules specific to their own location profiles using inband provisioning. I'll take a look at the details of Inbound routing next.
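Before moving on, here is a sketch of how location-specific conversion rules might behave. The profile names, the regular expressions, and the number ranges are invented for illustration and do not reflect the actual rule format that OCS administrators configure.

```python
import re

# Hypothetical location profiles: ordered (pattern, replacement) rules.
LOCATION_PROFILES = {
    "redmond": [
        (r"^(\d{4})$", r"+1425555\1"),    # four-digit extension
        (r"^(\d{7})$", r"+1425\1"),       # seven-digit local number
        (r"^1?(\d{10})$", r"+1\1"),       # ten-digit national number
    ],
    "london": [
        (r"^(\d{4})$", r"+44207555\1"),   # four-digit extension
    ],
}


def normalize(dialed, profile):
    """Apply the first matching rule of the profile; return None if none match."""
    digits = re.sub(r"[\s().-]", "", dialed)
    if digits.startswith("+"):
        return digits
    for pattern, replacement in LOCATION_PROFILES[profile]:
        if re.match(pattern, digits):
            return re.sub(pattern, replacement, digits)
    return None


print(normalize("0100", "redmond"))
print(normalize("0100", "london"))
print(normalize("555-0100", "redmond"))
```

The same four dialed digits produce different normalized numbers depending on the profile, which is the location dependence described above.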
Inbound routing rules specify how calls to a user should be routed in the presence or absence of registered clients in the system. The inbound routing component also takes care of applying presence-based rules to the incoming call; for example, it can send incoming calls to voicemail if the user has set the presence state to Do Not Disturb. Inbound routing is aware of the presence container levels and automatically rejects calls from users in blocked containers. Figure 5 shows a summary of options supported by the OCS inbound routing component.
|Option|Description|
|Ring Duration|Default is 20 seconds. User can change this to 60 seconds max. Calls are diverted to unanswered calling destination after this timeout duration.|
|Route Unanswered Calls to Voicemail|Default if the user is enabled for voicemail. Call is routed according to Inbound Routing rules.|
|Generate Missed Call Notifications when caller hangs up before call reaches voicemail|Notifies Exchange UM of such missed calls.|
|Call Blocking|Rejects calls from blocked callers (only calls from users with SIP identities can be blocked).|
|Do Not Disturb|Routes calls to voicemail. Simultaneous ring destinations are not rung if the user is in this mode.|
|Allow Interrupt List|Allows calls if the caller is in the Team container, even if the user is set to Do Not Disturb.|
|Simultaneous Ring|Configures incoming calls to ring a PSTN phone number as well as Communicator and Communicator Phone Edition clients.|
|Call Forwarding Immediate|Forwards an incoming call immediately to another user, PSTN phone number, or voicemail.|
|Call Forwarding Unanswered|Forwards an unanswered call to another user, PSTN phone number, or voicemail.|
|Working Hours|Uses the working hours configured on the Outlook Calendar for activating call forwarding settings for a user.|
Inbound routing rules are uploaded to the server as an XML schema as part of a user's self-provisioning information. Figure 6 shows how inbound routing works. An incoming call for a user in the OCS system rings the user by default (shown as "Me"). If the call is not answered within the ring duration, the unanswered call is sent to voicemail by default. The user can choose to modify the default configuration by choosing immediate forwarding to a number, to another person, or to voicemail directly.
Figure 6 Routing calls (during working hours)
Any calls that are immediately forwarded to another person or number will cause the inbound routing rules for that person or number to take effect. The user can also set unanswered call forwarding to another number or person.
If the user checks the Apply only during my Outlook Working Hours option in the forwarding rules, then the rules shown in Figure 6 are applied during the working hours, and the default behavior of ringing the registered endpoints is applied outside those hours. Note that this behavior requires that the organization has deployed Microsoft Exchange Server 2007 and Outlook 2007 clients because Communicator uses working hours information from the Exchange 2007 Availability Web service and leverages Outlook 2007 auto-discovery support to get the server location.
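The inbound routing options can be combined into a single decision function. This is a sketch of the rules as described in this article, not of the XML that clients upload: the class, field, and step names are hypothetical, and simultaneous ring and missed-call notifications are left out.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class InboundRules:
    ring_seconds: int = 20                    # default ring duration
    voicemail_enabled: bool = True
    do_not_disturb: bool = False
    blocked: Set[str] = field(default_factory=set)
    team: Set[str] = field(default_factory=set)
    forward_immediate: Optional[str] = None
    forward_unanswered: Optional[str] = None
    working_hours_only: bool = False

    def __post_init__(self):
        if not 0 < self.ring_seconds <= 60:
            raise ValueError("ring duration is limited to 60 seconds")


def plan_call(rules, caller, in_working_hours=True):
    """Return the ordered steps the server would take for one incoming call."""
    if caller in rules.blocked:
        return ["reject"]
    if rules.do_not_disturb and caller not in rules.team:
        return ["voicemail"]
    forwarding_active = in_working_hours or not rules.working_hours_only
    if forwarding_active and rules.forward_immediate:
        return [f"forward:{rules.forward_immediate}"]
    steps = [f"ring:{rules.ring_seconds}s"]
    if forwarding_active and rules.forward_unanswered:
        steps.append(f"forward:{rules.forward_unanswered}")
    elif rules.voicemail_enabled:
        steps.append("voicemail")
    return steps


rules = InboundRules(forward_immediate="+14255550100", working_hours_only=True)
print(plan_call(rules, "sip:alice@example.com", in_working_hours=True))
print(plan_call(rules, "sip:alice@example.com", in_working_hours=False))
```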
Quality Reporting and Troubleshooting
Although clients in the OCS 2007 system use RTAudio, a newer-generation audio codec that can tolerate imperfect network conditions such as jitter and packet loss, monitoring is still essential so administrators can locate and fix potential hotspots. At the end of each call, clients in the OCS 2007 system report the quality of the call to a central Quality of Experience (QoE) server, providing detailed statistics including bandwidth, loss, jitter, mean opinion score (a measure of user perception), devices used, and more. This is done by sending the payload in a SIP SERVICE request to the QoE server, whose address is supplied through the inband provisioning mechanism.
Note that all client endpoints report call quality to the QoE server independently. Should a call fail due to an error in the signaling, clients report the same to OCS as well, which provides a repository of all the errors created by clients.
Putting It All Together
Now that we have walked through several aspects of setting up a call, including entering numbers, inbound and outbound routing, and connectivity checks, let's take a look at the process as a whole. Figure 7 summarizes what happens from the time a user makes a call. Clients in the OCS system play a critical role in setting up a call, and they manage the entire call setup process. OCS is involved in initial routing of the call, and edge servers help the clients find the optimal media route. Now let's look at what happens next: the conversation.
Figure 7 End-to-end call flow
Multimodal Behavior and Conversations
The concept of conversations is central in the OCS system. A conversation is a multimodal session between two or more people and can include voice, video, and IM at the same time, as well as artifacts such as file transfers during the session, e-mail in Outlook or notes stored in Microsoft Office OneNote®.
There are several aspects of conversations that can have an impact on the design of OCS client systems:
- All modalities involved in a conversation use the same window. If an instant message is added to an audio conversation, it is associated with the same window.
- Conversations can occur across devices. Users can choose to have audio on Communicator Phone Edition but have IMs on Communicator desktop.
- All modes escalate to a conference together: if escalation occurs, all modalities of a conversation must be escalated together, no matter which device the modality is currently on. For example, if an audio mode is available on Communicator Phone Edition, and an IM mode for the same conversation is available on Communicator desktop, then escalation to conference ensures that both audio and the IM have the same set of participants for that conversation.
- Conversations, not individual sessions, are logged. Call logs contain full information about the conversation, including details about notes taken during the conversation, instant message exchanges, how long the audio lasted, and so on. This gives users a handy consolidated view of the entire conversation when they open the call log in Outlook.
- Conversations can continue from the conversation logs. OCS clients tie conversations together using a conversation ID (a number that uniquely identifies the conversation across devices and apps). There is an elaborate mechanism to compute deltas of conversations when new conversations are started from a history log of an old conversation. The same conversation ID is also stored as the e-mail's ID in Outlook.
The conversation ID travels as part of the SIP INVITE as a custom property, Ms-Conversation-Id. Figure 8 shows such a transaction where voice is added first, followed by an IM. Note that at the end of the multimodal conversation, the history log that is stored in Outlook has the same conversation ID.
Figure 8 Multimodal conversations and conversation ID
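The way a shared conversation ID ties separate sessions together can be sketched as follows. Only the Ms-Conversation-Id property name comes from this article; the ID format (a random hex string), the helper names, and the example addresses are assumptions made for illustration.

```python
import uuid


def new_conversation_id() -> str:
    """Hypothetical ID format; the real value is generated by the client."""
    return uuid.uuid4().hex


def build_invite_headers(to_uri, from_uri, conversation_id):
    """Each session gets its own Call-ID but carries the shared conversation ID."""
    return {
        "To": f"<{to_uri}>",
        "From": f"<{from_uri}>",
        "Call-ID": uuid.uuid4().hex,
        "Ms-Conversation-Id": conversation_id,
    }


def group_by_conversation(sessions):
    """Collect modalities under their conversation ID, as a history log might."""
    log = {}
    for headers, modality in sessions:
        log.setdefault(headers["Ms-Conversation-Id"], []).append(modality)
    return log


conversation = new_conversation_id()
audio = build_invite_headers("sip:bob@example.com", "sip:alice@example.com", conversation)
im = build_invite_headers("sip:bob@example.com", "sip:alice@example.com", conversation)
print(group_by_conversation([(audio, "audio"), (im, "im")]))
```

The two INVITEs are distinct SIP sessions, yet both land under one entry in the grouped log because they share the conversation ID.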
Exchange UM is the voicemail system for OCS 2007. OCS will only route calls for users that are UC-enabled to Exchange UM. OCS provides several features for integrating with Exchange UM:
- The Call Voice Mail option from Communicator's Phone UI (see Figure 9) allows the user to manage the UM mailbox, change greetings, and so forth without having to enter the PIN because he is already authenticated.
- The message waiting indicator in the Communicator UI (see Figure 10) allows the user to open the voicemail folder in Outlook while the one in Communicator Phone Edition lets the user play the voice messages directly from the phone.
- Calls can be forwarded to voicemail from Communicator's UI.
- Declined calls are routed automatically to voicemail.
- Play on Phone from the voicemail item in Outlook rings Office Communicator clients directly.
Figure 9 Call Voice Mail option in Communicator
Figure 10 Message waiting indicator in Communicator
The mechanism by which OC receives voicemail notifications differs from the one used by Communicator Phone Edition. OC registers for new mail notifications on the Voicemail search folder in Outlook and reports new messages in this folder. Communicator also uses Missed Conversations and Call Logs.
I've shown here how voice calls work in OCS 2007, a system built upon SIP that leverages several RFCs. Client endpoints play a central role in managing calls in OCS. All voice calls are securely encrypted by default. OCS provides flexible components to allow manipulating numbers and managing their flow throughout the system.
Conversations are a central concept in OCS, involving voice, IM, and video. OCS integrates with Exchange UM and allows Communicator endpoints to be notified of voicemail, and to access voicemail through Outlook, or directly from the server.
Rajesh Ramanathan has worked in communications for 14 years and has designed voice protocols, user experiences, and voice and conferencing for Office Communicator 2007. He currently works as Lead Program Manager on the Office Communicator team at Microsoft. Rajesh can be reached at firstname.lastname@example.org.
© 2008 Microsoft Corporation and CMP Media, LLC. All rights reserved; reproduction in part or in whole without permission is prohibited.