About link translation

Web pages returned from a Web server published by a Microsoft Forefront Threat Management Gateway Web publishing rule may include links containing internal names of computers or Web sites and internal paths to Web content. Because external clients cannot resolve these internal names, these links are broken unless the internal names are replaced with the public names of published Web sites. Forefront TMG includes a built-in Web filter named Link Translation Filter, which uses mappings to translate internal names in links on Web pages to publicly resolvable names. Each mapping translates an internal URL (or part of a URL) to a public equivalent. For example, a mapping can translate the internal URL https://team to the public URL https://www.team.contoso.com for external users. Link translation mappings are stored in tables called link translation dictionaries.

When link translation is enabled for a Web publishing rule, a default link translation dictionary is automatically created for each public name of the rule that does not contain a wildcard character (*).

Forefront TMG includes link translation support for all published Web content, including support for Web publishing rules that publish servers running Microsoft Exchange Server and Microsoft Office SharePoint Server. Link translation is not applied to rules that publish FTP servers over HTTP.

Types of mappings

When link translation is enabled for a Web publishing rule, links in content sent from the published Web site to a client are translated according to the following mappings, which are stored in the effective link translation dictionary for the rule:

  • Implicit mappings of the rule—These mappings are added automatically and map the internal name (or IP address) of the server published by the Web publishing rule to the public name (or IP address) of the Web site, or if there are multiple public names, to one of its public names.
  • Local mappings—The user creates these mappings for the rule, and they map a string containing an internal host name to a string containing a publicly resolvable host name. The string to be translated must contain at least four characters. A local mapping can override an implicit mapping of the rule. Local mappings are not added automatically to the effective link translation dictionaries of other rules.
  • Implicit mappings of other rules—These mappings are automatically added to the effective link translation dictionary of every Web publishing rule that is defined and enabled and has link translation enabled on the Forefront TMG computer. These mappings are derived from the implicit mappings defined in each Web publishing rule on the Forefront TMG computer.
  • Global mappings—The user creates these mappings for the Forefront TMG computer, and they apply to all Web publishing rules in the Forefront TMG computer. These mappings override conflicting implicit mappings of other rules.

Adding local mappings

Local mappings can be defined for a Web publishing rule on the Locally Defined Mappings page. To open this page, on the Properties page for the rule, on the Link Translations tab, click Configure. To create a local mapping, you need to specify a string containing the internal name (or IP address) of a Web site or host and a string containing the publicly resolvable name to which the internal name should be translated. The translated name is typically the public name that can be accessed by external clients, such as the fully qualified domain name (FQDN) or IP address of the Forefront TMG computer. Note that in the same rule, you cannot define more than one local mapping for a string to be translated.

Adding global mappings

Global mappings can be defined on the Global Mappings tab of the Link Translation properties. To create a global mapping, you need to specify an internal URL and the public URL to which the internal URL should be translated. The internal URL typically contains the name (or IP address) of an internal Web site or host. The translated URL is typically the public name that can be accessed by external clients, such as the FQDN or IP address of the Forefront TMG computer. The URLs specified in user-defined global mappings must begin with a valid protocol (http:// or https://).

Dictionaries

When link translation is enabled for a Web publishing rule, a default link translation dictionary containing the implicit mappings of the rule is automatically created for the rule. If more than one public name is defined in a Web publishing rule, a dictionary is automatically created for each public name that does not contain a wildcard character (*). An effective link translation dictionary is created when user-defined local and global mappings and implicit mappings defined by other rules are added to the default dictionary.

The implicit mappings created for every Web publishing rule on the Forefront TMG computer and the user-defined global mappings are available to all Web publishing rules defined on the Forefront TMG computer.

When a publishing rule is used to return content from a Web site to a client, it uses the mappings in its effective link translation dictionary to translate links on the response page.

If link translation is enabled for a Web publishing rule, the effective link translation dictionary includes the implicit and local mappings of the rule along with the global mappings defined on the Forefront TMG computer.

Multiple mappings

Forefront TMG uses the effective link translation dictionary of the Web publishing rule that allowed the request for Web content to translate links in the content before returning it to the client.

When the effective link translation dictionary of a Web publishing rule contains multiple mappings for a search string, Forefront TMG selects the mapping that it uses to translate the search string and removes the other mappings of that search string from the dictionary so that only one mapping for each search string remains in the dictionary.

For each search string that has multiple mappings in the effective link translation dictionary of a rule, Forefront TMG first looks for a local mapping. If a local mapping is found for the search string, Forefront TMG leaves the applicable mapping in the dictionary and removes all other mappings for the same search string from the dictionary.

If a local mapping that matches the search string is not found, Forefront TMG looks for a matching implicit mapping derived from the rule (a mapping from the default dictionary of the Web publishing rule). If an implicit mapping is found, Forefront TMG leaves the applicable mapping in the dictionary and removes all other mappings for the same search string.

If a matching implicit mapping derived from the rule is not found, Forefront TMG looks for a global mapping for the Forefront TMG computer. If Forefront TMG finds a global mapping, Forefront TMG leaves the applicable mapping in the dictionary and removes all other mappings for the same search string.

If no match is found, Forefront TMG looks for matching implicit mappings derived from the other Web publishing rules defined on the Forefront TMG computer. If one match is found, Forefront TMG leaves the applicable mapping in the dictionary and removes all other mappings for the same search string. If a Web site with the internal name in the search string is published by more than one rule that uses different public names within the same Forefront TMG computer, Forefront TMG may find more than one mapping. In that case, Forefront TMG selects which mapping to retain by using the following order of precedence:

  1. Mappings with a translated URL that contains a public name of the current Web publishing rule.
  2. Mappings with a translated URL that contains the public name of a Web site that is specified in a Web publishing rule that uses the same Web listener as the current Web publishing rule.
  3. Mappings with a translated URL containing the DNS suffix that is closest to the DNS suffix in the public name of the current Web publishing rule. For more information, see Closest DNS suffix.
  4. Mappings that are derived from a Web publishing rule that is higher on the list of rules in the stored configuration.
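The selection order described in this section can be sketched in Python as follows. This is an illustrative model only, not Forefront TMG code: the Mapping class, its fields, and the function names are hypothetical, the "same Web listener" test is simplified to comparing the listener of the originating rule, and the closest-DNS-suffix step (item 3) is omitted.

```python
from dataclasses import dataclass

# Mapping sources in the order the text says they are consulted.
SOURCE_ORDER = ["local", "implicit_own", "global", "implicit_other"]


@dataclass(frozen=True)
class Mapping:
    search: str          # internal string to look for
    replacement: str     # public string to substitute
    source: str          # one of SOURCE_ORDER
    rule_order: int = 0  # position of the originating rule (lower = higher in the list)
    listener: str = ""   # Web listener of the originating rule (hypothetical field)


def pick_mapping(candidates, current_public_names, current_listener):
    """Return the single mapping to keep for one search string."""
    for source in SOURCE_ORDER:
        matches = [m for m in candidates if m.source == source]
        if not matches:
            continue
        if source != "implicit_other" or len(matches) == 1:
            return matches[0]

        # Several other rules publish the same internal name: apply tie-breaks.
        def rank(m):
            uses_own_name = any(n in m.replacement for n in current_public_names)
            same_listener = m.listener == current_listener
            # False sorts before True, so negate the preferred conditions.
            return (not uses_own_name, not same_listener, m.rule_order)

        return min(matches, key=rank)
    return None


def resolve(dictionary, current_public_names, current_listener):
    """Collapse an effective dictionary to one mapping per search string."""
    by_search = {}
    for m in dictionary:
        by_search.setdefault(m.search.lower(), []).append(m)
    return {
        key: pick_mapping(group, current_public_names, current_listener)
        for key, group in by_search.items()
    }
```

Keys in the resolved dictionary are lowercased search strings, which mirrors the case-insensitive search described elsewhere in this topic.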

URL translation

When a response is returned to a Forefront TMG computer, Forefront TMG searches the response for the strings to be translated that are defined in all the mappings in the effective link translation dictionary of the rule that allowed the request for the Web content before returning it to the client. When a search string is found, Forefront TMG replaces the search string with the corresponding translated string in the mapping.

Forefront TMG only translates complete URLs or partial URLs that are followed by a terminating character, such as a space or a slash. For example, if one of the search strings is https://contoso and the response contains the URL https://contosonews, this URL is not translated by using this mapping, because the search string is not followed by a terminating character in the URL.

If more than one search string is found in the same URL, Forefront TMG translates the URL by using the longest search string. For example, if the effective link translation dictionary of the applicable rule contains mappings with the search strings https://contoso and https://contoso/news and the response contains the URL https://contoso/news/a.htm, Forefront TMG uses the mapping for https://contoso/news to translate this URL.
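The two rules above (a terminating character is required, and the longest search string wins) can be illustrated with a small Python sketch. The function name and the exact set of terminating characters are assumptions; the text names only a space and a slash as examples of terminators.

```python
# Characters treated as URL terminators here are an assumption; the text
# gives only a space and a slash as examples.
TERMINATORS = set(" /\"'<>?#\t\r\n")


def translate_links(content, mappings):
    """Replace search strings in content, longest match first, ignoring case.

    mappings: dict of search string -> translated string.
    """
    # Try longer search strings first so that the longest match wins.
    ordered = sorted(
        ((s.lower(), r) for s, r in mappings.items()),
        key=lambda pair: len(pair[0]),
        reverse=True,
    )
    out = []
    i = 0
    while i < len(content):
        for search, replacement in ordered:
            end = i + len(search)
            if content[i:end].lower() == search:
                # Accept only a complete URL or one followed by a terminator.
                if end == len(content) or content[end] in TERMINATORS:
                    out.append(replacement)
                    i = end
                    break
        else:
            # No mapping applied at this position: copy one character.
            out.append(content[i])
            i += 1
    return "".join(out)
```

With mappings for https://contoso and https://contoso/news, this sketch leaves https://contosonews untouched and translates https://contoso/news/a.htm by using the longer search string.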

Link translation can be enabled or disabled for each Web publishing rule. When link translation is disabled for a rule, its implicit mappings are not added to the effective link translation dictionaries of other rules defined on the Forefront TMG computer.

Link translation can be enabled or disabled for a Forefront TMG computer. By default, link translation is enabled.

Link translation can be enabled for a Web publishing rule only if link translation is enabled for the Forefront TMG computer. By default, link translation is enabled for a Web publishing rule when link translation is enabled for the Forefront TMG computer.

Note

Header translation takes place even when link translation is disabled.

Link translation is automatically disabled for Web publishing rules that apply to all Web requests or that have one or more public names containing a wildcard character (*).

Redirection of unpublished sites

In Link Translation properties, on the Link Redirection tab, the user can optionally define a list of URLs of unpublished sites and specify a published URL to which links to these URLs can be redirected. When link translation is enabled on the General tab, links to the URLs of the unpublished Web sites in content returned from a published Web site are redirected to the specified published URL if link translation is enabled for the rule that publishes the specified published URL. When the link redirection feature is enabled, users who request an unpublished site are redirected to the specified URL and should not receive an error page.

If link translation is enabled, Forefront TMG performs another search on the Web content for the unpublished sites after completing the search for the search strings in the effective link translation dictionary of the Web publishing rule that allowed the request for Web content. If the URL of an unpublished site is found, Forefront TMG replaces the URL with the specified published URL.

Content types

In Link Translation properties, on the Content Types tab, the user can select the file name extensions and MIME types to which link translation can be applied. The selected content types apply to all Web publishing rules for which link translation is enabled. By default, if link translation is enabled, the translation is applied only to Web content that belongs to the HTML Documents content type.

Range requests

If the blocking of range requests is enabled on the Forefront TMG computer and link translation is enabled for a rule, range requests for the content types to which the rule applies are blocked for that rule. If the blocking of range requests is disabled on the Forefront TMG computer, link translation is not used for these requests.

Encoding

Forefront TMG link translation is basically a sophisticated search and replace engine. The Web content that passes through a Forefront TMG computer is searched for the mapped strings, and when a mapped string is found, the applicable replacement is made. The search engine is not sensitive to the case of ANSI characters in the strings. For the search engine to correctly locate the search strings, it must know their exact representation in the Web content.

The representation is dependent on the following:

  • Character set—The character set specifies which character table to use (and the corresponding encoding) for interpreting the characters. (For example, in UTF-8, the letter "a" is encoded as 0x61.)
  • Character escaping—Character escaping defines whether letters are represented in their standard form or in the form of an escape sequence. (For example, the letter "a" may be represented as %61.)

Forefront TMG is capable of locating links to be translated by supporting different character sets and using a specific heuristic for dealing with escaped characters.

Forefront TMG does not support escaped characters in text encoded with the Universal Character Set.

Character sets

Many Web pages are not encoded in UTF-8. If Forefront TMG assumed that all Web pages use the UTF-8 character set, the link translation search engine might fail to identify and replace links that are not encoded in UTF-8.

The following is a simple example. A user publishes the Web site https://myserver as https://www.contoso.com. Forefront TMG link translation needs to search for https://myserver and replace it with https://www.contoso.com. However, the string https://myserver can be represented in several different ways on a Web page. For example, the character "m" is represented as 0x6D in the UTF-8 character set. However, in UTF-16 encoding, it is represented as 0x006D.
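The effect of the character set on the search can be reproduced with a few lines of Python. This is only a demonstration of the byte-level difference; the function name and the sample page are hypothetical.

```python
def find_in_charsets(body, search, charsets=("utf-8",)):
    """Return the first charset whose encoding of search occurs in body, else None."""
    for charset in charsets:
        if search.encode(charset) in body:
            return charset
    return None


# A sample page encoded as UTF-16 (little endian), where every ASCII
# character occupies two bytes instead of one.
page = '<a href="https://myserver/">home</a>'.encode("utf-16-le")

utf8_only = find_in_charsets(page, "https://myserver")
with_extra = find_in_charsets(page, "https://myserver", ("utf-8", "utf-16-le"))
```

Searching with the UTF-8 byte pattern alone finds nothing in the UTF-16 page, whereas adding the second character set lets the same search string be located.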

Therefore, Forefront TMG uses the UTF-8 character set and allows the user to select one additional character set in each Web publishing rule. For example, the additional character set can be Japanese (Shift-JIS).

Escape encoding

In many cases, the following characters do not appear in their standard form, but rather in the form of an escape sequence:

  • Non-English letters (high-ASCII characters)
  • Slash mark (/), tilde (~), ampersand (&), question mark (?), equal sign (=), semicolon (;), and other (unsafe or reserved) characters, such as a space.

If one or more characters in the URL are escaped, for example, if an ampersand (&) is represented by %26, a simple link translation search fails to identify the URL.

The following is a simple example. A user publishes the Web site https://myserver/contoso finance as https://www.contoso.com/contoso finance. Forefront TMG link translation needs to replace https://myserver/contoso finance with https://www.contoso.com/contoso finance. However, the string https://myserver/contoso finance can actually be represented in the following escaped form: https://myserver/contoso%20finance. In this case, the space was escaped as %20.

Therefore, Forefront TMG has a dedicated heuristic for dealing with escaped characters. It is based on searching for common variants of mapped URLs. For example, if Forefront TMG link translation needs to search for http://myserver?param=a, it also looks for http:%2F%2Fmyserver%3Fparam%3Da.

The probability of success in identifying escaped variants of a URL is increased in the following ways:

  • Using the heuristic for determining the escaped encoded variations so that the majority of the URLs can be found.
  • Allowing the user to specify the exact form of a URL to search for (stating which characters are escaped, for example, http:%2F%2Fmyserver%3Fparam%3Da). This provides a very strong workaround in cases in which the escaping heuristic fails.
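The idea of searching for escaped variants of a mapped URL can be sketched with Python's standard urllib.parse.quote function. The actual heuristic used by Forefront TMG is not documented here, so the two variants generated below are an assumption chosen to reproduce the examples in this section.

```python
from urllib.parse import quote


def escaped_variants(url):
    """Return the URL plus a few percent-encoded forms of it (illustrative only)."""
    variants = [
        url,
        # Escape spaces and non-ASCII characters, keep URL delimiters as they are.
        quote(url, safe=":/?=&;~"),
        # Also escape /, ?, = and the other reserved characters, keeping only the colon.
        quote(url, safe=":"),
    ]
    # Preserve order while dropping duplicates.
    return list(dict.fromkeys(variants))
```

A search engine built on this sketch would look for every returned variant instead of only the literal mapped URL. A variant that the function does not generate would still need to be supplied explicitly, which corresponds to the second approach in the list above.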

Translation of protocols in URLs

Consider a Web page that is returned from a Web server published by one Web publishing rule (Server A) and that contains a link to a Web server published by another Web publishing rule (Server B):

  • If there is only one mapping for Server B with the HTTP protocol or with the HTTPS protocol (according to the Web listener), use the available mapping.
  • If there are mappings for Server B with the HTTP protocol and with the HTTPS protocol:
    • HTTPS links are translated to HTTPS links.
    • HTTP links are translated to HTTPS links if an SSL connection was used to access Server A.

Global mappings can be defined as needed.

When Forefront TMG is configured to direct traffic to a published server over HTTPS, we recommend that the corresponding Web listener be configured to listen only on HTTPS. If you allow users to connect to Forefront TMG over HTTP and then direct that traffic over HTTPS to a published server, Forefront TMG translates HTTPS links to HTTP, which has security implications. This is an issue for a Web listener that listens only on HTTP, or on HTTP and HTTPS.

When a Forefront TMG computer sits behind an external SSL accelerator that receives HTTPS requests sent over the Internet from clients, the SSL accelerator terminates the SSL connections initiated by these clients. The SSL accelerator forwards their requests as HTTP requests to the port configured for sending HTTP requests to the Forefront TMG computer, which then forwards the requests to the published server if the traffic is allowed. When performing link translation or redirecting clients to authentication forms, if the Forefront TMG computer needs to generate links that are directed to itself, Forefront TMG uses the SSL accelerator port specified for the Web listener used in the Web publishing rule and the FQDN of the SSL accelerator from the Host header in each request to format such links with the HTTPS protocol. For example, if the SSL accelerator port specified for the Web listener is set to 4443, the URL has the form https://www.contoso.com:4443/path. If the SSL accelerator port specified for the Web listener is set to 443, Forefront TMG does not include the port number in the URL.
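The URL format described for the SSL accelerator case can be sketched as follows. The function and parameter names are hypothetical; the fqdn argument stands for the host name taken from the Host header of the request.

```python
def accelerator_url(fqdn, ssl_accelerator_port, path):
    """Format an HTTPS link that points back at the SSL accelerator."""
    if not path.startswith("/"):
        path = "/" + path
    # The default HTTPS port (443) is left out of the URL.
    port = "" if ssl_accelerator_port == 443 else f":{ssl_accelerator_port}"
    return f"https://{fqdn}{port}{path}"
```

Calling accelerator_url("www.contoso.com", 4443, "/path") yields the form shown in the example above, and passing 443 yields the same URL without a port number.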

This section discusses considerations for using Forefront TMG link translation with various releases of Exchange Server.

Integration with Exchange Server 2003

Forefront TMG addresses the following scenario. An internal user sends an e-mail message containing URLs that contain names of internal servers. The recipient accesses the message through a Forefront TMG computer by using Microsoft Office Outlook Web Access for Exchange Server 2003. The URLs viewed by the recipient contain public names. The recipient forwards or replies to the message.

Forefront TMG ensures that the internal names remain in the links when the message is forwarded or a reply is sent.

Replies and forwarded Outlook Web Access messages contain the following line:

<!--CURRENT FILE=="IE5" "WIN32" replyforwardnote-->

Forefront TMG uses pattern matching to determine if a line like this is present. When Forefront TMG identifies such a line, it does not perform link translation so that internal recipients receive URLs containing the internal names.

Each element in the COM collection FPCLinkTranslationPatterns is a string of either the form A*B or the form A. In the former case, the search pattern is <!--A*B-->, and in the latter case, the search pattern is <!--A-->. When the predefined element is used, Forefront TMG searches for a string that starts with <!--CURRENT FILE== and ends with replyforwardnote--> and does not contain < or > in between.
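The way a pattern element of the form A*B or A expands into a search for an HTML comment can be modeled with a regular expression. This is a sketch rather than the filter's actual implementation. The string used for the predefined element is inferred from the description above, and the function names are hypothetical.

```python
import re


def pattern_to_regex(element):
    """Compile a pattern element (A*B or A) into a regex matching <!--...-->."""
    if "*" in element:
        prefix, suffix = element.split("*", 1)
        # The wildcard may match anything except angle brackets.
        body = re.escape(prefix) + r"[^<>]*" + re.escape(suffix)
    else:
        body = re.escape(element)
    return re.compile("<!--" + body + "-->")


# Assumed string form of the predefined element, inferred from the text.
PREDEFINED = "CURRENT FILE==*replyforwardnote"


def is_reply_or_forward(page):
    """Return True if the page contains the reply/forward marker comment."""
    return pattern_to_regex(PREDEFINED).search(page) is not None
```

A page for which is_reply_or_forward returns True would be passed through without link translation in this model.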

Integration with Exchange Server 2007

Forefront TMG does not alter the links in any pages or e-mail messages sent to clients from computers running Microsoft Exchange Server 2007. In particular, the pages reaching an Outlook Web Access client are identical to the pages provided by the Exchange 2007 computer. However, before sending a page to an Outlook Web Access client, the Exchange 2007 computer performs link translation on the URL in every link on the page. Each altered URL has the form https://ExchangeServerPublicName/owa/redir.aspx?URL=OriginalUrl. For example, if the public name of the Exchange 2007 computer is mail.contoso.com, before sending the page to an Outlook Web Access client, the Exchange 2007 computer translates the URL https://hrweb in a link to https://mail.contoso.com/owa/redir.aspx?URL=https://hrweb.

When a user clicks this Outlook Web Access link, the browser sends a GET request to the public name of the Exchange 2007 computer. Because the Forefront TMG computer publishes the Exchange 2007 computer, the Forefront TMG computer receives the request and examines the contents of the URL parameter to determine whether it publishes the server specified in the URL (hrweb in this example). If the Forefront TMG computer does not publish the server, the Forefront TMG computer forwards the GET request containing the URL https://mail.contoso.com/owa/redir.aspx?URL=https://hrweb to the Exchange 2007 computer. If the Forefront TMG computer publishes the server, the Forefront TMG computer adds a second parameter, called TranslatedUrl, to the URL before forwarding the GET request to the Exchange 2007 computer. This parameter contains the public name (or the first of the public names) specified in the Forefront TMG rule that publishes the server specified in the URL parameter of the GET request. If, for example, the public name of hrweb is hr.contoso.com, the GET request forwarded by the Forefront TMG computer contains the URL https://mail.contoso.com/owa/redir.aspx?URL=https://hrweb&TranslatedUrl=https://hr.contoso.com.
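The handling of the redir.aspx request can be sketched with Python's urllib.parse module. The function name and the published lookup table are hypothetical. Joining the second parameter with an ampersand follows standard query-string syntax, and reusing the scheme of the original link for the translated URL is an assumption.

```python
from urllib.parse import parse_qs, urlsplit


def add_translated_url(request_url, published):
    """Append TranslatedUrl when the target of the URL parameter is published.

    published: dict of internal host name -> first public name of its rule.
    """
    query = parse_qs(urlsplit(request_url).query)
    targets = query.get("URL")
    if not targets:
        return request_url
    target = urlsplit(targets[0])
    public_name = published.get((target.hostname or "").lower())
    if public_name is None:
        return request_url  # server not published: forward the request unchanged
    # Assumption: the translated URL keeps the scheme of the original link.
    return f"{request_url}&TranslatedUrl={target.scheme}://{public_name}"
```

For the hrweb example, a lookup table of {"hrweb": "hr.contoso.com"} produces a forwarded URL that carries both the URL and TranslatedUrl parameters, while an empty table leaves the request as it arrived.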

When the GET request reaches the Exchange 2007 computer, Exchange 2007 processes the request in one of the following ways:

  • If the URL parameter points to an intranet target that can be proxied, the Exchange 2007 computer replies with the proxied content.
  • If the URL parameter points to an intranet Microsoft Office SharePoint Server document or a file on an intranet Microsoft Windows Server 2003 share, the file is opened in a separate window.
  • If a URL parameter points to an intranet SharePoint Server document library or an intranet Windows Server 2003 share, the document library or Windows Server 2003 share opens inside Outlook Web Access in a new Documents tab view.
  • If the URL parameter points to a target that cannot be proxied and there is no TranslatedUrl parameter, the Exchange 2007 computer sends an HTTP REDIRECT message to the Outlook Web Access client, redirecting it to the original URL.
  • If the URL parameter points to a target that cannot be proxied and the request contains a TranslatedUrl parameter, the Exchange 2007 computer sends an HTTP REDIRECT message to the Outlook Web Access client, redirecting it to the URL specified in the TranslatedUrl parameter.

Exchange 2007 uses the TranslatedUrl parameter in two more scenarios:

  • The Outlook Web Access client opens a SharePoint Server document library. If the Forefront TMG computer publishes the SharePoint Server site, Forefront TMG adds the TranslatedUrl parameter. Exchange 2007 sees the parameter and adds a yellow Information Bar in Internet Explorer that points to the public name of the document library.
  • The Outlook Web Access client enters a URL in the Open Location dialog box. If the Forefront TMG computer publishes the URL, Forefront TMG adds the TranslatedUrl parameter. If the Exchange 2007 computer cannot proxy the content, it redirects the user to the TranslatedUrl parameter exactly as it does when a link to the same URL is clicked inside an Outlook Web Access e-mail message.

The Link Translation Filter checks the Content-Type header of the response to determine whether it needs to perform translation on the body of the message. By default, link translation only operates on the HTML Documents content type, but you can specify other content types. If no Content-Type header is present, the filter looks for a Content-Location header to determine whether it should perform link translation. If neither header is present, the filter looks at the file name extension of the requested URL.
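The order in which the filter consults the headers and the file name extension can be sketched as follows. The contents of HTML_DOCUMENTS are a hypothetical stand-in for the HTML Documents content type, and the assumption that the extension inside the Content-Location value is what gets checked is not stated in the text.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical stand-in for the HTML Documents content type.
HTML_DOCUMENTS = {"mime": {"text/html"}, "ext": {".htm", ".html"}}


def _extension(url):
    """Return the lowercased file name extension of the path part of a URL."""
    return posixpath.splitext(urlsplit(url).path)[1].lower()


def should_translate(headers, request_url, content_type=HTML_DOCUMENTS):
    """Decide whether the response body is a candidate for link translation."""
    names = {k.lower(): v for k, v in headers.items()}
    if "content-type" in names:
        # Ignore parameters such as "; charset=utf-8".
        mime = names["content-type"].split(";")[0].strip().lower()
        return mime in content_type["mime"]
    if "content-location" in names:
        # Assumption: the extension in Content-Location is what gets checked.
        return _extension(names["content-location"]) in content_type["ext"]
    # Neither header is present: fall back to the requested URL.
    return _extension(request_url) in content_type["ext"]
```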

Note

When the Microsoft Outlook Web Access server or the Forefront TMG computer listens for requests on nonstandard ports and the configured bridging mode is Secure connection to mail server, you must enable link translation for a content type that includes the following MIME types and file name extensions:

  • application/x-javascript
  • text/css
  • text/x-component
  • text/xml
  • .eml
  • .css

You must create a new content type or modify an existing content type to include these MIME types and file name extensions.