Regular Expressions in Transport Rules

Microsoft Exchange Server 2007 will reach end of support on April 11, 2017. To stay supported, you will need to upgrade. For more information, see Resources to help you upgrade your Office 2007 servers and clients.

 

Applies to: Exchange Server 2007, Exchange Server 2007 SP1, Exchange Server 2007 SP2, Exchange Server 2007 SP3

This topic describes the implementation of regular expressions that can be used with predicates on transport rules. Predicates are used by conditions and exceptions to determine whether a configured action or actions should be applied to an e-mail message.

For more information about transport rules, see Overview of Transport Rules.

What Are Regular Expressions?

First, you must understand what a simple expression is. A simple expression represents a specific value that you want to match with a condition or exception. An example of a simple expression is the title of a document that your organization does not want to be distributed outside the organization. A piece of data in an e-mail message must exactly match a simple expression to satisfy a condition or exception in transport rules.

A regular expression is a concise and flexible notation for finding patterns of text in a message. The notation consists of two basic character types: literal (normal) text characters, which indicate text that must exist in the target string, and metacharacters, which indicate or control how the text can vary in the target string. You can use regular expressions to quickly parse e-mail messages to find specific character patterns.

The ability to find patterns of text in an e-mail message enables you to match predicates against data in messages that can change dynamically. Examples of such data are Social Security Numbers (SSNs) and patent numbers. You cannot reasonably match this data with a simple expression because a simple expression would require you to enter every possible value that you want to detect. By using regular expressions, you can configure the predicate to search for the pattern of SSNs or patent numbers in a message.
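The difference between a simple expression and a regular expression can be sketched outside Exchange. The following Python snippet is only an illustration: it uses Python's re module as a stand-in for the transport rule engine (whose matching may differ in detail), it models a simple expression as a fixed-value search, and the sample subjects and SSN values are made up.

```python
import re

# Hypothetical fixed value that a simple expression would look for.
simple_expression = "123-45-6789"
# Pattern for the general SSN shape nnn-nn-nnnn.
ssn_pattern = re.compile(r"\d\d\d-\d\d-\d\d\d\d")

def matches_simple(text):
    # A simple expression finds only the one value that was entered.
    return simple_expression in text

def matches_pattern(text):
    # A regular expression finds any value that has the same shape.
    return ssn_pattern.search(text) is not None

subjects = [
    "SSN 123-45-6789 attached",
    "SSN 987-65-4321 attached",
    "Lunch on Friday",
]
for subject in subjects:
    print(subject, matches_simple(subject), matches_pattern(subject))
```

Only the first subject satisfies the fixed value, while the pattern finds both SSN-shaped values and ignores the third subject.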

You can use regular expressions in any condition or exception rule predicate, except "is a delivery report". For more information about which predicates accept regular expression pattern matching, see Transport Rule Predicates.

Implementing Regular Expressions

In the Exchange Management Shell, you can use regular expressions in any predicate that accepts the Patterns predicate property. In the Exchange Management Console, you can use regular expressions with any condition or exception that contains the words "with text patterns". Table 1 lists all the pattern strings that you can use to create a pattern-matching regular expression.

Warning

You must carefully test the regular expressions that you construct to make sure that they yield the expected results. An incorrectly configured regular expression could yield unexpected matches and cause unwanted transport rule behavior. Test your regular expressions in a test environment before you implement them in production.
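One way to act on this warning before a pattern reaches a test server is to check it against samples that should and should not match. The following is a minimal sketch in Python, assuming that Python's re module behaves closely enough to the transport rule engine for a first check. The function name check_pattern and the sample strings are hypothetical.

```python
import re

def check_pattern(pattern, should_match, should_not_match):
    """Return a list of failures; an empty list means every sample behaved as expected."""
    compiled = re.compile(pattern)
    failures = []
    for text in should_match:
        if compiled.search(text) is None:
            failures.append(("missed", text))
    for text in should_not_match:
        if compiled.search(text) is not None:
            failures.append(("unexpected", text))
    return failures

# Hypothetical samples for an SSN-style pattern.
failures = check_pattern(
    r"\d\d\d-\d\d-\d\d\d\d",
    should_match=["ID 123-45-6789", "123-45-6789"],
    should_not_match=["Call 425 555-0100", "Order 1234-56-789"],
)
print(failures)
```

A sample such as "Part 0123-45-67890" would be reported as an unexpected match, because the pattern is not anchored and is found inside the longer number. A check of this kind does not replace testing the rule itself in an Exchange test environment.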

Table 1   Pattern strings

Pattern string Description

\S

The \S pattern string matches any single character that is not a white-space character.

\s

The \s pattern string matches any single white-space character.

\D

The \D pattern string matches any single character that is not a numeric digit.

\d

The \d pattern string matches any single numeric digit.

\w

The \w pattern string matches any single Unicode character categorized as a letter or decimal digit.

|

The pipe ( | ) character performs an OR function.

*

The asterisk ( * ) character is a wildcard that matches zero or more instances of the preceding character or group. For example, ab*c matches the following strings: ac, abc, abbbbc.

( )

Parentheses act as grouping delimiters. For example, a(bc)* matches the following strings: a, abc, abcbc, abcbcbc, and so on.

\

The backslash ( \ ) is the escape character that is used together with a special character. Special characters are the following characters that are used in pattern strings:

  • Backslash: \

  • Pipe: |

  • Asterisk: *

  • Opening parenthesis: (

  • Closing parenthesis: )

  • Caret: ^

  • Dollar: $

For example, if you want to match a string that contains (525), you would type \(525\).

\\

Two backslashes are used when you want the backslash character to be recognized as a backslash and not as an escape character. For example, if you want to match a string that contains \d, you would type \\d.

^

The caret ( ^ ) character indicates that the pattern string that follows the caret must exist at the start of the text string that is being matched. For example, ^fred@contoso matches fred@contoso.com and fred@contoso.co.uk but not alfred@contoso.com.

This character can also be used with the dollar ( $ ) character to specify an exact string to match. For example, ^kim@contoso.com$ matches only kim@contoso.com and does not match anything else, such as kim@contoso.com.au.

$

The dollar ( $ ) character indicates that the preceding pattern string must exist at the end of the text string that is being matched. For example, contoso.com$ matches adam@contoso.com and kim@research.contoso.com, but does not match kim@contoso.com.au.

This character can also be used with the caret ( ^ ) character to specify an exact string to match. For example, ^kim@contoso.com$ matches only kim@contoso.com and does not match anything else, such as chris@sales.contoso.com.
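The examples in Table 1 can be tried outside Exchange. The following sketch uses Python's re module as a stand-in, which is an assumption: the transport rule engine may differ in detail. In particular, the Table 1 examples use the period as a literal character, but Python treats a bare period as a metacharacter, so it is written as \. here.

```python
import re

# (pattern, text, expected) triples based on the Table 1 examples.
examples = [
    (r"\S", "   ", False),
    (r"\D", "2007", False),
    (r"cat|dog", "hot dog", True),
    (r"ab*c", "ac", True),
    (r"ab*c", "abbbbc", True),
    (r"a(bc)*", "abcbc", True),
    (r"\(525\)", "call (525) now", True),
    (r"\\d", r"type \d here", True),
    (r"^fred@contoso", "fred@contoso.co.uk", True),
    (r"^fred@contoso", "alfred@contoso.com", False),
    (r"contoso\.com$", "kim@research.contoso.com", True),
    (r"contoso\.com$", "kim@contoso.com.au", False),
    (r"^kim@contoso\.com$", "kim@contoso.com", True),
    (r"^kim@contoso\.com$", "chris@sales.contoso.com", False),
]

# Search each text for its pattern and record whether a match was found.
results = [(p, t, re.search(p, t) is not None) for p, t, _ in examples]

for (p, t, expected), (_, _, actual) in zip(examples, results):
    print(f"{p!r:24} {t!r:30} expected={expected} actual={actual}")
```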

By using Table 1, you can construct a regular expression that matches the pattern of the data that you want to match. Working from left to right, examine each character or group of characters in the data that you want to match. Read the description of each pattern string to determine how it is applied to the data that you are matching. Then, determine which pattern string in Table 1 represents that character or group of characters, and add that pattern string to the regular expression. When you are finished, you will have a fully constructed regular expression.

For example, the following regular expression matches North American telephone numbers in the 425 area code that use the formats 425 555-0100 and 425.555.0100:

425(\s|.)\d\d\d(-|.)\d\d\d\d
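As a rough check of this expression, the following sketch translates it into Python's re syntax, where the literal periods are escaped as \. (an assumption about how the two syntaxes correspond). The sample strings are hypothetical.

```python
import re

# Python translation of 425(\s|.)\d\d\d(-|.)\d\d\d\d with literal periods escaped.
first_pattern = re.compile(r"425(\s|\.)\d\d\d(-|\.)\d\d\d\d")

samples = ["425 555-0100", "425.555.0100", "206 555-0100", "425-555-0100"]

# Map each sample to True if the pattern is found in it, else False.
first_results = {s: first_pattern.search(s) is not None for s in samples}
for sample, found in first_results.items():
    print(sample, found)
```

The first two samples are found. The third uses a different area code, and the fourth uses a hyphen after the area code, which the first group does not allow.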

You can expand on this example by adding the telephone format (425)555-0100, which uses parentheses around the area code. The following regular expression matches all three telephone number formats:

(\\()*\d\d\d(\\)|\s|.)\d\d\d(-|.)\d\d\d\d

You can analyze the previous example as follows:

  • (\\()*   This portion makes the opening parenthesis optional. Because the opening parenthesis is also a grouping delimiter, it must be escaped by using two backslashes \\. The surrounding ( ) parentheses group the \\( characters together so that the asterisk * can act upon the \\( characters to make them optional.

  • \d\d\d   This portion requires that exactly three numeric digits appear next.

  • (\\)|\s|.)   This portion requires that a closing parenthesis, a white-space character, or a period exist after the three-digit number. Each character-matching string is contained in the grouping delimiters and is separated by the pipe character. This means that exactly one of the specified characters inside the grouping delimiters must exist in this location in the string that is being matched.

  • \d\d\d   This portion requires that exactly three numeric digits appear next.

  • (-|.)   This portion requires that either a hyphen or period exists after the three-digit number. Because the hyphen and period exist in the grouping delimiters, only one of the two characters can exist in this location in the string that is being matched.

  • \d\d\d\d   This portion requires that exactly four numeric digits appear next.
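The breakdown above can be sketched by assembling the expression from its parts. The following Python snippet is a translation under stated assumptions, not the Exchange syntax: in Python's re module a literal parenthesis is escaped with a single backslash and a literal period is written as \., and the sample numbers are hypothetical.

```python
import re

# Each part of the expression paired with a short description.
parts = [
    (r"(\()*", "optional opening parenthesis"),
    (r"\d\d\d", "three digits (area code)"),
    (r"(\)|\s|\.)", "one closing parenthesis, white-space character, or period"),
    (r"\d\d\d", "three digits"),
    (r"(-|\.)", "one hyphen or period"),
    (r"\d\d\d\d", "four digits"),
]

# Working from left to right, join the parts into one expression.
phone_pattern = re.compile("".join(part for part, _ in parts))

samples = ["425 555-0100", "425.555.0100", "(425)555-0100", "(425) 555-0100"]
phone_results = {s: phone_pattern.search(s) is not None for s in samples}
for sample, found in phone_results.items():
    print(sample, found)
```

In this Python translation, the last sample is not found: the third part consumes exactly one character, so a closing parenthesis that is followed by a space leaves the space unmatched.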

An Example of a Transport Rule That Uses a Regular Expression

The following example shows how you can use regular expressions when you create a new rule in the Exchange Management Shell:

To create a transport rule that uses regular expressions to match Social Security Numbers in the subject of an e-mail message

  1. Run the following commands:

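    # Get the predicate that tests the message subject against text patterns.
    $Condition = Get-TransportRulePredicate SubjectMatches
    # Pattern for the SSN format nnn-nn-nnnn.
    $Condition.Patterns = @("\d\d\d-\d\d-\d\d\d\d")
    # Get the action that rejects the message, and set the rejection text.
    $Action = Get-TransportRuleAction RejectMessage
    $Action.RejectReason = "The transmission of Social Security Numbers is prohibited."
    # Create the rule from the condition and the action.
    New-TransportRule -Name "Social Security Number Block Rule" -Conditions $Condition -Actions $Action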
    
  2. Run the following command to view the new transport rule:

    Get-TransportRule "Social Security Number Block Rule" | Format-List
    

When this Get-TransportRule command is run, the following information is displayed:

Identity           : Social Security Number Block Rule,753ed939-1227-4b2a-a8e0-ec49b0615f30
Name               : Social Security Number Block Rule
RuleCollectionName : Transport
Priority           : 0
Comments           :
ManuallyModified   : False
Conditions         : {SubjectMatches}
Exceptions         :
Actions            : {RejectMessage}
State              : Enabled
IsValid            : True
ObjectState        : Unchanged
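
The decision that this rule makes can be approximated outside Exchange to try out sample subjects. The following Python sketch is an illustration only: the function name evaluate_subject is hypothetical, Python's re module stands in for the transport rule engine, and it assumes that the condition is satisfied when any pattern in the list is found in the subject.

```python
import re

# Values copied from the Exchange Management Shell example above.
PATTERNS = [r"\d\d\d-\d\d-\d\d\d\d"]
REJECT_REASON = "The transmission of Social Security Numbers is prohibited."

def evaluate_subject(subject):
    """Return the reject reason if any pattern is found in the subject, else None."""
    for pattern in PATTERNS:
        if re.search(pattern, subject):
            return REJECT_REASON
    return None

# Hypothetical subjects.
print(evaluate_subject("My SSN is 123-45-6789"))
print(evaluate_subject("Quarterly report"))
```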
