About keyword filter list syntax rules


Applies to: Forefront Protection for Exchange

Topic Last Modified: 2010-05-11

The following are the syntax rules for a keyword filter list. Use the syntax carefully, because Forefront Protection 2010 for Exchange Server does not validate it. If the filtering results are not what you expect, double-check your syntax.

  • Each item (line of text) is considered a search query.

  • The queries in a list are combined with the OR operator. A positive detection occurs if any one entry matches.

  • Queries are composed of operands (keywords), which are text tokens or strings of text tokens, such as:

    • apple (means that the text contains “apple”)

    • apple juice (means that the text contains “apple juice”)

    • get rich quick (means that the text contains “get rich quick”)

  • Queries may also contain operators that precede or separate operands in an expression.

  • An expression may consist of a single operand, an operand preceded by the _NOT_ or _HAS[#]OF_ operators, or two operands joined by the _AND_, _ANDNOT_, or _WITHIN[#]OF_ operators.

    The following logical operators are supported in expressions. There must be a space between an operator and an operand (or another operator), represented in the examples by the • character:

    • _AND_ (logical AND). For example, apples•_AND_•oranges. A filter such as this would be matched if the text contains both “apples” and “oranges”.

    • _NOT_ (negation). For example, _NOT_•oranges. A filter such as this would be matched if the text does not contain “oranges”.

    • _ANDNOT_ (logical AND negation). For example, apples•_ANDNOT_•oranges. A filter such as this would be matched if the text contains “apples” but does not contain “oranges”. _ANDNOT_ is functionally equivalent to _AND_•_NOT_.

    • _HAS[#]OF_ (frequency). Specifies the minimum number of times that the text must appear in order for the query to be considered true. For example, _HAS[4]OF_•get rich quick. If the phrase "get rich quick" is found in the text four or more times, this query is true. This operator implicitly has a default value of 1 when it is not specified.

    • _WITHIN[#]OF_ (proximity). If the two terms are within a specified number of words before or after each other, there is a match. For example, free•_WITHIN[10]OF_•offer. If "free" appears within 10 words before or after "offer", this query is true. _WITHIN[0]OF_ ignores the distance between the keywords and behaves like the _AND_ operator. In this case, the filter is matched if both keywords are present.
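The frequency and proximity operators described above can be illustrated with a minimal Python sketch. This is not the product's implementation. The function names (phrase_positions, has_of, within_of) are hypothetical, and the sketch assumes that words are separated by white space, that matching is case-sensitive, and that distance is counted from the first word of one term to the first word of the other. None of these details are stated in the rules.

```python
def phrase_positions(words, phrase):
    """Return the word indexes at which the phrase starts (assumed matching model)."""
    target = phrase.split()
    n = len(target)
    return [i for i in range(len(words) - n + 1) if words[i:i + n] == target]


def has_of(text, phrase, minimum=1):
    """_HAS[minimum]OF_ phrase: true if the phrase occurs at least `minimum` times."""
    return len(phrase_positions(text.split(), phrase)) >= minimum


def within_of(text, left, right, distance):
    """left _WITHIN[distance]OF_ right: true if the terms are close enough."""
    words = text.split()
    lefts = phrase_positions(words, left)
    rights = phrase_positions(words, right)
    if distance == 0:
        # _WITHIN[0]OF_ ignores distance and behaves like _AND_.
        return bool(lefts) and bool(rights)
    # Assumption: distance is measured between the starting words of each term.
    return any(abs(l - r) <= distance for l in lefts for r in rights)


text = "this free trial is a limited offer get rich quick get rich quick"
print(within_of(text, "free", "offer", 10))   # True: the terms are 5 words apart
print(within_of(text, "free", "offer", 3))    # False
print(has_of(text, "get rich quick", 2))      # True: the phrase appears twice
```

Under the same assumptions, apples•_ANDNOT_•oranges corresponds to has_of(text, "apples") and not has_of(text, "oranges").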

    Multiple operators are permitted in a single query. The precedence of the operators is (from highest to lowest):

    • _WITHIN[#]OF_

    • _HAS[#]OF_

    • _NOT_, _AND_, and _ANDNOT_ (these are at the same precedence level because they are used in conjunction when part of an expression)

    This precedence cannot be overridden with parentheses. Other considerations are:

    • The logical operators must be entered in uppercase letters.

    • Phrases may be used as keywords. For example, apple juice or get rich quick. Quotation marks are not used.

    • Multiple blank spaces (blank characters, line feed characters, carriage return characters, horizontal tabs, and vertical tabs) are treated as one blank space for matching purposes. For example, A••••B is treated as A•B and matches the phrase A•B.

    • In HTML-encoded message texts, punctuation (any non-alphanumeric character) is treated as a word separator similar to blank spaces. Therefore, words surrounded by HTML tags can be properly identified by the filter. However, note that the filter '<html>' will match '<html>', but not 'html'.
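Because the product does not validate keyword lists, it can be useful to check a list before importing it. The following Python sketch is a hypothetical pre-check, not a feature of the product. The names collapse_whitespace and check_query are illustrative. The sketch only flags operators that are not in uppercase or that lack a surrounding space, and it does not verify that every operator has the operands it needs.

```python
import re

# Operators exactly as the rules spell them: uppercase, wrapped in underscores.
OPERATOR = re.compile(r"_(?:ANDNOT|AND|NOT|HAS\[\d+\]OF|WITHIN\[\d+\]OF)_")
# The same shapes in any letter case, used to spot likely typing mistakes.
LOOKS_LIKE_OPERATOR = re.compile(OPERATOR.pattern, re.IGNORECASE)


def collapse_whitespace(line):
    """Treat runs of blanks, line feeds, carriage returns, and tabs as one space."""
    return re.sub(r"[ \t\n\r\v]+", " ", line)


def check_query(line):
    """Return a list of warnings for one query line (empty if nothing was found)."""
    line = collapse_whitespace(line)
    warnings = []
    for match in LOOKS_LIKE_OPERATOR.finditer(line):
        token = match.group()
        if not OPERATOR.fullmatch(token):
            warnings.append(f"operator not uppercase: {token}")
            continue
        # The start and end of the line count as a space for this check.
        before = line[match.start() - 1] if match.start() > 0 else " "
        after = line[match.end()] if match.end() < len(line) else " "
        if before != " " or after != " ":
            warnings.append(f"missing space around: {token}")
    return warnings


print(check_query("apples _AND_ oranges"))   # []
print(check_query("apples _and_ oranges"))   # ['operator not uppercase: _and_']
print(check_query("apples_AND_oranges"))     # ['missing space around: _AND_']
```

A warning does not necessarily mean that the line is wrong. A keyword may legitimately contain text that resembles an operator, so the output is best treated as a prompt for manual review.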

Examples (the • character represents a space):

  • apples•_AND_•oranges•_AND_•lemons•_WITHIN[50]OF_•juice

    This expression means that “apples”, “oranges”, and “lemons” all appear at least once, and that “lemons” is within 50 words of “juice”.

  • confidential•_WITHIN[10]OF_•project•_AND_•banana•_WITHIN[25]OF_•shake

    This expression means that “confidential” is within 10 words of “project”, and that “banana” is within 25 words of “shake”.

  • _HAS[2]OF_•get rich•_WITHIN[20]OF_•quick

    This expression means that “get rich” appears at least 2 times within 20 words of “quick”.
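The way that operator precedence groups the preceding examples can be made visible with a short Python sketch. The function show_grouping is hypothetical and exists only for illustration. The parentheses it prints are a reading aid and are not valid in a keyword list, because precedence cannot be overridden with parentheses. The sketch assumes that an optional _NOT_ comes before an optional _HAS[#]OF_ at the start of each term.

```python
import re

# Lowest-precedence binary operators. Splitting on these first leaves the
# tighter-binding _HAS[#]OF_ and _WITHIN[#]OF_ operators inside each piece.
LOW_PRECEDENCE = re.compile(r" (_AND_|_ANDNOT_) ")
# Optional prefix operators at the start of a piece (assumed order: _NOT_, then _HAS[#]OF_).
PREFIX = re.compile(r"((?:_NOT_ )?(?:_HAS\[\d+\]OF_ )?)(.*)")


def show_grouping(query):
    """Return the query with display-only parentheses around each proximity pair."""
    query = " ".join(query.split())          # collapse runs of white space
    pieces = LOW_PRECEDENCE.split(query)     # operators land at the odd indexes
    grouped = []
    for index, piece in enumerate(pieces):
        if index % 2 == 1:
            grouped.append(piece)            # an _AND_ or _ANDNOT_ operator
            continue
        prefix, rest = PREFIX.match(piece).groups()
        if "_WITHIN[" in rest:
            rest = f"({rest})"               # highest precedence binds first
        grouped.append(prefix + rest)
    return " ".join(grouped)


print(show_grouping("apples _AND_ oranges _AND_ lemons _WITHIN[50]OF_ juice"))
# apples _AND_ oranges _AND_ (lemons _WITHIN[50]OF_ juice)
print(show_grouping("_HAS[2]OF_ get rich _WITHIN[20]OF_ quick"))
# _HAS[2]OF_ (get rich _WITHIN[20]OF_ quick)
```

The output matches the explanations given for the examples: the proximity test is applied to the two operands next to it before the frequency and logical operators are applied.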