Guidelines for Using the FOR XML Clause

The FOR XML clause can be used in top level queries and in sub queries. The top-level FOR XML clause can be used only in the SELECT statement. In sub queries, FOR XML can be used in the INSERT, UPDATE, and DELETE statements. It can also be used in assignment statements. For example:

DECLARE @x xml
SET @x = (SELECT *
FROM Sales.Customer
FOR XML AUTO, TYPE)
SELECT @x

Note that the TYPE directive returns the query result as xml type. If the type directive is not added, the FOR XML query result is returned as nvarchar(max). This is then converted to the xml and assigned to the xml type variable.

The FOR XML clause is subject to these limitations:

  • FOR XML is not valid for any selection that is used with a COMPUTE BY or FOR BROWSE clause. For example, the following will result in an error:

    SELECT TOP 5 SalesOrderID, UnitPrice 
    FROM Sales.SalesOrderDetail 
    ORDER BY SalesOrderID COMPUTE SUM(UnitPrice) BY SalesOrderID
    FOR XML AUTO
    
  • Top-level FOR XML without the TYPE directive cannot be used with cursors.

  • When a SELECT statement with a FOR XML clause specifies a four-part name in the query, the server name is not returned in the resulting XML document when the query is executed on the local computer. However, the server name is returned as the four-part name when the query is executed on a network server.
    For example, consider this query:

    SELECT TOP 1 LastName
    FROM ServerName.AdventureWorks.Person.Contact
    FOR XML AUTO
    

    When ServerName is a local server, the query returns the following:

    <AdventureWorks.Person.Contact LastName="Achong" />
    

    When ServerName is a network server, the query returns the following:

    <ServerName.AdventureWorks.Person.Contact LastName="Achong" />
    

    This potential ambiguity can be avoided by specifying this alias:

    SELECT TOP 1 LastName
    FROM ServerName.AdventureWorks.Person.Contact x
    FOR XML AUTO 
    

    This query returns the following:

    <x LastName="Achong"/>
    

Also, SQL Server names that contain characters that are invalid in XML names, such as spaces, are translated into XML names in a way in which the invalid characters are translated into escaped numeric entity encoding.

There are only two non-alphabetic characters that can occur within an XML name: the colon (:) and the underscore (_). Because the colon is already reserved for namespaces, the underscore is chosen as the escape character. Following are the escape rules that are used for encoding:

  • Any UCS-2 character that is not a valid XML name character, according to the XML 1.0 specification, is escaped as _xHHHH_. The HHHH stands for the four-digit hexadecimal UCS-2 code for the character in the most significant bit-first order. For example, the table name Order Details is encoded as Order_x0020_Details.

  • Characters that do not fit into the UCS-2 realm (the UCS-4 additions of the range U+00010000 to U+0010FFFF) are encoded as _xHHHHHHHH_. The HHHHHHHH stands for the eight-digit hexadecimal UCS-4 encoding of the character, if under SQL Server 2000 backward compatibility mode. Otherwise, the characters are encoded as_xHHHHHH_, in order to align with the SQL-2003 standard.

  • The underscore character does not have to be escaped unless it is followed by the character x. For example, the table name Order_Details is not encoded.

  • The colon in identifiers is not escaped As a result, the namespace element and attribute names can be generated by the FOR XML query. For example, the following query generates a namespace attribute that has a colon in the name:

    SELECT 'namespace-urn' as 'xmlns:namespace', 
     1 as 'namespace:a' 
    FOR XML RAW
    

    The query produces this result:

    <row xmlns:namespace="namespace-urn" namespace:a="1"/>
    

    Note that WITH XMLNAMESPACES is the recommended way to add XML namespaces.

  • In a SELECT query, casting any column to a binary large object (BLOB) makes it a temporary entity in that it loses its associated table name and column name. This causes AUTO mode queries to generate an error, because it does not know where to put this value in the XML hierarchy. For example:

    CREATE TABLE MyTable (Col1 int PRIMARY KEY, Col2 binary)
    INSERT INTO MyTable VALUES (1, 0x7)
    

    This query produces an error, because of the casting to a binary large object (BLOB):

    SELECT Col1,
    CAST(Col2 as image) as Col2
    FROM MyTable
    FOR XML AUTO
    

    The solution is to add the BINARY BASE64 option to the FOR XML clause. If you remove the casting, the query produces the results as expected:

    SELECT Col1,
    Col2
    FROM MyTable
    FOR XML AUTO
    

    This is the result:

    <MyTable Col1="1" Col2="dbobject/MyTable[@Col1='1']/@Col2" />
    
  • In SQL Server 2000, FOR XML output can contain invalid XML characters. For example, the hexadecimal value 7 is used as a format character, but is otherwise not typically viewable as text in the output. SQL Server 2005 now entitizes such characters when they are returned within FOR XML queries that do not use the TYPE directive.
    Although XML 1.0 conformant parsers raise parse errors regardless of whether these characters are entitized or not, the entitized form is better aligned with XML 1.1. The entitized form is also potentially better aligned with future versions of the XML standard. Additionally, it makes debugging simpler, because the code point of the invalid character becomes visible.
    For users of XML tools, no workaround is required, because the XML parser will fail either way at the point where the invalid characters occur in the data stream. If you use non-XML tools, this change can require you to update your programming logic to search for these characters as entitized values.

  • In SQL Server 2005, the following white space characters are now entitized differently in FOR XML queries to preserve their presence through round-tripping:

    • In element content and attributes: hex(0D) (carriage return)
    • In attribute content: hex(09) (tab), hex(0A) (line feed)

    This might affect your application if it was designed originally to handle SQL Server 2000 output in which these characters were normalized. SQL Server 2005 output will now preserve these characters, and a parser will no longer normalize them.

See Also

Reference

Constructing XML Using FOR XML

Other Resources

SELECT (Transact-SQL)

Help and Information

Getting SQL Server 2005 Assistance