Managing Data Conversion Between Unicode Encoding Schemes

This topic describes how to preserve the integrity of character data when both server-side data storage and the client application that interacts with the data are Unicode-enabled, but use different Unicode encoding schemes. SQL Server stores Unicode in the UCS-2 encoding scheme. However, many clients process Unicode in another encoding scheme, generally UTF-8. This scenario frequently occurs for Web-based applications.
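
As a minimal illustration of why a conversion step is needed, the following Python sketch encodes one string in both schemes and converts between them. The sample value is hypothetical, and the sketch assumes little-endian UTF-16, which produces the same bytes as UCS-2 for characters in the Basic Multilingual Plane.

```python
# Illustrative sketch only: compare the bytes of one string in two
# Unicode encoding schemes and convert between them.
text = "Zoë"  # hypothetical sample value

utf8_bytes = text.encode("utf-8")       # form a UTF-8 client typically sends
utf16_bytes = text.encode("utf-16-le")  # 2-byte code units (matches UCS-2 for BMP characters)

# Convert the client's UTF-8 bytes to the 2-byte form, then back again.
converted = utf8_bytes.decode("utf-8").encode("utf-16-le")
round_tripped = converted.decode("utf-16-le").encode("utf-8")

print(utf8_bytes.hex(" "))   # 5a 6f c3 ab
print(utf16_bytes.hex(" "))  # 5a 00 6f 00 eb 00
```

The character ë occupies two bytes in UTF-8 (c3 ab) but a single 2-byte code unit in the other scheme (eb 00), so the bytes cannot be passed between client and server unchanged.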

Because you are essentially still converting from one encoding scheme to another, many of the solutions discussed in the topics Managing Data Conversion Between a Unicode Server and a Non-Unicode Client and Managing Data Conversion Between Client/Server Code Pages also apply. In particular, Unicode character string constants sent to the server must be preceded by a capital N.

For Web-based applications, specify the client's encoding scheme in the CHARSET value of the META tag of the client-side HTML page. For example, specify CHARSET = utf-8 if the Unicode encoding scheme is UTF-8. On the server side, specify the encoding scheme of the client by using the Session.CodePage property or the @Codepage directive. For example, codepage=65001 specifies a UTF-8 encoding scheme. If you follow these directions, Internet Information Services (IIS) 5.0 or later versions handle the conversion from UTF-8 to UCS-2 and back without additional effort on your part.
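
The capital N prefix for Unicode string constants can be sketched as follows. This is an illustration under stated assumptions, not a recommended data access pattern: nvarchar_literal is a hypothetical helper, and dbo.Customers and its Name column are made-up names. Real applications would normally pass values as parameters rather than build statements by concatenation.

```python
def nvarchar_literal(value: str) -> str:
    """Hypothetical helper: format a string as a Unicode string constant."""
    # Double any embedded single quotes so the constant stays well formed.
    escaped = value.replace("'", "''")
    # The capital N prefix marks the constant as Unicode.
    return "N'" + escaped + "'"


# dbo.Customers and Name are assumed names used only for illustration.
statement = (
    "INSERT INTO dbo.Customers (Name) VALUES ("
    + nvarchar_literal("Zoë O'Neil")
    + ");"
)
print(statement)  # INSERT INTO dbo.Customers (Name) VALUES (N'Zoë O''Neil');
```

Without the N prefix, the server treats the constant as non-Unicode character data, so characters outside the database code page can be lost before the value is stored.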

In Visual Basic applications, character strings are processed in the UCS-2 encoding scheme. Therefore, you do not have to specify encoding scheme conversion explicitly between these applications and an instance of SQL Server.

See Also

Concepts

Client-Side Programming with Unicode
Programming Database Applications That Use Unicode
