Managing Data Conversion Between Unicode Encoding Schemes

12/03/2008

This topic describes how to preserve the integrity of character data when both server-side data storage and the client application that interacts with the data are Unicode-enabled, but use different Unicode encoding schemes. SQL Server stores Unicode in the UCS-2 encoding scheme. However, many clients process Unicode in another encoding scheme, generally UTF-8. This scenario frequently occurs for Web-based applications.
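To make the difference between the two encoding schemes concrete, the following minimal Python sketch encodes the same string both ways. It assumes that Python's `utf-16-le` codec is an acceptable stand-in for UCS-2, which holds only for characters in the Basic Multilingual Plane. The sample string is an arbitrary illustration, not taken from any SQL Server documentation.

```python
# Illustrative only: the same text has different byte sequences
# in UTF-8 and in UCS-2 (approximated here by UTF-16LE).
text = "café €"

utf8_bytes = text.encode("utf-8")      # variable width: 1 to 3 bytes per character here
ucs2_bytes = text.encode("utf-16-le")  # fixed width: 2 bytes per BMP character

print(len(utf8_bytes), utf8_bytes.hex())
print(len(ucs2_bytes), ucs2_bytes.hex())

# The byte sequences differ, but both decode to the same characters,
# provided each is decoded with the scheme that produced it.
assert utf8_bytes != ucs2_bytes
assert utf8_bytes.decode("utf-8") == ucs2_bytes.decode("utf-16-le") == text
```

Decoding either byte sequence with the wrong scheme is what corrupts character data, which is why the client and server must agree on which scheme is in use at each boundary.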

Because you are essentially still converting from one encoding scheme to another, many of the solutions discussed in the topics Managing Data Conversion Between a Unicode Server and a Non-Unicode Client and Managing Data Conversion Between Client/Server Code Pages also apply. Unicode character string constants sent to the server must be preceded by a capital N. For Web-based applications, specify the CHARSET value in a META tag of the client-side HTML page. For example, specify CHARSET = utf-8 if the Unicode encoding scheme is UTF-8. On the server side, specify the encoding scheme of the client by using the Session.CodePage property or the @Codepage directive. For example, codepage=65001 specifies the UTF-8 encoding scheme. If you follow these directions, Internet Information Services (IIS) 5.0 or later handles the conversion from UTF-8 to UCS-2 and back without additional effort on your part.
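The conversion described above can be sketched in Python to show what happens conceptually at each boundary. This is not how IIS or SQL Server implements it. The function names `utf8_to_ucs2`, `ucs2_to_utf8`, and `to_nvarchar_literal` are hypothetical helpers introduced for illustration, and `utf-16-le` is again assumed as a stand-in for UCS-2.

```python
def utf8_to_ucs2(payload: bytes) -> bytes:
    """Hypothetical helper: re-encode UTF-8 client bytes as UCS-2 (UTF-16LE)."""
    text = payload.decode("utf-8")  # raises UnicodeDecodeError on malformed input
    # UCS-2 cannot represent characters above U+FFFF, so reject them explicitly.
    if any(ord(ch) > 0xFFFF for ch in text):
        raise ValueError("character outside the Basic Multilingual Plane")
    return text.encode("utf-16-le")


def ucs2_to_utf8(stored: bytes) -> bytes:
    """Hypothetical helper: re-encode UCS-2 (UTF-16LE) bytes as UTF-8 for the client."""
    return stored.decode("utf-16-le").encode("utf-8")


def to_nvarchar_literal(text: str) -> str:
    """Hypothetical helper: build an N'...' constant, doubling embedded single quotes."""
    return "N'" + text.replace("'", "''") + "'"


client_bytes = "Zürich 東京".encode("utf-8")
stored_bytes = utf8_to_ucs2(client_bytes)

# The round trip returns the original client bytes unchanged.
assert ucs2_to_utf8(stored_bytes) == client_bytes

print(to_nvarchar_literal("O'Brien"))
```

The literal helper only illustrates the N prefix. In application code, parameterized queries are generally a safer way to pass Unicode strings than building constants by hand.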

In Visual Basic applications, character strings are processed in the UCS-2 encoding scheme. Therefore, you do not have to specify encoding scheme conversion explicitly between these applications and an instance of SQL Server.
