Performing Bulk Copy Operations
The SQL Server bulk copy feature supports the transfer of large amounts of data into or out of a SQL Server table or view. Data can also be transferred out by specifying a SELECT statement. The data can be moved between SQL Server and an operating-system data file, such as an ASCII file. The data file can have different formats; the format is defined to bulk copy in a format file. Optionally, data can be loaded into program variables and transferred to SQL Server using bulk copy functions and methods.
CodePlex has a sample application that demonstrates this feature; see SQL Server Database Engine Samples for more information.
An application typically uses bulk copy in one of the following ways:
Bulk copy from a table, view, or the result set of a Transact-SQL statement into a data file where the data is stored in the same format as the table or view.
This is called a native-mode data file.
Bulk copy from a table, view, or the result set of a Transact-SQL statement into a data file where the data is stored in a format other than the one of the table or view.
In this case, a separate format file is created that defines the characteristics (data type, position, length, terminator, and so on) of each column as it is stored in the data file. If all columns are converted to character format, the resulting file is called a character-mode data file.
Bulk copy from a data file into a table or view.
If needed, a format file is used to determine the layout of the data file.
Load data into program variables, then import the data into a table or view using the bulk copy functions for bulk copying in a row at a time.
Data files used by bulk copy functions do not have to be created by another bulk copy program. Any other system can generate a data file and format file according to bulk copy definitions; these files can then be used with a SQL Server bulk copy program to import data into SQL Server. For example, you could export data from a spreadsheet in a tab-delimited file, build a format file describing the tab-delimited file, and then use a bulk copy program to quickly import the data into SQL Server. Data files generated by bulk copy can also be imported into other applications. For example, you could use bulk copy functions to export data from a table or view into a tab-delimited file that could then be loaded into a spreadsheet.
Beginning with SQL Server 2005, when you use the bcp utility the server reports an error when numeric data truncation occurs. SQL Server 2000 and earlier releases only returned a warning. This may cause problems for existing applications that ignore the warning. These problems can be avoided by ensuring that the input data has correct values that will not be truncated, or by continuing to use the SQL Server 2000 version of bcp.
Programmers coding applications to use the bulk copy functions should follow the general rules for good bulk copy performance. For more information about support for bulk copy operations in SQL Server, see About Bulk Import and Bulk Export Operations.
A CLR user-defined type (UDT) must be bound as binary data. Even if a format file specifies SQLCHAR as the data type for a target UDT column, The BCP utility will treat the data as binary.
Do not use SET FMTONLY OFF with bulk copy operations. SET FMTONLY OFF may cause your bulk copy operation to fail or give unexpected results.
The SQL Server Native Client OLE DB provider implements two methods for performing bulk copy operations with a SQL Server database. The first method involves using the IRowsetFastLoad interface for memory-based bulk copy operations; and the second involves using the IBCPSession interface for file-based bulk copy operations.
Using Memory Based Bulk Copy Operations
The SQL Server Native Client OLE DB provider implements the IRowsetFastLoad interface to expose support for SQL Server memory-based bulk copy operations. The IRowsetFastLoad interface implements the IRowsetFastLoad::Commit and IRowsetFastLoad::InsertRow methods.
Enabling a Session for IRowsetFastLoad
The consumer notifies the SQL Server Native Client OLE DB provider of its need for bulk copy by setting the SQL Server Native Client OLE DB provider-specific data source property SSPROP_ENABLEFASTLOAD to VARIANT_TRUE. With the property set on the data source, the consumer creates a SQL Server Native Client OLE DB provider session. The new session allows consumer access to the IRowsetFastLoad interface.
If the IDataInitialize interface is used for initializing the data source, then it is necessary to set the SSPROP_IRowsetFastLoad property in the rgPropertySets parameter of the IOpenRowset::OpenRowset method; otherwise, the call to the OpenRowset method will return E_NOINTERFACE.
Enabling a session for bulk copy constrains the SQL Server Native Client OLE DB provider support for interfaces on the session. A bulk copy-enabled session exposes only the following interfaces:
To disable the creation of bulk copy-enabled rowsets and cause the SQL Server Native Client OLE DB provider session to revert to standard processing, reset SSPROP_ENABLEFASTLOAD to VARIANT_FALSE.
The SQL Server Native Client OLE DB provider bulk copy rowsets are write-only, but they expose interfaces that allow the consumer to determine the structure of a SQL Server table. The following interfaces are exposed on a bulk copy-enabled SQL Server Native Client OLE DB provider rowset:
The provider-specific properties SSPROP_FASTLOADOPTIONS, SSPROP_FASTLOADKEEPNULLS, and SSPROP_FASTLOADKEEPIDENTITY control behaviors of a SQL Server Native Client OLE DB provider bulk-copy rowset. The properties are specified in the rgProperties member of an rgPropertySets IOpenRowset parameter member.
Description: Maintains identity values supplied by the consumer.
VARIANT_FALSE: Values for an identity column in the SQL Server table are generated by SQL Server. Any value bound for the column is ignored by the SQL Server Native Client OLE DB provider.
VARIANT_TRUE: The consumer binds an accessor providing a value for a SQL Server identity column. The identity property is not available on columns accepting NULL, so the consumer provides a unique value on each IRowsetFastLoad::Insert call.
Description: Maintains NULL for columns with a DEFAULT constraint. Affects only SQL Server columns that accept NULL and have a DEFAULT constraint applied.
VARIANT_FALSE: SQL Server inserts the default value for the column when the SQL Server Native Client OLE DB provider consumer inserts a row containing NULL for the column.
VARIANT_TRUE: SQL Server inserts NULL for the column value when the SQL Server Native Client OLE DB provider consumer inserts a row containing NULL for the column.
Description: This property is the same as the -h "hint[,...n]" option of the bcp utility. The following string(s) can be used as option(s) in the bulk copying of data into a table.
ORDER(column[ASC | DESC][,...n]): Sort order of data in the data file. Bulk copy performance is improved if the data file being loaded is sorted according to the clustered index on the table.
ROWS_PER_BATCH = bb: Number of rows of data per batch (as bb). The server optimizes the bulk load according to the value bb. By default, ROWS_PER_BATCH is unknown.
KILOBYTES_PER_BATCH = cc: Number of kilobytes (KB) of data per batch (as cc). By default, KILOBYTES_PER_BATCH is unknown.
TABLOCK: A table-level lock is acquired for the duration of the bulk copy operation. This option significantly improves performance because holding a lock only for the duration of the bulk copy operation reduces lock contention on the table. A table can be loaded by multiple clients concurrently if the table has no indexes and TABLOCK is specified. By default, the locking behavior is determined by the table option table lock on bulk load.
CHECK_CONSTRAINTS: Any constraints on table_name are checked during the bulk copy operation. By default, constraints are ignored.
FIRE_TRIGGER: In SQL Server 2000, with triggers enabled, optimized logging was not possible because the trigger logic was based on log records. During a bulk import operation with triggers enabled, all bulk logging optimizations (including BU locks) were disabled.
Beginning with SQL Server 2005, however, SQL Server uses row versioning for triggers and stores the row versions in the version store in tempdb. Therefore, bulk logging optimizations are available even when triggers are enabled. Before bulk importing a batch with a large number of rows with triggers enabled, you may need to expand the size of tempdb.
Using File Based Bulk Copy Operations
The SQL Server Native Client OLE DB provider implements the IBCPSession interface to expose support for SQL Server file-based bulk copy operations. The IBCPSession interface implements the IBCPSession::BCPColFmt, IBCPSession::BCPColumns, IBCPSession::BCPControl, IBCPSession::BCPDone, IBCPSession::BCPExec, IBCPSession::BCPInit, IBCPSession::BCPReadFmt, and IBCPSession::BCPWriteFmtmethods.
The SQL Server Native Client ODBC driver maintains the same support for bulk copy operations that were part of previous versions of the SQL Server ODBC driver. For information about bulk copy operations using the SQL Server Native Client ODBC driver, see Performing Bulk Copy Operations (ODBC).