Inserting, Updating, and Deleting Data by Using MERGE

In SQL Server 2008, you can perform insert, update, or delete operations in a single statement using the MERGE statement. The MERGE statement allows you to join a data source with a target table or view, and then perform multiple actions against the target based on the results of that join. For example, you can use the MERGE statement to perform the following operations:

  • Conditionally insert or update rows in a target table.

    If the row exists in the target table, update one or more columns; otherwise, insert the data into a new row.

  • Synchronize two tables.

    Insert, update, or delete rows in a target table based on differences with the source data.

The MERGE syntax consists of five primary clauses:

  • The MERGE clause specifies the table or view that is the target of the insert, update, or delete operations.

  • The USING clause specifies the data source being joined with the target.

  • The ON clause specifies the join conditions that determine where the target and source match.

  • The WHEN clauses (WHEN MATCHED, WHEN NOT MATCHED BY TARGET, and WHEN NOT MATCHED BY SOURCE) specify the actions to take based on the results of the ON clause and any additional search criteria specified in the WHEN clauses.

  • The OUTPUT clause returns a row for each row in the target that is inserted, updated, or deleted.

For complete details on the syntax and rules, see MERGE (Transact-SQL).

Specifying Source and Target Search Conditions

It is important to understand how the source and target data are merged into a single input stream and how additional search criteria can be used to correctly filter out unneeded rows. Otherwise,you might specify the additional search criteria in a way that produces incorrect results.

Rows in the source are matched with rows in the target based on the join predicate specified in the ON clause. The result is a combined input stream. One insert, update, or delete operation is performed per input row. Depending on the WHEN clauses specified in the statement, the input row might be any one of the following:

  • A matched pair consisting of one row from the target and one from the source. This is the result of the WHEN MATCHED clause.

  • A row from the source that has no corresponding row in the target. This is the result of the WHEN NOT MATCHED BY TARGET clause.

  • A row from the target that has no corresponding row in the source. This is the result of the WHEN NOT MATCHED BY SOURCE clause.

The combination of WHEN clauses specified in the MERGE statement determines the join type that is implemented by the query processor and affects the resulting input stream. To illustrate, consider the following example source and target tables and data.


    USE tempdb;
    GO
    CREATE TABLE dbo.Target(EmployeeID int, EmployeeName varchar(10), 
         CONSTRAINT Target_PK PRIMARY KEY(EmployeeID));
    CREATE TABLE dbo.Source(EmployeeID int, EmployeeName varchar(10), 
         CONSTRAINT Source_PK PRIMARY KEY(EmployeeID));
    GO
    INSERT dbo.Target(EmployeeID, EmployeeName) VALUES(100, 'Mary');
    INSERT dbo.Target(EmployeeID, EmployeeName) VALUES(101, 'Sara');
    INSERT dbo.Target(EmployeeID, EmployeeName) VALUES(102, 'Stefano');
    
    GO
    INSERT dbo.Source(EmployeeID, EmployeeName) Values(103, 'Bob');
    INSERT dbo.Source(EmployeeID, EmployeeName) Values(104, 'Steve');
    GO

The following table lists the possible join types and indicates when each type is implemented by the query optimizer. The table also shows the resulting input stream for the example source and target tables when the search criteria for matching the source and target data is Source.EmployeeID = Target.EmployeeID.

Join type

Implementation

Example input stream results

INNER JOIN

The WHEN MATCHED clause is the only specified WHEN clause.

TrgEmpID TrgName SrcEmpID SrcName

-------- ------- -------- -------

NULL     NULL    NULL     NULL

LEFT OUTER JOIN

The WHEN NOT MATCHED BY TARGET clause is specified but the WHEN NOT MATCHED BY SOURCE clause is not specified. The WHEN MATCHED might or might not be specified.

TrgEmpID TrgName SrcEmpID SrcName

-------- ------- -------- -------100      Mary    NULL     NULL

101      Sara    NULL     NULL

102      Stefano NULL     NULL

RIGHT OUTER JOIN

The WHEN MATCHED clause and the WHEN NOT MATCHED BY SOURCE clause are specified, but the WHEN NOT MATCHED BY TARGET clause is not specified.

TrgEmpID TrgName SrcEmpID SrcName

-------- ------- -------- -------NULL     NULL    103      Bob

NULL     NULL    104      Steve

FULL OUTER JOIN

The WHEN NOT MATCHED BY TARGET clause and the WHEN NOT MATCHED BY SOURCE clause are specified. The WHEN MATCHED might or might not be specified.

TrgEmpID TrgName SrcEmpID SrcName

-------- ------- -------- -------100      Mary    NULL     NULL

101      Sara    NULL     NULL

102      Stefano NULL     NULL

NULL     NULL    103      Bob

NULL     NULL    104      Steve

ANTI SEMI JOIN

The WHEN NOT MATCHED BY SOURCE clause is the only specified WHEN clause.

TrgEmpID TrgName

-------- -------

100      Mary

101      Sara

102      Stefano

The example input stream results show that the input stream results depend on the combination of WHEN clauses. Now assume that you want to perform the following actions on the target table based on that input stream:

  • Insert rows from the source table when the employee ID does not exist in the target table and the source employee name starts with 'S'.

  • Delete rows in the target table when the target employee name starts with 'S' and the employee ID does not exist in the source table.

To perform these actions, the following WHEN clauses are required:

  • WHEN NOT MATCHED BY TARGET THEN INSERT

  • WHEN NOT MATCHED BY SOURCE THEN DELETE

As described in the previous table, when both of the WHEN NOT MATCHED clauses are specified, the resulting input stream is a full outer join of the source and target tables. Now that the input stream results are known, consider how the insert, update, and delete actions will be applied to the input stream.

As stated earlier, the WHEN clauses specify the actions to take based on the results of the ON clause and any additional search criteria specified in the WHEN clauses. In many cases, the search conditions specified in the ON clause produces the required input stream. However, in the example scenario, the insert and delete actions require additional filtering to restrict the affected rows to those that have an employee name that starts with 'S'. In the following example, the filtering conditions are applied to the WHEN NOT MATCHED BY TARGET and the WHEN NOT MATCHED BY SOURCE. The output from the statement shows that the expected rows from the input stream are corrected, inserted, or deleted.

-- MERGE statement with the join conditions specified correctly.
USE tempdb;
GO
BEGIN TRAN;
MERGE Target AS T
USING Source AS S
ON (T.EmployeeID = S.EmployeeID) 
WHEN NOT MATCHED BY TARGET AND S.EmployeeName LIKE 'S%' 
    THEN INSERT(EmployeeID, EmployeeName) VALUES(S.EmployeeID, S.EmployeeName)
WHEN MATCHED 
    THEN UPDATE SET T.EmployeeName = S.EmployeeName
WHEN NOT MATCHED BY SOURCE AND T.EmployeeName LIKE 'S%'
    THEN DELETE 
OUTPUT $action, inserted.*, deleted.*;
ROLLBACK TRAN;
GO 

Here are the results of the OUTPUT clause.

$action   EmployeeID  EmployeeName EmployeeID  EmployeeName

--------- ----------- ------------ ----------- ------------

DELETE    NULL        NULL         101         Sara

DELETE    NULL        NULL         102         Stefano

INSERT    104         Steve        NULL        NULL

(3 row(s) affected)

Reducing the number of rows in the input stream early in the process by specifying the additional search condition to the ON clause (for example, by specifying ON Source.EmployeeID = Target.EmployeeID AND EmployeeName LIKE 'S%') might seem to improve query performance. However, doing so can cause unexpected and incorrect results. Because the additional search conditions specified in the ON clause are not used for matching the source and target data, they can be misapplied.

The following example demonstrates how incorrect results can occur. The search condition for matching the source and target tables and the additional search condition for filtering rows are both specified in the ON clause. Because the additional search condition is not required to determine source and target matching, the insert and delete actions are applied to all input rows. In effect, the filtering condition EmployeeName LIKE 'S%' is ignored. When the statement is run, the output of the inserted and deleted tables shows that two rows are incorrectly modified: Mary is incorrectly deleted from the target table, and Bob is incorrectly inserted.

    -- MERGE statement with join conditions that produce unexpected results.
    USE tempdb;
    GO
    BEGIN TRAN;
    MERGE Target AS T
    USING Source AS S
    ON (T.EmployeeID = S.EmployeeID AND T.EmployeeName LIKE 'S%' 
        AND S.EmployeeName LIKE 'S%' )
    WHEN NOT MATCHED BY TARGET
        THEN INSERT(EmployeeID, EmployeeName) VALUES(S.EmployeeID, S.EmployeeName)
    WHEN MATCHED 
        THEN UPDATE SET T.EmployeeName = S.EmployeeName
    WHEN NOT MATCHED BY SOURCE
        THEN DELETE
    OUTPUT $action, Inserted.*, Deleted.*;
    ROLLBACK TRAN;
    GO

Here are the results of the OUTPUT clause.

$action   EmployeeID  EmployeeName EmployeeID  EmployeeName

--------- ----------- ------------ ----------- ------------

DELETE    NULL        NULL         100         Mary

DELETE    NULL        NULL         101         Sara

DELETE    NULL        NULL         102         Stefano

INSERT    103         Bob          NULL        NULL

INSERT    104         Steve        NULL        NULL

(5 row(s) affected)

Search Condition Guidelines

The search conditions used to match the source and target rows and the additional search conditions used to filter rows from either the source or target must be specified correctly to ensure that correct results are obtained. We recommend following these guidelines:

  • Specify only search conditions in the ON <merge_search_condition> clause that determine the criteria for matching data in the source and target tables. That is, specify only columns from the target table that are compared to the corresponding columns of the source table.

  • Do not include comparisons to other values such as a constant.

To filter out rows from the source or target tables, use one of the following methods:

  • Specify the search condition for row filtering in the appropriate WHEN clause. For example, WHEN NOT MATCHED AND S.EmployeeName LIKE 'S%' THEN INSERT....

  • Define a view on the source or target that returns the filtered rows and reference the view as the source or target table. If the view is defined on the target table, any actions against it must satisfy the conditions for updating views. For more information about updating data by using a view, see Modifying Data Through a View.

  • Use the WITH <common table expression> clause to filter out rows from the source or target tables. This method is similar to specifying additional search criteria in the ON clause and may produce incorrect results. We recommend that you avoid using this method or test thoroughly before implementing it.

Examples

A. Using a simple MERGE statement to perform INSERT and UPDATE operations

Suppose you have a FactBuyingHabits table in a data warehouse database that tracks the last date each customer bought a specific product. A second table, Purchases, in an OLTP database records purchases during a given week. Each week, you want to add rows of products that specific customers never bought before from the Purchases table to the FactBuyingHabits table. For rows of customers buying products they have already bought before, you simply want to update the date of purchase in the FactBuyingHabits table. These insert and update operations can be performed in a single statement using MERGE.

The following example first creates tables Purchases and FactBuyingHabits and loads them with some sample data. Performance on MERGE statements improves when UNIQUE indexes are created on the join key, so clustered indexes are created by creating a PRIMARY KEY constraint on the ProductID column in both tables.

In this example, Purchases contains purchases for the week of August 21, 2006. FactBuyingHabits contains purchases for the prior week; ordinarily this table would be populated with rows dating back much earlier.


    USE AdventureWorks2008R2;
    GO
    IF OBJECT_ID (N'dbo.Purchases', N'U') IS NOT NULL 
        DROP TABLE dbo.Purchases;
    GO
    CREATE TABLE dbo.Purchases (
        ProductID int, CustomerID int, PurchaseDate datetime, 
        CONSTRAINT PK_PurchProdID PRIMARY KEY(ProductID,CustomerID));
    GO
    INSERT INTO dbo.Purchases VALUES(707, 11794, '20060821'),
    (707, 15160, '20060825'),(708, 18529, '20060821'),
    (711, 11794, '20060821'),(711, 19585, '20060822'),
    (712, 14680, '20060825'),(712, 21524, '20060825'),
    (712, 19072, '20060821'),(870, 15160, '20060823'),
    (870, 11927, '20060824'),(870, 18749, '20060825');
    GO
    IF OBJECT_ID (N'dbo.FactBuyingHabits', N'U') IS NOT NULL 
        DROP TABLE dbo.FactBuyingHabits;
    GO
    CREATE TABLE dbo.FactBuyingHabits (
        ProductID int, CustomerID int, LastPurchaseDate datetime, 
        CONSTRAINT PK_FactProdID PRIMARY KEY(ProductID,CustomerID));
    GO
    INSERT INTO dbo.FactBuyingHabits VALUES(707, 11794, '20060814'),
    (707, 18178, '20060818'),(864, 14114, '20060818'),
    (866, 13350, '20060818'),(866, 20201, '20060815'),
    (867, 20201, '20060814'),(869, 19893, '20060815'),
    (870, 17151, '20060818'),(870, 15160, '20060817'),
    (871, 21717, '20060817'),(871, 21163, '20060815'),
    (871, 13350, '20060815'),(873, 23381, '20060815');
    GO

The tables are now populated with the following data:

dbo.Purchases

ProductID   CustomerID  PurchaseDate

----------- ----------- -----------------------

707         11794       2006-08-20 00:00:00.000

707         15160       2006-08-25 00:00:00.000

708         18529       2006-08-21 00:00:00.000

711         11794       2006-08-20 00:00:00.000

711         19585       2006-08-22 00:00:00.000

712         14680       2006-08-26 00:00:00.000

712         21524       2006-08-26 00:00:00.000

712         19072       2006-08-20 00:00:00.000

870         15160       2006-08-23 00:00:00.000

870         11927       2006-08-24 00:00:00.000

870         18749       2006-08-25 00:00:00.000

dbo.FactBuyingHabits

ProductID   CustomerID  LastPurchaseDate

----------- ----------- -----------------------

707         11794       2006-08-14 00:00:00.000

707         18178       2006-08-18 00:00:00.000

864         14114       2006-08-18 00:00:00.000

866         13350       2006-08-18 00:00:00.000

866         20201       2006-08-15 00:00:00.000

867         20201       2006-08-14 00:00:00.000

869         19893       2006-08-15 00:00:00.000

870         17151       2006-08-18 00:00:00.000

870         15160       2006-08-17 00:00:00.000

871         21717       2006-08-17 00:00:00.000

871         21163       2006-08-15 00:00:00.000

871         13350       2006-08-15 00:00:00.000

873         23381       2006-08-15 00:00:00.000

Note that there are two Product-Customer rows common to both tables: Customer 11794 purchased Product 707 during the current week as well as the prior week, and the same is true of Customer 15160's purchase of Product 870. For those rows, we update FactBuyingHabits with the date recorded for those purchases in Purchases by using the WHEN MATCHED THEN clause. We insert all other rows into FactBuyingHabits by using the WHEN NOT MATCHED THEN clause.


MERGE dbo.FactBuyingHabits AS Target
USING (SELECT CustomerID, ProductID, PurchaseDate FROM dbo.Purchases) AS Source
ON (Target.ProductID = Source.ProductID AND Target.CustomerID = Source.CustomerID)
WHEN MATCHED THEN
    UPDATE SET Target.LastPurchaseDate = Source.PurchaseDate
WHEN NOT MATCHED BY TARGET THEN
    INSERT (CustomerID, ProductID, LastPurchaseDate)
    VALUES (Source.CustomerID, Source.ProductID, Source.PurchaseDate)
OUTPUT $action, Inserted.*, Deleted.*; 

B. Performing UPDATE and DELETE operations

The following example uses MERGE to update the ProductInventory table in the AdventureWorks2008R2 sample database daily, based on orders that are processed in the SalesOrderDetail table. Using the following MERGE statement, the Quantity column of the ProductInventory table is updated by subtracting the number of orders placed each day for each product. If the number of orders for a product causes the product's inventory to drop to 0 or below, the row for that product is deleted from the ProductInventory table. Note that the source table is aggregated on the ProductID column. If this were not done, more than one ProductID in the source table might match the target table and cause the MERGE statement to return an error.


    USE AdventureWorks2008R2;
    GO
    IF OBJECT_ID (N'Production.usp_UpdateInventory', N'P') IS NOT NULL DROP PROCEDURE Production.usp_UpdateInventory;
    GO
    CREATE PROCEDURE Production.usp_UpdateInventory
        @OrderDate datetime
    AS
    MERGE Production.ProductInventory AS target
    USING (SELECT ProductID, SUM(OrderQty) FROM Sales.SalesOrderDetail AS sod
        JOIN Sales.SalesOrderHeader AS soh
        ON sod.SalesOrderID = soh.SalesOrderID
        AND soh.OrderDate = @OrderDate
        GROUP BY ProductID) AS source (ProductID, OrderQty)
    ON (target.ProductID = source.ProductID)
    WHEN MATCHED AND target.Quantity - source.OrderQty <= 0
        THEN DELETE
    WHEN MATCHED 
        THEN UPDATE SET target.Quantity = target.Quantity - source.OrderQty, 
                        target.ModifiedDate = GETDATE()
    OUTPUT $action, Inserted.ProductID, Inserted.Quantity, Inserted.ModifiedDate, Deleted.ProductID,
        Deleted.Quantity, Deleted.ModifiedDate;
    GO
    
    EXECUTE Production.usp_UpdateInventory '20030501'

C. Performing INSERT, UPDATE, and DELETE operations

The following example uses MERGE to insert, update, or delete rows in a target table based on differences with the source data. Consider a small company with five departments, each with a department manager. The company decides to re-organize its departments. To implement the re-organization results in the target table dbo.Departments, the MERGE statement must implement the following changes:

  • Some existing departments will not change.

  • Some existing departments will have new managers.

  • Some departments are newly created.

  • Some departments will not exist after the reorganization.

The following code creates the target table dbo.Departments and populates it with managers.


    USE AdventureWorks2008R2;
    GO
    IF OBJECT_ID (N'dbo.Departments', N'U') IS NOT NULL 
        DROP TABLE dbo.Departments;
    GO
    CREATE TABLE dbo.Departments (DeptID tinyint NOT NULL PRIMARY KEY, DeptName nvarchar(30), 
        Manager nvarchar(50));
    GO
    INSERT INTO dbo.Departments 
        VALUES (1, 'Human Resources', 'Margheim'),(2, 'Sales', 'Byham'), 
               (3, 'Finance', 'Gill'),(4, 'Purchasing', 'Barber'),
               (5, 'Manufacturing', 'Brewer');

The organizational changes to be made to the departments are stored in the source table dbo.Departments_delta. The following code creates and populates this table:


    USE AdventureWorks2008R2;
    GO
    IF OBJECT_ID (N'dbo.Departments_delta', N'U') IS NOT NULL 
        DROP TABLE dbo.Departments_delta;
    GO
    CREATE TABLE dbo.Departments_delta (DeptID tinyint NOT NULL PRIMARY KEY, DeptName nvarchar(30), 
        Manager nvarchar(50));
    GO
    INSERT INTO dbo.Departments_delta VALUES 
        (1, 'Human Resources', 'Margheim'), (2, 'Sales', 'Erickson'),
        (3 , 'Accounting', 'Varkey'),(4, 'Purchasing', 'Barber'), 
        (6, 'Production', 'Jones'), (7, 'Customer Relations', 'Smith');
    GO

Finally, to reflect the company reorganization in the target table, the following code uses the MERGE statement to compare the source table, dbo.Departments_delta, with the target table dbo.Departments. The search condition for this comparison is defined in the ON clause of the statement. Based on the results of the comparison, the following actions are taken.

  • Departments that exist in both tables are updated in the target table with new names, new managers, or both in table Departments. If there are no changes, nothing is updated. This is accomplished in the WHEN MATCHED THEN clause.

  • Any departments in Departments_delta that don't exist in Departments are inserted into Departments. This is accomplished in the WHEN NOT MATCHED THEN clause.

  • Any departments in Departments that do not exist in the source table Departments_delta are deleted from Departments. This is accomplished in the WHEN NOT MATCHED BY SOURCE THEN clause.



MERGE dbo.Departments AS d
USING dbo.Departments_delta AS dd
ON (d.DeptID = dd.DeptID)
WHEN MATCHED AND d.Manager <> dd.Manager OR d.DeptName <> dd.DeptName
    THEN UPDATE SET d.Manager = dd.Manager, d.DeptName = dd.DeptName
WHEN NOT MATCHED THEN
    INSERT (DeptID, DeptName, Manager)
        VALUES (dd.DeptID, dd.DeptName, dd.Manager)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE
OUTPUT $action, 
       inserted.DeptID AS SourceDeptID, inserted.DeptName AS SourceDeptName, 
       inserted.Manager AS SourceManager, 
       deleted.DeptID AS TargetDeptID, deleted.DeptName AS TargetDeptName, 
       deleted.Manager AS TargetManager;