With data processing, in performing ETL workloads to a data warehouse, some of the activities in ETL can be achieved using the MERGE statement in SQL.
The MERGE statement in SQL is a special type of query in SQL Server that is capable to handle inserts, updates, and deletes at the same time. It is sometimes necessary during developing ETL programs that we need to check for the data that already exists in the database or not and then perform an update or insert based on the result returned. In such cases, you can leverage the MERGE statement as it is easier to write the code and also provides enhanced readability.
The above figure illustrates how the MERGE statement works. It is basically a comparison of set operation performed between the source and the destination tables. The MERGE statement can perform the above three checks.
- Not matched by Source – Records found in the destination but not in source.
- Matched Rows – Records found in both source and destination.
- Not matched by Destination – Records found in the source but not in the destination.
Hands-On Demonstration
I think doing some hands-on work with the MERGE statement will explain how it actually works and you will be able to do it yourself. I will create some basic tables and insert data into those. We will try to move data from the source tables and target tables using the MERGE statement.
CREATE TABLE SourceTable(
ProductID INT,
ProductName VARCHAR(50),
Price DECIMAL(9,2)
)
GO
INSERT INTO SourceTable(ProductID,ProductName, Price) VALUES(1,'Car',100)
INSERT INTO SourceTable(ProductID,ProductName, Price) VALUES(2,'Pen',80)
INSERT INTO SourceTable(ProductID,ProductName, Price) VALUES(3,'Mobile',50)
INSERT INTO SourceTable(ProductID,ProductName, Price) VALUES(4,'Laptop',300)
GO
CREATE TABLE TargetTable(
ProductID INT,
ProductName VARCHAR(50),
Price DECIMAL(9,2)
)
GO
INSERT INTO TargetTable(ProductID,ProductName, Price) VALUES(1,'Car',100)
INSERT INTO TargetTable(ProductID,ProductName, Price) VALUES(2,'Pen',180)
INSERT INTO TargetTable(ProductID,ProductName, Price) VALUES(5,'Washing Machine',50)
INSERT INTO TargetTable(ProductID,ProductName, Price) VALUES(6,'Keyboard',300)
GO
SELECT * FROM SourceTable
SELECT * FROM TargetTable
We can see the results from this code below.
Now, we have inserted data into the source and target tables, so let us write the MERGE query to synchronize the data between both the tables.
MERGE TargetTable AS Target
USING SourceTable AS Source
ON Source.ProductID = Target.ProductID
WHEN NOT MATCHED BY Target THEN
INSERT (ProductID,ProductName, Price)
VALUES (Source.ProductID,Source.ProductName, Source.Price);
It is evident from the below figure that upon executing the MERGE statement, the records with ProductID 3 and 4 have been inserted into the target table since they were not present. We have leveraged the condition when records are not matched by the target to perform an insert.
Now, let us extend the above functionality and write our code to perform updates on the target table when the records match between both the source and the target tables.
MERGE TargetTable AS Target
USING SourceTable AS Source
ON Source.ProductID = Target.ProductID
-- For Inserts
WHEN NOT MATCHED BY Target THEN
INSERT (ProductID,ProductName, Price)
VALUES (Source.ProductID,Source.ProductName, Source.Price)
-- For Updates
WHEN MATCHED THEN UPDATE SET
Target.ProductName = Source.ProductName,
Target.Price = Source.Price;
In the image below, we see the record with ProductID 2, which had a price of 180.00, has been updated to 80.00. This value is obtained from the source table and performed as the ProductID between both the tables match with each other.
Finally, let us also include the code to perform actions when records are not matched by the source table.
MERGE TargetTable AS Target
USING SourceTable AS Source
ON Source.ProductID = Target.ProductID
-- For Inserts
WHEN NOT MATCHED BY Target THEN
INSERT (ProductID,ProductName, Price)
VALUES (Source.ProductID,Source.ProductName, Source.Price)
-- For Updates
WHEN MATCHED THEN UPDATE SET
Target.ProductName = Source.ProductName,
Target.Price = Source.Price
-- For Deletes
WHEN NOT MATCHED BY Source THEN
DELETE;
I think with this explanation, you have a good bit of understanding about how the SQL MERGE statement works and get started writing your own queries with it.
Original post can be found at https://bit.ly/3ahnOJM