Deleting records is a very common activity performed against databases and everyone is familiar with DELETE statement. Here is a question based on DELETE statement.
What is the easiest way of deleting duplicated records from a table if you can identify the duplicated records by a column?
There can be multiple ways of doing it. Before seeing the solution, let's make a table with duplicates.
Here are two ways of deleting records, both work fine, however first method might give better performance over a large dataset.
What is the easiest way of deleting duplicated records from a table if you can identify the duplicated records by a column?
There can be multiple ways of doing it. Before seeing the solution, let's make a table with duplicates.
IF OBJECT_ID('dbo.ProductCode') IS NOT NULL DROP TABLE dbo.ProductCode; GO CREATE TABLE dbo.ProductCode ( ProductCode char(5) , ProductType char(5) ); GO INSERT INTO dbo.ProductCode VALUES ('PR001', 'T0001') , ('PR001', 'T0001') , ('PR002', 'T0002') , ('PR003', 'T0003') , ('PR001', 'T0001') , ('PR004', 'T0004') , ('PR003', 'T0003')
Here are two ways of deleting records, both work fine, however first method might give better performance over a large dataset.
-- Here is the way of deleting duplicates WITH cte AS ( SELECT *, ROW_NUMBER() OVER (PARTITION BY ProductCode ORDER BY ProductCode) AS RowNumber FROM dbo.ProductCode) DELETE FROM cte WHERE RowNumber > 1 ; -- Checking records SELECT * FROM dbo.ProductCode ORDER BY ProductCode;
-- Here is another way of deleting them SET ROWCOUNT 1 WHILE 1=1 BEGIN DELETE FROM dbo.ProductCode WHERE ProductCode IN (SELECT ProductCode FROM dbo.ProductCode GROUP BY ProductCode HAVING COUNT(*) > 1) IF @@ROWCOUNT = 0 BREAK; END SET ROWCOUNT 0 -- Checking records SELECT * FROM dbo.ProductCode ORDER BY ProductCode;
No comments:
Post a Comment