Microsoft Sql Server Tutorials: Find duplicates in a table and delete them using SQL Server 2005

It is a very common requirement to find duplicates in table based on a column or set of columns and then delete them. Let us see how to do that:

Sample Table

DROP TABLE #Customers
GO

CREATE TABLE #Customers (CustID int, CustName varchar(10), PostalCode int)
INSERT #Customers SELECT 1,'A',100
INSERT #Customers SELECT 1,'B',100
INSERT #Customers SELECT 2,'C',100
INSERT #Customers SELECT 3,'D',100
INSERT #Customers SELECT 4,'E',100
INSERT #Customers SELECT 5,'F',100
INSERT #Customers SELECT 5,'G',100
INSERT #Customers SELECT 6,'H',200
INSERT #Customers SELECT 7,'I',300

Find Duplicates in CustID

SELECT CustID
FROM #Customers
GROUP BY CustID HAVING count(*) > 1

returns you the duplicate CustID

Delete Duplicates in the Customer table

SET ROWCOUNT 1
SELECT NULL
WHILE @@rowcount > 0
DELETE cust
FROM #Customers as cust
INNER JOIN
(SELECT CustID
FROM #Customers
GROUP BY CustID HAVING count(*) > 1)
AS c ON c.CustID = cust.CustID
SET ROWCOUNT 0

SELECT * FROM #Customers

2 comments:

AnonymousMay 6, 2008 at 1:20 PM
SQL Server 2005? Use this:

;with DelDup as (select row_number() over (partition by
CustID order by CustID) as RowNofrom duplicateTest)
Delete from DelDup where RowNo> 1
UnknownMay 6, 2008 at 1:23 PM
Forgot to change the name of the table in the example I just left...

;with DelDup as (select row_number() over (partition by
CustID order by CustID) as RowNofrom #Customers)
Delete from DelDup where RowNo> 1