Delete records that are considered duplicates based on the same value in a column, and keep the latest

I would like to delete records that are considered duplicates because they have the same value in a particular column, and keep only one of them: the newest, as determined by InsertedDate in my example below. I would like a solution that does not use a cursor but is set-based. Goal: remove all duplicates and keep the latest.

The DDL below creates several duplicates. The entries to be deleted are John1 and John2, because they have the same ID as John3, and John3 is the newest entry.

The John5 record must also be deleted, because there is another, newer record with ID = 3 (John6).

    Create table dbo.TestTable (ID int, InsertedDate DateTime, Name varchar(50))

    Insert into dbo.TestTable Select 1, '07/01/2009', 'John1'
    Insert into dbo.TestTable Select 1, '07/02/2009', 'John2'
    Insert into dbo.TestTable Select 1, '07/03/2009', 'John3'
    Insert into dbo.TestTable Select 2, '07/03/2009', 'John4'
    Insert into dbo.TestTable Select 3, '07/05/2009', 'John5'
    Insert into dbo.TestTable Select 3, '07/06/2009', 'John6'
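A quick way to see which IDs are affected, sketched as a plain aggregate over the sample table (the column aliases RowCnt and NewestDate are illustrative names, not part of the question). For the sample data it should list IDs 1 and 3 only, since ID 2 occurs once.

```sql
-- List each ID that occurs more than once, with its row count
-- and the InsertedDate of the row that should survive.
select ID,
       count(*)          as RowCnt,      -- illustrative alias
       max(InsertedDate) as NewestDate   -- illustrative alias
from dbo.TestTable
group by ID
having count(*) > 1;
```

Running the same query after the cleanup should return no rows.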
sql sql-server
2 answers

This works:

    delete t
    from TestTable t
    left join (
        select id, InsertedDate = max(InsertedDate)
        from TestTable
        group by id
    ) as sub
        on sub.id = t.id
       and sub.InsertedDate = t.InsertedDate
    where sub.id is null

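If you need to deal with ties (several rows that share the same ID and the same latest InsertedDate), it gets a little harder, because this query keeps all of the tied rows.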
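One possible way to handle ties, as a minimal sketch: number the rows within each ID and delete everything except the first. It assumes that Name is an acceptable tie-breaker when two rows share the same InsertedDate; that choice is illustrative and not something the question specifies. The CTE name ranked and the column rn are likewise made up for the example.

```sql
-- Number the rows within each ID, newest first.
-- Assumption: Name breaks ties on InsertedDate. Rows that are identical in
-- every column still get distinct numbers (in arbitrary order), so exactly
-- one row per ID is numbered 1 and kept.
with ranked as (
    select *,
           row_number() over (partition by ID
                              order by InsertedDate desc, Name desc) as rn
    from dbo.TestTable
)
delete from ranked
where rn > 1;
```

In SQL Server the statement before a WITH clause must be terminated with a semicolon, so when this runs inside a larger batch the preceding statement needs its `;`.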


As an academic exercise:

    with cte as (
        select *,
               row_number() over (partition by ID order by InsertedDate desc) as rn
        from TestTable
    )
    delete from cte
    where rn <> 1;

In most cases, the solution proposed by Sam is much better.
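Whichever statement is used, it can be tried safely first. The following is a sketch that wraps the CTE delete in a transaction, uses the OUTPUT clause to show the rows being removed, and then rolls back. For the sample data the removed rows should be John1, John2 and John5.

```sql
begin transaction;

with cte as (
    select *,
           row_number() over (partition by ID order by InsertedDate desc) as rn
    from dbo.TestTable
)
delete from cte
output deleted.ID, deleted.InsertedDate, deleted.Name  -- rows being removed
where rn <> 1;

-- Inspect what is left in the table.
select ID, InsertedDate, Name
from dbo.TestTable
order by ID;

-- Undo the delete; switch to COMMIT TRANSACTION once the result looks right.
rollback transaction;
```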
