with cte as (
  select row_number() over (partition by dupcol1, dupcol2 order by ID) as rn
  from table)
delete from cte
where rn > 2;
The query assigns a row number to every record, partitioned by (dupcol1, dupcol2) and ordered by ID. In effect, the row number counts the "duplicates" that share the same dupcol1 and dupcol2, numbering them 1, 2, 3, ..., N in ID order. If you want to keep only 2 duplicates, you need to delete the ones that were assigned numbers 3, 4, ..., N, and that is the part DELETE ... WHERE rn > 2 takes care of.
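To see which rows would be removed before deleting anything, the same row numbering can be run as a plain SELECT. This is only a sketch: my_table is a placeholder name (it is not in the original answer), while dupcol1, dupcol2 and ID are the columns used above.

```sql
-- Preview only: nothing is deleted here.
-- my_table is a hypothetical table name; substitute your own.
WITH cte AS (
    SELECT ID, dupcol1, dupcol2,
           ROW_NUMBER() OVER (PARTITION BY dupcol1, dupcol2 ORDER BY ID) AS rn
    FROM my_table
)
SELECT ID, dupcol1, dupcol2, rn
FROM cte
WHERE rn > 2                      -- the rows the DELETE would remove
ORDER BY dupcol1, dupcol2, rn;
```

For a group of four rows with IDs 10, 11, 12 and 13 and identical dupcol1 and dupcol2, the row numbers are 1, 2, 3 and 4, so this preview lists IDs 12 and 13.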
With this method you can change the ORDER BY to whatever order you prefer (for example, ORDER BY ID DESC), so that the latest row gets rn = 1, the next latest gets rn = 2, and so on. The rest stays the same: the DELETE removes only the oldest rows, since they have the highest row numbers.
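A sketch of that variant, keeping the two newest rows in each group. It assumes that a higher ID means a newer row and that the database lets a DELETE target a CTE, as SQL Server does; my_table is again a placeholder name.

```sql
-- Keep the 2 newest rows per (dupcol1, dupcol2) group, delete the older ones.
-- Assumption: ID grows over time, so ORDER BY ID DESC puts the newest row first.
-- my_table is a hypothetical table name.
WITH cte AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY dupcol1, dupcol2 ORDER BY ID DESC) AS rn
    FROM my_table
)
DELETE FROM cte
WHERE rn > 2;
```

If "newest" is defined by a timestamp column instead of ID, that column would take the place of ID in the ORDER BY.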
Unlike in this closely related question, the CTE and row_number() approach actually gets simpler to use as the conditions become more complex. Performance can be a problem if no appropriate access index exists.
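One index that might serve as such an access path, sketched under assumptions: its key lists the partitioning columns first and the ordering column last, so the rows can be read already grouped and sorted for the ORDER BY ID form of the query. The index and table names are made up for illustration, and whether the optimizer actually uses the index depends on the database and the data.

```sql
-- Hypothetical names: IX_my_table_dupcols and my_table are placeholders.
-- Key order mirrors PARTITION BY dupcol1, dupcol2 ORDER BY ID.
CREATE INDEX IX_my_table_dupcols
    ON my_table (dupcol1, dupcol2, ID);
```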
Remus Rusanu