Delete where one column contains duplicates

Consider the following:

    ProductID  Supplier
    ---------  ---------
    111        Microsoft
    112        Microsoft
    222        Apple Mac
    222        Apple
    223        Apple

In this example, product 222 is repeated because the supplier appears under two different names in the data provided.

I have data like this for thousands of products. How can I delete the duplicate products, or select a single row per product (something like a self-join with SELECT TOP 1)?

Thanks!

3 answers

I think you want to do the following:

    select t.*
    from (select t.*,
                 row_number() over (partition by ProductID order by (select NULL)) as seqnum
          from t
         ) t
    where seqnum = 1;

This selects an arbitrary row for each product.
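As a rough way to see what this returns on the sample data from the question, here is a minimal sketch that runs the same idea through Python's built-in sqlite3 module rather than SQL Server. The table name `products` is an assumption, and SQLite (3.25 or later) lets ROW_NUMBER() omit ORDER BY, so the `order by (select NULL)` placeholder that SQL Server needs is left out here.

```python
import sqlite3

# Sample rows from the question; the table name "products" is an assumption.
ROWS = [
    (111, "Microsoft"),
    (112, "Microsoft"),
    (222, "Apple Mac"),
    (222, "Apple"),
    (223, "Apple"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (ProductID INTEGER, Supplier TEXT)")
conn.executemany("INSERT INTO products VALUES (?, ?)", ROWS)

# Number the rows inside each ProductID group and keep only the first one.
# Without an ORDER BY in the window, which row gets number 1 is arbitrary.
one_per_product = conn.execute(
    """
    SELECT ProductID, Supplier
    FROM (
        SELECT p.*,
               ROW_NUMBER() OVER (PARTITION BY ProductID) AS seqnum
        FROM products AS p
    )
    WHERE seqnum = 1
    ORDER BY ProductID
    """
).fetchall()

for product_id, supplier in one_per_product:
    print(product_id, supplier)
```

Each ProductID comes back exactly once. For 222 the supplier may be either "Apple" or "Apple Mac", since nothing in the query says which one to prefer.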

To delete all rows except one per product, you can use the same idea:

    with todelete as (
          select t.*,
                 row_number() over (partition by ProductID order by (select NULL)) as seqnum
          from t
         )
    delete from todelete
    where seqnum > 1;

To keep only the row with the lowest ProductID for each Supplier:

    DELETE a
    FROM tableName a
    LEFT JOIN (
        SELECT Supplier, MIN(ProductID) AS min_ID
        FROM tableName
        GROUP BY Supplier
    ) b
        ON a.Supplier = b.Supplier
       AND a.ProductID = b.min_ID
    WHERE b.Supplier IS NULL;

Or, if you want to remove the extra rows for any ProductID that appears more than once:

    WITH cte AS (
        SELECT ProductID, Supplier,
               ROW_NUMBER() OVER (PARTITION BY ProductID ORDER BY Supplier) rn
        FROM tableName
    )
    DELETE FROM cte
    WHERE rn > 1;

Another option is to select one row per ProductID with a CTE:

    ;WITH Products_CTE AS (
        SELECT ProductID, Supplier,
               ROW_NUMBER() OVER (PARTITION BY ProductID ORDER BY <some value>) AS rn
        FROM PRODUCTS
    )
    SELECT *
    FROM Products_CTE
    WHERE rn = 1;

`<some value>` is the key that determines which supplier row you keep for each product. If you want the first supplier entry, you could order by a DateAdded column, if one exists.
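To illustrate how the ordering key changes which row survives, here is a small sketch using Python's sqlite3 module instead of SQL Server. The `products` table, the `DateAdded` values, and the helper `first_row_per_product` are all made up for the demonstration; only the ROW_NUMBER() pattern comes from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (ProductID INTEGER, Supplier TEXT, DateAdded TEXT)"
)
# Hypothetical data: the DateAdded values are invented for this sketch.
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        (111, "Microsoft", "2020-01-05"),
        (222, "Apple Mac", "2020-02-01"),
        (222, "Apple", "2021-06-15"),
    ],
)


def first_row_per_product(order_column):
    """Return one (ProductID, Supplier) row per product, chosen by order_column."""
    # Column names cannot be bound as parameters, so check against a whitelist.
    if order_column not in ("Supplier", "DateAdded"):
        raise ValueError(f"unsupported ordering column: {order_column}")
    query = f"""
        WITH Products_CTE AS (
            SELECT ProductID, Supplier,
                   ROW_NUMBER() OVER (
                       PARTITION BY ProductID ORDER BY {order_column}
                   ) AS rn
            FROM products
        )
        SELECT ProductID, Supplier
        FROM Products_CTE
        WHERE rn = 1
        ORDER BY ProductID
    """
    return conn.execute(query).fetchall()


by_date = first_row_per_product("DateAdded")  # earliest entry wins
by_name = first_row_per_product("Supplier")   # alphabetically first name wins
print(by_date)
print(by_name)
```

With this made-up data, ordering by DateAdded keeps "Apple Mac" for product 222 because it was added first, while ordering by Supplier keeps "Apple" because it sorts first alphabetically.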
