SQL: removing duplicate records in SQL Server

Question


I have a SQL Server database into which I preloaded a large number of rows.

Unfortunately, the table has no primary key, and it now contains duplicate rows. I don't mind the missing primary key, but I am worried about the duplicates in the database ...

Any thoughts? (Forgive me for being a SQL Server newbie.)


sql sql-server sql-server-2008 sql-server-2008-express

rockit Nov 20 '09 at 19:00


4 answers

Take a look at this article:

“It's easy to delete data that is duplicated in all columns of the table. What’s harder to do is delete the data that you think is duplicate based on your business rules, while SQL Server considers the data to be unique”

http://www.sql-server-performance.com/articles/dba/delete_duplicates_p1.aspx
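
To make the "business rules" case concrete, here is a minimal sketch, not taken from the article. It assumes a hypothetical table dbo.SomeTable in which two rows count as duplicates when col1 and col2 match, even though a hypothetical LoadedAt column differs, and it assumes you want to keep the most recently loaded row in each group.

```sql
-- Sketch only: dbo.SomeTable, col1, col2 and LoadedAt are hypothetical names.
-- Rows are treated as duplicates when col1 and col2 match, even if other
-- columns (here LoadedAt) make SQL Server see them as distinct rows.
;WITH r AS
(
    SELECT rn = ROW_NUMBER() OVER (PARTITION BY col1, col2
                                   ORDER BY LoadedAt DESC)
    FROM dbo.SomeTable
)
DELETE FROM r
WHERE rn > 1;   -- rn = 1 is the newest row in each group and is kept
```

The ORDER BY inside OVER() decides which row survives, so that is where the business rule goes.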


Henry gao Nov 20 '09 at 19:07


Let's say rows in your table should be unique on COL1 and COL2.
Here's how to find the duplicates:

 SELECT *
 FROM (
     SELECT COL1, COL2,
            ROW_NUMBER() OVER (PARTITION BY COL1, COL2 ORDER BY COL1, COL2 ASC) AS ROWID
     FROM TABLE_NAME
 ) T
 WHERE T.ROWID > 1

The ROWID > 1 filter selects only the duplicate rows, that is, every copy after the first one in each (COL1, COL2) group.
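
Before deleting anything, it may help to see how many copies each combination has. A rough sketch using the same placeholder names (TABLE_NAME, COL1, COL2); COPIES is just an alias chosen for illustration:

```sql
-- Sketch: list each (COL1, COL2) combination that occurs more than once,
-- together with the number of copies found.
SELECT COL1, COL2, COUNT(*) AS COPIES
FROM TABLE_NAME
GROUP BY COL1, COL2
HAVING COUNT(*) > 1      -- keep only combinations that are duplicated
ORDER BY COPIES DESC;    -- worst offenders first
```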


Danielle Paquette-Harvey Nov 20 '09 at 19:08


This article, "Removing duplicate records in SQL Server for a table without a primary key", may help resolve this problem.


logiclabz Nov 23 '09 at 5:27


Aaron Bertrand · Accepted Answer · 2009-11-20T19:04:22+0000

Well, this is one of the reasons why you should have a primary key in the table. What version of SQL Server? For SQL Server 2005 and later:

 ;WITH r AS
 (
     SELECT col1, col2, col3, -- whatever columns make a "unique" row
            rn = ROW_NUMBER() OVER (PARTITION BY col1, col2, col3 ORDER BY col1)
     FROM dbo.SomeTable
 )
 DELETE r WHERE rn > 1;

Then, so that you do not need to do this again tomorrow, and the next day, and the day after that, declare a primary key on the table.
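
Declaring the key could look something like the sketch below. It assumes the duplicates are already gone and that col1, col2 and col3 are declared NOT NULL, which a primary key requires; the constraint names PK_SomeTable and UQ_SomeTable are made up for illustration.

```sql
-- Sketch only: run after the duplicates have been deleted.
-- Assumes col1, col2, col3 are NOT NULL; PK_SomeTable is a hypothetical name.
ALTER TABLE dbo.SomeTable
    ADD CONSTRAINT PK_SomeTable PRIMARY KEY (col1, col2, col3);

-- If any of these columns allow NULLs, a unique constraint is an alternative:
-- ALTER TABLE dbo.SomeTable
--     ADD CONSTRAINT UQ_SomeTable UNIQUE (col1, col2, col3);
```

Either statement fails if duplicates still exist, which doubles as a check that the cleanup worked.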
