The most efficient way to remove all duplicate rows from a table?

I have a table:

    | foo | bar |
    +-----+-----+
    | a   | abc |
    | b   | def |
    | c   | ghi |
    | d   | jkl |
    | a   | mno |
    | e   | pqr |
    | c   | stu |
    | f   | vwx |

I want to remove every row whose foo value appears more than once (all copies, not just the extras), so that the table looks like this:

    | foo | bar |
    +-----+-----+
    | b   | def |
    | d   | jkl |
    | e   | pqr |
    | f   | vwx |

What is the most efficient way to do this?

2 answers

You can LEFT JOIN the table against a subquery that returns only the foo values that occur exactly once. Rows with no match in the subquery are the ones whose foo is duplicated, and those are the rows that get deleted. For example:

    DELETE a
    FROM TableName a
    LEFT JOIN (
        SELECT foo
        FROM TableName
        GROUP BY foo
        HAVING COUNT(*) = 1
    ) b ON a.foo = b.foo
    WHERE b.foo IS NULL;
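To try this on the sample data from the question before touching a real table, something like the following sketch may help. The column types are assumptions (the question does not state them), and the `SELECT` uses the same join and filter as the `DELETE`, so it previews the rows that would be removed.

```sql
-- Hypothetical setup: VARCHAR(10) is an assumed type, not taken from the question.
CREATE TABLE TableName (
    foo VARCHAR(10) NOT NULL,
    bar VARCHAR(10) NOT NULL
);

INSERT INTO TableName (foo, bar) VALUES
    ('a', 'abc'), ('b', 'def'), ('c', 'ghi'), ('d', 'jkl'),
    ('a', 'mno'), ('e', 'pqr'), ('c', 'stu'), ('f', 'vwx');

-- Preview: same LEFT JOIN and WHERE as the DELETE, but only reads the rows.
SELECT a.foo, a.bar
FROM TableName a
LEFT JOIN (
    SELECT foo
    FROM TableName
    GROUP BY foo
    HAVING COUNT(*) = 1
) b ON a.foo = b.foo
WHERE b.foo IS NULL;
```

With the sample data, the preview should list the four rows whose foo is 'a' or 'c', which are exactly the rows missing from the desired result.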

For better performance, add an index to the foo column.

    ALTER TABLE TableName ADD INDEX (foo);
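Whether the index actually helps depends on the data and the server version, so it may be worth checking rather than assuming. A minimal sketch, assuming MySQL 5.6 or later (where `EXPLAIN` accepts a `DELETE` statement):

```sql
-- List the indexes on the table to confirm the one on foo exists.
SHOW INDEX FROM TableName;

-- Show the execution plan for the delete without running it.
EXPLAIN
DELETE a
FROM TableName a
LEFT JOIN (
    SELECT foo
    FROM TableName
    GROUP BY foo
    HAVING COUNT(*) = 1
) b ON a.foo = b.foo
WHERE b.foo IS NULL;
```

If the plan's `key` column names the new index for the table or the subquery, the optimizer is using it.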

Using EXISTS:

    DELETE a
    FROM TableName a
    WHERE EXISTS (
        SELECT NULL
        FROM TableName b
        WHERE b.foo = a.foo
        GROUP BY b.foo
        HAVING COUNT(*) > 1
    );

Using IN:

    DELETE a
    FROM TableName a
    WHERE a.foo IN (
        SELECT b.foo
        FROM TableName b
        GROUP BY b.foo
        HAVING COUNT(*) > 1
    );
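One caveat, assuming the target is MySQL (which the `ADD INDEX` syntax in the first answer suggests): MySQL may reject a `DELETE` whose subquery reads directly from the table being deleted from, reporting error 1093 ("You can't specify target table ... for update in FROM clause"). A common workaround is to wrap the subquery in a derived table so that it is evaluated separately. A sketch of the IN variant written that way, where `dup` is just an illustrative alias:

```sql
DELETE FROM TableName
WHERE foo IN (
    SELECT foo
    FROM (
        -- Inner query: foo values that occur more than once.
        SELECT foo
        FROM TableName
        GROUP BY foo
        HAVING COUNT(*) > 1
    ) AS dup  -- derived table; the alias name is arbitrary
);
```

This form avoids the MySQL-specific multi-table `DELETE a FROM ...` syntax, so it should also carry over to other databases with little or no change.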