As user1045047 mentioned, Amazon Redshift does not support unique constraints, so I was looking for a way to remove duplicate records from a table with a DELETE statement. Finally, I found a reasonable way.
Amazon Redshift supports creating an IDENTITY column, which stores an auto-generated unique number: http://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_TABLE_NEW.html
The following SQL for PostgreSQL removes duplicate records using the OID, which is a unique column; you can use the same SQL on Redshift by replacing the OID with an identity column.
DELETE FROM duplicated_table
WHERE OID > (
    SELECT MIN(OID) FROM duplicated_table d2
    WHERE duplicated_table.column1 = d2.column1
      AND duplicated_table.column2 = d2.column2
);
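As a quick sanity check of this DELETE pattern, here is a minimal sketch using Python's sqlite3, with SQLite's implicit rowid standing in for the OID/identity column (table and column names are just the illustrative ones from the query above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE duplicated_table (column1 TEXT, column2 INTEGER)")
cur.executemany(
    "INSERT INTO duplicated_table VALUES (?, ?)",
    [("John", 18), ("John", 18), ("Bob", 20), ("John", 18)],
)

# Keep only the row with the smallest rowid in each group of duplicates.
cur.execute("""
    DELETE FROM duplicated_table
    WHERE rowid > (
        SELECT MIN(d2.rowid) FROM duplicated_table d2
        WHERE duplicated_table.column1 = d2.column1
          AND duplicated_table.column2 = d2.column2
    )
""")
conn.commit()
print(cur.execute("SELECT * FROM duplicated_table ORDER BY rowid").fetchall())
# [('John', 18), ('Bob', 20)]
```

The correlated subquery finds the minimum id within each duplicate group, so every row except the first occurrence matches the `>` condition and is deleted.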
Here is an example that I tested on my Amazon Redshift cluster.
create table auto_id_table (auto_id int IDENTITY, name varchar, age int);

insert into auto_id_table (name, age) values('John', 18);
insert into auto_id_table (name, age) values('John', 18);
insert into auto_id_table (name, age) values('John', 18);
insert into auto_id_table (name, age) values('John', 18);
insert into auto_id_table (name, age) values('John', 18);
insert into auto_id_table (name, age) values('Bob', 20);
insert into auto_id_table (name, age) values('Bob', 20);
insert into auto_id_table (name, age) values('Matt', 24);

select * from auto_id_table order by auto_id;

auto_id | name | age
It also works with the COPY command in the same way.
The advantage of this method is that you do not need to run any DDL statements. However, it does not work with existing tables that lack an identity column, because an identity column cannot be added to an existing table. In that case, the only way to delete duplicate records is to migrate all records, as in user1045047's answer:
create table temp_table (like original_table);
insert into temp_table (select distinct * from original_table);
drop table original_table;
alter table temp_table rename to original_table;
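The migrate-and-rename approach can be sanity-checked the same way with sqlite3 (SQLite's DDL differs slightly from Redshift's, so the temp table is created explicitly; the names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE original_table (name TEXT, age INTEGER)")
cur.executemany(
    "INSERT INTO original_table VALUES (?, ?)",
    [("John", 18), ("John", 18), ("Bob", 20)],
)

# Copy the distinct rows into a fresh table, then swap it in for the original.
cur.execute("CREATE TABLE temp_table (name TEXT, age INTEGER)")
cur.execute("INSERT INTO temp_table SELECT DISTINCT * FROM original_table")
cur.execute("DROP TABLE original_table")
cur.execute("ALTER TABLE temp_table RENAME TO original_table")
conn.commit()
print(sorted(cur.execute("SELECT * FROM original_table").fetchall()))
# [('Bob', 20), ('John', 18)]
```

Note that, unlike the identity-column DELETE, this rewrites the whole table, so on a large Redshift table it costs a full copy.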