Delete duplicates in Access 2003

Question

I have an Access 2003 table with about 4000 records, built by combining 17 different tables. About half of these records are duplicates. There is no meaningful unique identifier column (ID, name, etc.). There is an ID column that was populated automatically when the tables were combined, so the duplicates are not completely identical (although this column can be removed if that makes things easier).

I used the Access Find Duplicates Query Wizard, which gives me a list of the duplicate records but doesn't let me delete them (seriously, what's the point of this query if I can't delete them?). I tried converting the generated query into a delete query, but that changes the number of rows found. I would edit the SQL by hand, but it is a bit beyond me and runs to 7 lines.

Does anyone know a good way to get rid of duplicates?

+6

duplicates ms-access

Mr_Chimp Oct 22 '09 at 12:22

5 answers

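Use a SELECT with every column except the ID column:

SELECT DISTINCT Column1, Column2, Column3 INTO MyNewTable FROM MyTable

Just substitute your own table and column names. (DISTINCT is used rather than DISTINCTROW, because Access ignores DISTINCTROW in a query on a single table.)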

This solution will give you a new table without duplicates.
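
For anyone who wants to try the idea before touching real data, here is a minimal sketch in Python using the built-in sqlite3 module as a stand-in for Access. The table and column names and the sample rows are made up for illustration, and SQLite has no SELECT ... INTO, so CREATE TABLE ... AS SELECT plays that role here.

```python
import sqlite3

# In-memory SQLite database standing in for the Access file (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ID INTEGER, Column1 TEXT, Column2 TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [(1, "a", "x"), (2, "a", "x"), (3, "b", "y"), (4, "b", "y"), (5, "c", "z")],
)

# Copy the distinct combinations of the non-ID columns into a new table.
# (Access would use SELECT DISTINCT ... INTO MyNewTable FROM MyTable.)
conn.execute(
    "CREATE TABLE MyNewTable AS "
    "SELECT DISTINCT Column1, Column2 FROM MyTable"
)

distinct_rows = conn.execute(
    "SELECT Column1, Column2 FROM MyNewTable ORDER BY Column1"
).fetchall()
print(distinct_rows)
```

The original table is left untouched, so the result can be checked before anything is dropped or renamed.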

+1

Raj more Oct 22 '09 at 12:42

The following keeps the original IDs and does it in one step:

 DELETE FROM table_with_duplicates WHERE table_with_duplicates.id NOT IN (SELECT max(id) FROM table_with_duplicates GROUP BY duplicated_field_1, duplicated_field_2, ... )

Now you have the original table without duplicates and with the IDs preserved. And always remember to back up your data before attempting a big DELETE.
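
A small sketch of how this behaves, again using Python's sqlite3 module as a stand-in for Access. The columns field_1 and field_2 and the sample rows are hypothetical; only the shape of the DELETE follows the statement above.

```python
import sqlite3

# In-memory SQLite database standing in for the Access file (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_with_duplicates "
    "(id INTEGER PRIMARY KEY, field_1 TEXT, field_2 TEXT)"
)
conn.executemany(
    "INSERT INTO table_with_duplicates VALUES (?, ?, ?)",
    [(1, "a", "x"), (2, "a", "x"), (3, "a", "x"),
     (4, "b", "y"), (5, "b", "y"), (6, "c", "z")],
)

# Keep the highest id in every group of identical rows, delete the rest.
conn.execute(
    "DELETE FROM table_with_duplicates "
    "WHERE id NOT IN ("
    "  SELECT MAX(id) FROM table_with_duplicates "
    "  GROUP BY field_1, field_2)"
)

remaining = conn.execute(
    "SELECT id, field_1, field_2 FROM table_with_duplicates ORDER BY id"
).fetchall()
print(remaining)
```

Rows that were never duplicated survive too, because each of them is the MAX(id) of its own one-row group.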

+1

avguchenko Oct 22 '09 at 15:28

 DELETE * FROM table_with_duplicates WHERE table_with_duplicates.ID In (SELECT max(ID) FROM table_with_duplicates GROUP BY [duplicated_field_1] HAVING Count(*)>1 )
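
One thing to watch with a statement of this shape: each run removes only the highest-ID row from every group that still has duplicates, so a group with three or more copies needs several runs. The sketch below (Python with sqlite3 as a stand-in for Access; table contents are invented, and SQLite wants DELETE FROM rather than DELETE * FROM) simply repeats the statement until nothing more is deleted.

```python
import sqlite3

# In-memory SQLite database standing in for the Access file (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_with_duplicates "
    "(ID INTEGER PRIMARY KEY, duplicated_field_1 TEXT)"
)
conn.executemany(
    "INSERT INTO table_with_duplicates VALUES (?, ?)",
    [(1, "a"), (2, "a"), (3, "a"), (4, "b"), (5, "b"), (6, "c")],
)

delete_sql = (
    "DELETE FROM table_with_duplicates "
    "WHERE ID IN ("
    "  SELECT MAX(ID) FROM table_with_duplicates "
    "  GROUP BY duplicated_field_1 HAVING COUNT(*) > 1)"
)

# Repeat until a pass deletes nothing; count the passes that did real work.
passes = 0
while True:
    cur = conn.execute(delete_sql)
    if cur.rowcount == 0:
        break
    passes += 1

remaining_ids = [
    row[0]
    for row in conn.execute("SELECT ID FROM table_with_duplicates ORDER BY ID")
]
print(passes, remaining_ids)
```

Unlike the NOT IN (SELECT MAX(ID) ...) form, this variant ends up keeping the lowest ID of each group.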

+1

Tom Mar 23 '12 at 13:58

I actually found one. It is a very simple solution, though it took me a while to find. As long as all the fields are identical in a true duplicate record, just build one query that includes every field and set each of them to "Group By". The duplicates are merged that way, and you can append the result to a new table and rename it to match the existing table. If you have a primary key field, you can simply leave it out of the query and the data will still be merged (provided you do not need the data in the key field). I don't know why no one mentioned this solution; it took me 5 hours to come up with. :)

0

Coldteck Oct 31 '13 at 23:44

ดาว · Accepted Answer · 2009-10-22T12:54:52+0000

The Find Duplicates query will not let you delete records because it is basically just an aggregate (totals) query: it counts the occurrences of each set of values and returns the cases where the count is greater than 1.

Keep in mind that if you built a delete query on top of the duplicates query, it would delete every row that has duplicated values, which is probably not what you want. You want to delete all but one of each set of duplicates.
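
To make the two points above concrete, here is a rough sketch in Python with sqlite3 standing in for Access. The table, field names and rows are hypothetical, and the aggregate query only approximates what the wizard generates.

```python
import sqlite3

# In-memory SQLite database standing in for the Access file (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ID INTEGER, Field1 TEXT, Field2 TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [(1, "a", "x"), (2, "a", "x"), (3, "b", "y"), (4, "c", "z"), (5, "c", "z")],
)

# Aggregate query: one output row per duplicated combination, with a count.
# The output rows are groups, not individual records, so there is nothing
# here that could be deleted directly.
dupes = conn.execute(
    "SELECT Field1, Field2, COUNT(*) AS NumberOfDups "
    "FROM MyTable GROUP BY Field1, Field2 "
    "HAVING COUNT(*) > 1 ORDER BY Field1"
).fetchall()

# Rows a naive "delete everything that has a duplicate" would remove:
# every copy, including the one you wanted to keep.
rows_hit = conn.execute(
    "SELECT COUNT(*) FROM MyTable AS t WHERE EXISTS ("
    "  SELECT 1 FROM MyTable AS d "
    "  WHERE d.Field1 = t.Field1 AND d.Field2 = t.Field2 AND d.ID <> t.ID)"
).fetchone()[0]

print(dupes, rows_hit)
```

With this sample data, four of the five rows would be hit, leaving only the row that never had a duplicate.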

You should delete all but one of each set of duplicate records, leaving the ID column out of the comparison. I suggest the simplest way to do this is a make-table query that selects the distinct values of every field except the ID field (SELECT DISTINCT Field1, Field2 ... FROM MyTable), using the results to create a new table of about 2000 records (if half of them are duplicates).

Then create an ID column in the new table and use an update query to set that ID to the first matching ID in the original table (you can do this with DLookup, which returns the first EXPRESSION value where CRITERIA is true in DOMAIN).

The DLookup() function returns a single value from a single field, even if more than one record satisfies the criteria. If no record satisfies the criteria, or if the domain contains no records, DLookup() returns Null.

Because you identify the first matching ID from all the other fields, whose combinations are unique in the new table, the IDs that are never matched belong to the duplicates. In effect you re-establish the PK relationship by picking the first matching key for each set of unique fields. After that, you should set the ID column as the PK. Of course, this assumes that the ID has no inherent meaning and that you do not need to keep one particular ID for a duplicated row in preference to the IDs of its other copies. It also assumes that you care about keeping the ID data for the remaining rows; otherwise just skip the DLookup step and use SELECT DISTINCT on all columns except the ID.
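
The whole procedure can be sketched roughly as follows, in Python with sqlite3 standing in for Access. This is only an illustration under several assumptions: the table and field names and the rows are invented, CREATE TABLE ... AS SELECT replaces the make-table query, and because SQLite has no DLookup, a correlated subquery with MIN(ID) is swapped in as one deterministic reading of "first matching ID".

```python
import sqlite3

# In-memory SQLite database standing in for the Access file (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ID INTEGER, Field1 TEXT, Field2 TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [(1, "a", "x"), (2, "a", "x"), (3, "b", "y"), (4, "c", "z"), (5, "c", "z")],
)

# Step 1: make-table query with the distinct non-ID fields.
conn.execute(
    "CREATE TABLE MyNewTable AS SELECT DISTINCT Field1, Field2 FROM MyTable"
)

# Step 2: add an empty ID column to the new table.
conn.execute("ALTER TABLE MyNewTable ADD COLUMN ID INTEGER")

# Step 3: fill it with a matching ID from the original table.
# MIN(ID) stands in for the DLookup call (assumption, see lead-in).
conn.execute(
    "UPDATE MyNewTable SET ID = ("
    "  SELECT MIN(ID) FROM MyTable "
    "  WHERE MyTable.Field1 = MyNewTable.Field1 "
    "    AND MyTable.Field2 = MyNewTable.Field2)"
)

deduped = conn.execute(
    "SELECT ID, Field1, Field2 FROM MyNewTable ORDER BY ID"
).fetchall()
print(deduped)
```

The IDs that never appear in the new table (2 and 5 in this sample) are the ones that belonged to the discarded copies.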