How can I combine two redundant records in a MySQL table while maintaining all PK / FK relationships?

Let's say I have a customers table with the following fields and records:

    id  first_name  last_name  email                 phone
    ------------------------------------------------------------------------
    1   Michael     Turley     mturley@whatever.com  555-123-4567
    2   John        Dohe       jdoe@whatever.com
    3   Jack        Smith      jsmith@whatever.com   555-555-5555
    4   Johnathan   Doe                              123-456-7890

There are several other tables, such as orders, rewards, and receipts, which have customer_id foreign keys referencing customers.id.

As you can see, in their infinite wisdom, my users created duplicate entries for John Doe, complete with inconsistent spelling and missing data. The administrator notices this, selects customers 2 and 4, and clicks "Merge". They are then asked to choose which value to keep for each field, and my PHP determines that the combined record should look like this:

    id  first_name  last_name  email                 phone
    ------------------------------------------------------------------------
    ?   John        Doe        jdoe@whatever.com     123-456-7890

Suppose that Mr. Doe placed several orders, earned rewards, and generated receipts, but some of them were associated with id 2 and some with id 4. The merged row needs to match all of the foreign keys in other tables that pointed to the original rows.

Here's where I'm not sure what to do. My instinct is to do this:

    DELETE FROM customers WHERE id = 4;
    UPDATE customers SET first_name = 'John', last_name = 'Doe', email = 'jdoe@whatever.com', phone = '123-456-7890' WHERE id = 2;
    UPDATE orders, rewards, receipts SET customer_id = 2 WHERE customer_id = 4;

I think this will work, but if I later add another table with a customer_id foreign key, I have to remember to go back and add that table to the second UPDATE query in my merge function, or risk losing integrity.

There must be a better way to do this.

+7
4 answers

As an update to my comment:

    use information_schema;
    select table_name from columns where column_name = 'customer_id';

Then loop over the resulting tables and update each one accordingly.

Personally, I would stick with your instinctive solution, as this automated approach can be dangerous if some tables containing a customer_id column need to be exempt from the merge.
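The loop described above could be sketched directly in SQL by letting MySQL generate one UPDATE statement per matching table. This is only a sketch: the ids 2 and 4 come from the question's example, `DATABASE()` assumes the application schema is the current one, and `customer_audit` is a hypothetical table name standing in for whatever should be exempt.

```sql
-- Build one UPDATE statement per table that has a customer_id column.
-- The application would then execute each generated statement.
SELECT CONCAT(
         'UPDATE `', table_name, '` ',
         'SET customer_id = 2 WHERE customer_id = 4;'
       ) AS stmt
FROM information_schema.columns
WHERE table_schema = DATABASE()              -- assumption: current schema only
  AND column_name = 'customer_id'
  AND table_name NOT IN ('customer_audit');  -- hypothetical exempt table
```

Reviewing the generated statements before running them is one way to catch a table that should not have been touched.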

0

I got here from Google; here are my 2 cents:

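    -- Replace 'DATABASE' with your schema name.
    -- The referenced column is the primary key of customers, i.e. id.
    SELECT `TABLE_NAME`
    FROM `information_schema`.`KEY_COLUMN_USAGE`
    WHERE REFERENCED_TABLE_SCHEMA = 'DATABASE'
      AND REFERENCED_TABLE_NAME = 'customers'
      AND REFERENCED_COLUMN_NAME = 'id'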

I added the database name as a safeguard (you never know when someone will copy the database).

Instead of looking for the column name, we look at the foreign keys themselves.

If you set the ON DELETE action of the constraint to RESTRICT, then a customer cannot be deleted before its child rows are deleted or transferred.
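As a rough illustration of both points, the sketch below first declares a restricting foreign key and then uses the foreign key metadata to generate the UPDATE statements. It assumes the child columns reference customers.id, that the tables use InnoDB (which enforces foreign keys), and that the schema in question is the current one; the constraint name `fk_orders_customer` is made up for the example.

```sql
-- With ON DELETE RESTRICT, deleting a customer that still has orders fails.
ALTER TABLE orders
  ADD CONSTRAINT fk_orders_customer          -- hypothetical constraint name
  FOREIGN KEY (customer_id) REFERENCES customers (id)
  ON DELETE RESTRICT;

-- Generate one UPDATE per foreign key that points at customers.id,
-- using the real referencing column name rather than guessing it.
SELECT CONCAT(
         'UPDATE `', TABLE_NAME, '` SET `', COLUMN_NAME, '` = 2 ',
         'WHERE `', COLUMN_NAME, '` = 4;'
       ) AS stmt
FROM information_schema.KEY_COLUMN_USAGE
WHERE REFERENCED_TABLE_SCHEMA = DATABASE()   -- assumption: current schema
  AND REFERENCED_TABLE_NAME = 'customers'
  AND REFERENCED_COLUMN_NAME = 'id';
```

Because the query reads declared constraints, a new child table is picked up automatically as long as its foreign key is actually declared.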

+5

Short answer: there is no better way (which I can think of).

This is a trade-off. If you find that there are many such cases, it may be worthwhile to spend some time building a more robust check against existing customers before a new one is added (for example, looking for variations on the first / last name, presenting them to the person who is adding the customer, and asking them 2 or 3 times whether they REALLY are sure they want to add this new customer). If there are not many such cases, it may not be worth investing the time.
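A pre-insert check along those lines might look like the following. It is only a sketch: the matching rules (exact email, exact phone, or a similar-sounding last name via MySQL's `SOUNDEX`) are assumptions chosen for illustration, and the literal values stand in for whatever the user typed into the form.

```sql
-- List existing customers that might be the same person as the one
-- about to be added, so the user can confirm before a new row is created.
SELECT id, first_name, last_name, email, phone
FROM customers
WHERE email = 'jdoe@whatever.com'
   OR phone = '123-456-7890'
   OR SOUNDEX(last_name) = SOUNDEX('Doe');   -- loose phonetic match
```

A phonetic match will produce false positives, so the results are best shown as suggestions rather than used to block the insert.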

Beyond that, your approach is the only way I can think of. I would actually delete both records and create a new one with the combined data, resulting in a new customer id rather than reusing an old one, but this is just a personal preference; functionally it is the same as your approach. You would still need to remember to go back and change the merge function whenever a new table references the customers.id field.
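The "new record" variant could be sketched as below, using the values from the question. It assumes customers.id is AUTO_INCREMENT, that the tables use InnoDB so the transaction is honored, and that there is no UNIQUE constraint on email (otherwise the INSERT would collide with row 2).

```sql
START TRANSACTION;

-- Create the merged customer under a fresh id.
INSERT INTO customers (first_name, last_name, email, phone)
VALUES ('John', 'Doe', 'jdoe@whatever.com', '123-456-7890');

SET @merged_id = LAST_INSERT_ID();

-- Re-point every child row from either duplicate to the new customer.
UPDATE orders   SET customer_id = @merged_id WHERE customer_id IN (2, 4);
UPDATE rewards  SET customer_id = @merged_id WHERE customer_id IN (2, 4);
UPDATE receipts SET customer_id = @merged_id WHERE customer_id IN (2, 4);

-- Only now remove the two original rows.
DELETE FROM customers WHERE id IN (2, 4);

COMMIT;
```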

+2

At a minimum, to prevent any triggers on deletion from causing a cascading effect, I would FIRST do

    update SomeTable set CustomerID = CorrectValue where CustomerID = WrongValue

(do this in all tables) ...

THEN

    delete from Customers where CustomerID = WrongValue
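Applied to the tables from the question, that ordering could look like this. It is a sketch that assumes InnoDB tables (so the transaction can roll back if any step fails) and keeps id 2 as the surviving customer.

```sql
START TRANSACTION;

-- 1. Move every child row from the duplicate (4) to the survivor (2).
UPDATE orders   SET customer_id = 2 WHERE customer_id = 4;
UPDATE rewards  SET customer_id = 2 WHERE customer_id = 4;
UPDATE receipts SET customer_id = 2 WHERE customer_id = 4;

-- 2. Write the merged field values onto the survivor.
UPDATE customers
SET first_name = 'John',
    last_name  = 'Doe',
    email      = 'jdoe@whatever.com',
    phone      = '123-456-7890'
WHERE id = 2;

-- 3. Delete the duplicate last, once nothing references it.
DELETE FROM customers WHERE id = 4;

COMMIT;
```

With the delete last, an ON DELETE CASCADE rule or a delete trigger has no child rows left to act on.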

As for the duplicate data: try figuring out who "Will Smith, Bill Smith, William Smith" is when you lack certain information. Some may be completely legitimate, different people.

+1