One note: ON DELETE CASCADE performs poorly with bulk operations, because PostgreSQL implements the cascade as a per-row trigger. Algorithmically, it looks like this:
for row in delete_set:
    for dependent_row in (scan for referencing rows):
        delete dependent_row
If you delete 800,000 rows from the parent table, this translates into 800,000 individual delete checks on the dependent tables. Even in the best case, 800,000 individual index scans will be much slower than a single sequential scan.
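To make the difference concrete, here is a hypothetical sketch (not PostgreSQL internals) that counts the lookups performed by a per-row cascade versus a single set-based pass over the child table; the table sizes and key layout are made up for illustration:

```python
NUM_PARENTS = 1_000                                 # stand-in for the 800,000-row example
parents_to_delete = set(range(0, NUM_PARENTS, 2))   # delete every other parent

def make_children():
    # two child rows per parent, keyed by (parent_id, n) - parent_id is the foreign key
    return {(p, j): f"child-{p}-{j}" for p in range(NUM_PARENTS) for j in range(2)}

# Per-row cascade: each deleted parent triggers its own probe of the child table.
children = make_children()
probes = 0
for parent in parents_to_delete:
    probes += 1                                     # one index lookup per parent row
    for j in range(2):
        children.pop((parent, j), None)
cascade_probes = probes                             # grows linearly with the delete set

# Set-based delete: one sequential scan of the child table, filtered by the set.
children = make_children()
set_based_scans = 1                                 # a single pass, regardless of delete-set size
children = {k: v for k, v in children.items() if k[0] not in parents_to_delete}
```

The per-row version does one probe per deleted parent, while the set-based version does constant work per child row in a single scan, which is the whole point of rewriting the cascade by hand.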
The better approach is to use a writable common table expression (PostgreSQL 9.1 or later), or simply issue separate DELETE statements in the same transaction. Something like:
WITH rows_to_delete (id) AS (
    SELECT id
    FROM mytable
    WHERE where_condition
), deleted_rows (id) AS (
    DELETE FROM referencing_table
    WHERE mytable_id IN (SELECT id FROM rows_to_delete)
    RETURNING mytable_id
)
DELETE FROM mytable
WHERE id IN (SELECT id FROM deleted_rows);
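The second option mentioned above, separate DELETE statements in the same transaction, can be sketched with Python's built-in sqlite3 module (the schema, the `flag` column, and the row counts are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (id INTEGER PRIMARY KEY, flag INTEGER);
    CREATE TABLE referencing_table (
        id INTEGER PRIMARY KEY,
        mytable_id INTEGER REFERENCES mytable(id)
    );
    CREATE INDEX ref_fk_idx ON referencing_table (mytable_id);
""")
conn.executemany("INSERT INTO mytable VALUES (?, ?)",
                 [(i, i % 2) for i in range(10)])
conn.executemany("INSERT INTO referencing_table VALUES (?, ?)",
                 [(i, i % 10) for i in range(20)])

# Two set-based deletes in one transaction: remove the child rows first,
# then the parents, reusing the same WHERE condition (flag = 1 here).
with conn:  # the context manager commits both statements together
    conn.execute(
        "DELETE FROM referencing_table "
        "WHERE mytable_id IN (SELECT id FROM mytable WHERE flag = 1)")
    conn.execute("DELETE FROM mytable WHERE flag = 1")

remaining_parents = conn.execute("SELECT COUNT(*) FROM mytable").fetchone()[0]
remaining_children = conn.execute(
    "SELECT COUNT(*) FROM referencing_table").fetchone()[0]
```

Each DELETE is a single set-based statement, so the database can plan one scan per table instead of one lookup per deleted parent row.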
This comes down to an algorithm more like:

scan for rows to delete as delete_set
for dependent in (single scan for rows referencing delete_set):
    delete dependent
for to_delete in (scan for rows in delete_set):
    delete to_delete
Eliminating the forced nested-loop scan speeds things up considerably.