I have 7 linked tables. One of them has a timestamp column, and I want to delete all rows older than 30 days. However, these are VERY big deletions: I'm talking about tens of millions of records. When I delete these records from the main table, I also have to go through the other 6 tables and delete the related records from them.
What is the best way to optimize this?
I am thinking about using PARTITION, but only the main table (event) has a timestamp column. I'm worried that if I drop an old partition in the main table, the related records will still exist in the other six tables. Related records are linked by the fields (sid, cid).
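To make the partitioning idea concrete, this is roughly what I have in mind. It is only a sketch that I have not run: it assumes event.timestamp is a DATETIME column and that the primary key on event is (sid, cid), which would have to be widened because MySQL requires the partitioning column to be part of every unique key. The partition names and dates are made up for illustration.

```sql
-- Sketch only, untested. Assumes event.timestamp is DATETIME and the
-- current primary key is (sid, cid); partition names are hypothetical.

-- The partitioning column must be part of every unique key on the table.
ALTER TABLE event
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (sid, cid, timestamp);

-- One partition per day, plus a catch-all for newer rows.
ALTER TABLE event
  PARTITION BY RANGE (TO_DAYS(timestamp)) (
    PARTITION p20160105 VALUES LESS THAN (TO_DAYS('2016-01-06')),
    PARTITION p20160106 VALUES LESS THAN (TO_DAYS('2016-01-07')),
    PARTITION pmax      VALUES LESS THAN MAXVALUE
  );

-- Removing a whole day from event would then be a single statement:
ALTER TABLE event DROP PARTITION p20160105;

-- The problem: rows with the same (sid, cid) in data, iphdr, icmphdr,
-- tcphdr, udphdr and opt are not touched by the DROP PARTITION above.
```

This would make the delete on event cheap, but it does nothing for the six child tables, which is exactly the part I am unsure about.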
For context, I use snort and barnyard, which are IDS tools.
I am using MySQL 5.1.73 with MyISAM tables.
Here is a snippet from the cleanup logs:
StartTime,EndTime,TimeElapsed,AffectedRows
Wed Jan 6 01:00:01 EST 2016,Wed Jan 6 01:45:11 EST 2016,45:10,2911807
Thu Jan 7 01:00:02 EST 2016,Thu Jan 7 01:25:29 EST 2016,25:27,2230255
Fri Jan 8 01:00:01 EST 2016,Fri Jan 8 01:24:18 EST 2016,24:17,1400470
Sat Jan 9 01:00:02 EST 2016,Sat Jan 9 05:47:10 EST 2016,287:8,23360088
Sun Jan 10 01:00:01 EST 2016,Sun Jan 10 10:06:16 EST 2016,546:15,44970072
Mon Jan 11 01:00:01 EST 2016,Mon Jan 11 09:40:39 EST 2016,520:38,43948091
This was my old cleanup script:
/usr/bin/mysql --defaults-extra-file=/old/.my.cnf snort_db >> /root/snortcleaner.log 2>&1 <<EOF
use snort_db;
DROP TRIGGER IF EXISTS delete_old;
DELIMITER
This is my current cleanup script:
DELETE FROM event WHERE timestamp BETWEEN DATE_SUB('${OLDEST_TIMESTAMP}', INTERVAL 1 HOUR) AND DATE_SUB(NOW(), INTERVAL 31 DAY);
DELETE FROM data USING data LEFT OUTER JOIN event USING (sid,cid) WHERE event.sid IS NULL;
DELETE FROM iphdr USING iphdr LEFT OUTER JOIN event USING (sid,cid) WHERE event.sid IS NULL;
DELETE FROM icmphdr USING icmphdr LEFT OUTER JOIN event USING (sid,cid) WHERE event.sid IS NULL;
DELETE FROM tcphdr USING tcphdr LEFT OUTER JOIN event USING (sid,cid) WHERE event.sid IS NULL;
DELETE FROM udphdr USING udphdr LEFT OUTER JOIN event USING (sid,cid) WHERE event.sid IS NULL;
DELETE FROM opt USING opt LEFT OUTER JOIN event USING (sid,cid) WHERE event.sid IS NULL;
I switch between the two because I don't know which is faster, but in practice both are too slow.