MySQL: delete multiple random rows from a table

I have a table with 604,000 rows. I would like to delete 4,000 random rows so that the table contains only 600,000 records.

Would there be a quick way to do this?

Many thanks.

+7
4 answers

In theory, this is random and fast. In practice, it is only fast, since without ORDER BY the rows removed are simply the first ones the server reads:

DELETE FROM tableX LIMIT 4000 

This is random, but terribly slow with 600K rows:

 DELETE FROM tableX ORDER BY RAND() LIMIT 4000 

The following is not truly random (since there are usually gaps in the ids), and it may not even delete exactly 4,000 rows (it deletes slightly fewer when there are many gaps), but it is probably faster than the previous query.

The extra wrapping in a subquery is needed because the multi-table DELETE syntax does not allow LIMIT:

 DELETE td
 FROM tableX AS td
   JOIN
     ( SELECT t.id
       FROM tableX AS t
         CROSS JOIN
           ( SELECT MAX(id) AS maxid
             FROM tableX
           ) AS m
         JOIN
           ( SELECT RAND() AS rndm
             FROM tableX AS tr
             LIMIT 5000
           ) AS r
           ON t.id = CEIL( rndm * maxid )
       LIMIT 4000
     ) AS x
     ON x.id = td.id

EXPLAIN output (for the subquery, on a table with 400K rows):

 id  select_type  table       type    possible_keys  key      key_len  ref   rows    Extra
 1   PRIMARY      <derived2>  system                                         1
 1   PRIMARY      <derived3>  ALL                                            5000
 1   PRIMARY      t           eq_ref  PRIMARY        PRIMARY  4        func  1       Using where; Using index
 3   DERIVED      tr          index                  PRIMARY  4              398681  Using index
 2   DERIVED                                                                         Select tables optimized away
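
To see why the id-sampling query can delete fewer than 4,000 rows, here is a small Python model of its join logic. It is only a sketch, not what MySQL actually executes: the function name sample_ids_to_delete and the two example id sets are made up for illustration, and the real join is not guaranteed to keep candidates in draw order.

```python
import math
import random


def sample_ids_to_delete(existing_ids, draws=5000, limit=4000, rng=random):
    """Model the join: draw candidate ids, keep those that exist, cap at limit."""
    max_id = max(existing_ids)
    hits = []
    for _ in range(draws):
        # Mirrors CEIL(rndm * maxid) with rndm drawn from [0, 1)
        candidate = math.ceil(rng.random() * max_id)
        if candidate in existing_ids:  # a candidate that falls in a gap matches nothing
            hits.append(candidate)
    # Duplicate candidates collapse, because a row can only be deleted once
    return set(hits[:limit])


dense = set(range(1, 604001))      # hypothetical table: ids 1..604000, no gaps
sparse = set(range(1, 604001, 2))  # hypothetical table: every second id missing
dense_picked = sample_ids_to_delete(dense, rng=random.Random(1))
sparse_picked = sample_ids_to_delete(sparse, rng=random.Random(1))
print(len(dense_picked), len(sparse_picked))
```

In this model the gap-free table loses close to 4,000 rows (a few fewer when the same id is drawn twice), while the table with half of its ids missing only gets about half of the 5,000 candidates to match, so it ends up well short of 4,000.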
+13
 delete from yourTable limit 4000 
+1
 DELETE FROM yourTable ORDER BY RAND() LIMIT 4000; 

It will take some time though ...

A way that is faster to execute (though not to code!) could be 4,000 separate deletes in a loop:

 DELETE FROM yourTable WHERE AssumedPKisInt = <ARandomNumber> 

Of course, you need to make sure that you are not trying to delete non-existent or already deleted rows.
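
A minimal sketch of that loop in Python, assuming an integer primary key named id with positive values, and using an in-memory SQLite database purely as a stand-in for MySQL so that the example is self-contained (the function name delete_random_rows is hypothetical). Checking the affected-row count takes care of ids that fall in a gap or were already deleted:

```python
import random
import sqlite3


def delete_random_rows(conn, n, rng=random):
    """Delete n rows from tableX by probing random integer ids one at a time."""
    cur = conn.cursor()
    total = cur.execute("SELECT COUNT(*) FROM tableX").fetchone()[0]
    if n > total:
        raise ValueError("cannot delete more rows than the table holds")
    max_id = cur.execute("SELECT MAX(id) FROM tableX").fetchone()[0]
    deleted = 0
    while deleted < n:
        candidate = rng.randint(1, max_id)  # may hit a gap or an already deleted id
        cur.execute("DELETE FROM tableX WHERE id = ?", (candidate,))
        deleted += cur.rowcount  # 1 if the row existed, 0 otherwise
    conn.commit()
    return deleted


# Small demo: 604 rows with ids 1..604, delete 4 of them.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableX (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO tableX (id) VALUES (?)", [(i,) for i in range(1, 605)])
removed = delete_random_rows(conn, 4)
remaining = conn.execute("SELECT COUNT(*) FROM tableX").fetchone()[0]
print(removed, remaining)
```

Against a real MySQL server the connection setup and the parameter placeholder depend on the driver you use. Note that the loop needs more probes the sparser the ids are, since every miss costs one round trip.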

0

If I had to hazard a guess:

 DELETE FROM table where id = (SELECT id FROM table ORDER BY rand() LIMIT 1) LIMIT 10 
0
