Updating rows with a subquery and a join is extremely slow

I need to reset the flag in table 'A' from 'X' to 'Y' for rows that satisfy three conditions: 1. update_date > 1 month, 2. flag = 'X', and 3. type = 1.

The update_date condition is checked against another table, 'B'. I hope the following query explains exactly what I need. The query returns the correct result for me; the problem is that it takes too much time. My A and B tables are very large, containing almost billions of rows, and there are about 10 of them.

When I run only the inner query to select a.id, I get the result immediately.

 SELECT a.id
 FROM A a
 JOIN B b ON (a.id = b.id AND a.name = b.name AND a.type = 1 AND a.flag = 'X'
              AND a.update_date > DATE_SUB(NOW(), INTERVAL 1 MONTH))

But the UPDATE query on its own, even with a LIMIT, takes a very long time.

 UPDATE A SET flag = 'Y'
 WHERE id IN (
   SELECT id FROM (
     SELECT a.id
     FROM A a
     JOIN B b ON (a.id = b.id AND a.name = b.name AND a.type = 1 AND a.flag = 'X'
                  AND a.update_date > DATE_SUB(NOW(), INTERVAL 1 MONTH))
   ) tmp_table
 )
 LIMIT 100

I am looking for an alternative way to write this query that makes it fast. I could write a stored procedure for it, but in an SP I would need to loop over each target id, right?

I do not want to write two separate queries in PHP, because many instances of my PHP script run from cron and would fetch the same results (because of the time lag).

Note also that the columns are sufficiently indexed.

I want to update in limited batches, i.e. update 1000+ rows on each run.

+4
3 answers

Finally, I arrived at a more efficient query: just join against a temporary (derived) table.

 UPDATE A AS a
 JOIN (
   SELECT a.id
   FROM A AS a
   JOIN B AS b ON b.type = a.type AND b.name = a.name
               AND b.last_update_date < DATE_SUB(NOW(), INTERVAL 1 MONTH)
               AND a.type = 1 AND a.flag = 'X'
   ORDER BY a.id DESC
   LIMIT 1000
 ) AS source ON source.id = a.id
 SET flag = 'Y';

Thanks to http://www.xaprb.com/blog/2006/08/10/how-to-use-order-by-and-limit-on-multi-table-updates-in-mysql
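
The question also asks whether a stored procedure would help. As a rough sketch only (it is not part of the original answer), the batched update above could be repeated until a pass changes no rows, using ROW_COUNT() to detect that. The procedure name reset_flag_in_batches, the variable affected, and the inner alias a2 are hypothetical; table and column names are taken from the query above. DELIMITER is a directive of the mysql command-line client, not SQL itself.

```sql
DELIMITER //

-- Hypothetical procedure: repeats the 1000-row batch until nothing is left.
CREATE PROCEDURE reset_flag_in_batches()
BEGIN
  DECLARE affected INT DEFAULT 1;

  WHILE affected > 0 DO
    -- Same derived-table join as above; the LIMIT lives inside the
    -- subquery because a multi-table UPDATE cannot take LIMIT itself.
    UPDATE A AS a
    JOIN (
      SELECT a2.id
      FROM A AS a2
      JOIN B AS b ON b.type = a2.type AND b.name = a2.name
                  AND b.last_update_date < DATE_SUB(NOW(), INTERVAL 1 MONTH)
      WHERE a2.type = 1 AND a2.flag = 'X'
      ORDER BY a2.id DESC
      LIMIT 1000
    ) AS source ON source.id = a.id
    SET a.flag = 'Y';

    -- ROW_COUNT() reports how many rows the preceding UPDATE changed.
    SET affected = ROW_COUNT();
  END WHILE;
END //

DELIMITER ;

CALL reset_flag_in_batches();
```

The loop ends because updated rows no longer have flag = 'X', so each pass selects a fresh batch. Assuming autocommit is enabled, each UPDATE commits on its own, so locks are held for one batch at a time.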

0

Replace IN with EXISTS or a JOIN.

EXISTS can be faster because as soon as the engine finds a match it stops looking, since the condition has already proved true. With IN, it collects all the results from the subquery before further processing.

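 -- MySQL does not allow ORDER BY or LIMIT in a multi-table UPDATE,
 -- so this form updates every matching row in one statement.
 UPDATE A a
 JOIN B b ON (a.id = b.id AND a.name = b.name AND a.type = 1 AND a.flag = 'X'
              AND a.update_date > DATE_SUB(NOW(), INTERVAL 1 MONTH))
 SET a.flag = 'Y';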

EDITED: A substitute for LIMIT (it will update only about 100 records)

 SET @rn = 0;

 -- The row counter @rn numbers the matched rows; only the first ones are updated.
 UPDATE A a
 JOIN (
   SELECT @rn := @rn + 1 AS rId, a.id, a.name
   FROM B b
   JOIN A a ON (@rn < 100 AND a.id = b.id AND a.name = b.name AND a.type = 1 AND a.flag = 'X'
                AND a.update_date > DATE_SUB(NOW(), INTERVAL 1 MONTH))
 ) b ON (a.id = b.id)
 SET a.flag = 'Y'
 WHERE b.rId < 100;

Using an EXISTS clause

 UPDATE A a
 SET a.flag = 'Y'
 WHERE EXISTS (
   SELECT 1
   FROM B b
   WHERE a.id = b.id AND a.name = b.name AND a.type = 1 AND a.flag = 'X'
     AND a.update_date > DATE_SUB(NOW(), INTERVAL 1 MONTH)
 )
 ORDER BY a.id
 LIMIT 1000;
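
Whether EXISTS actually beats IN depends on the server version and the optimizer, so it may be worth comparing the two plans before choosing. A minimal sketch, assuming MySQL 5.6 or later (where EXPLAIN accepts UPDATE statements) and the table and column names from the question; tmp_table is just an alias for the derived table. In the second statement the conditions that only touch A are moved out of the subquery, which should not change which rows match.

```sql
-- Plan for the IN form (the derived-table wrapper lets the subquery
-- read from A while A is being updated).
EXPLAIN
UPDATE A
SET flag = 'Y'
WHERE id IN (
  SELECT id FROM (
    SELECT a.id
    FROM A a
    JOIN B b ON (a.id = b.id AND a.name = b.name)
    WHERE a.type = 1 AND a.flag = 'X'
      AND a.update_date > DATE_SUB(NOW(), INTERVAL 1 MONTH)
  ) tmp_table
)
LIMIT 1000;

-- Plan for the EXISTS form.
EXPLAIN
UPDATE A a
SET a.flag = 'Y'
WHERE a.type = 1 AND a.flag = 'X'
  AND a.update_date > DATE_SUB(NOW(), INTERVAL 1 MONTH)
  AND EXISTS (
    SELECT 1 FROM B b WHERE b.id = a.id AND b.name = a.name
  )
ORDER BY a.id
LIMIT 1000;
```

EXPLAIN only shows the plan and does not modify any rows, so both statements are safe to run against the live tables.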

Hope this helps

+3

You can also use a join:

 UPDATE A
 LEFT JOIN (
   SELECT a.id
   FROM A AS a
   JOIN B AS b ON a.id = b.id
   WHERE a.name = b.name AND a.type = 1 AND a.flag = 'X'
     AND a.update_date > DATE_SUB(NOW(), INTERVAL 1 MONTH)
 ) AS l ON l.id = A.id
 SET flag = 'Y'
 -- Qualify id here: both A and l have an id column.
 WHERE A.id = l.id
0