Will an SQL update affect its subquery during the update run?

I am writing a complex UPDATE query that looks something like this:

update table join (select y, min(x) as MinX from table group by y) as t1 using (y) set x = x - MinX 

This means that the column x is updated based on a subquery that also reads the column x. But could the x values read by the subquery already have been changed by the running UPDATE? Isn't that a problem? In normal programming you usually have to handle this explicitly: store the new values somewhere separate from the old ones and, once the work is done, replace the old values with the new. How does an SQL database do it?

I am not interested in a single observation or experiment. I would like a snippet from the documentation or the SQL standard that states the specific behavior in this case. I am using MySQL, but answers that hold for other databases such as PostgreSQL or Oracle, and especially for the SQL standard in general, are also appreciated. Thanks!

+8
sql database oracle mysql postgresql
4 answers

** Edited **

Select from target table

From 13.2.9.8. Subqueries in the FROM Clause:

Subqueries in the FROM clause can return a scalar, column, row, or table. Subqueries in the FROM clause cannot be correlated subqueries, unless used within the ON clause of a JOIN operation.

So yes, you can run the above query.
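
As a small illustration, the query can be tried on a scratch table. This is only a sketch: the table name mytable and the sample rows are made up here (table itself is a reserved word, so the question's placeholder name cannot be used as written).

```sql
-- Hypothetical scratch table; the name and the data are assumptions for illustration.
CREATE TABLE mytable (y INT NOT NULL, x INT NOT NULL) ENGINE=InnoDB;
INSERT INTO mytable (y, x) VALUES (1, 5), (1, 8), (2, 3), (2, 10);

-- Same shape as the query in the question: subtract each group's minimum x.
UPDATE mytable
JOIN (SELECT y, MIN(x) AS MinX FROM mytable GROUP BY y) AS t1 USING (y)
SET x = x - MinX;

-- Inspect the result.
SELECT y, x FROM mytable ORDER BY y, x;
```

If the minimums are taken from the values as they were before the update, the rows should come out as (1, 0), (1, 3), (2, 0), (2, 7).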

Problem

There are two problems here. One is concurrency: ensuring that no one else changes the data out from under us. This is handled with locking. The other is dealing with the actual modification of new versus old values, which is handled with derived tables.

Locking

In the case of your query above, with InnoDB, MySQL first executes the SELECT and obtains a read lock (shared lock) on each row in the table individually. If the SELECT statement had a WHERE clause, only the selected records would be read locked, and for range conditions any gaps would be locked as well.

A read lock prevents any other query from obtaining write locks, so the records cannot be updated from elsewhere while they are read locked.

Then MySQL obtains a write lock (exclusive lock) on each record in the table individually. If your UPDATE statement had a WHERE clause, only the matching records would be write locked, and again, if the WHERE clause selected a range, the range would be locked.

Any record that held a read lock from the preceding SELECT is automatically escalated to a write lock.

A write lock prevents other requests from receiving a read or write lock.

You can observe this with Innotop: run it in lock mode, start a transaction, execute the query (but do not commit it), and you will see the locks in Innotop. Alternatively, you can view the details without Innotop using SHOW ENGINE INNODB STATUS.
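
A rough sketch of the no-Innotop route, assuming a hypothetical InnoDB table mytable(y, x) standing in for the table in the question:

```sql
-- Assumes a hypothetical InnoDB table mytable(y INT, x INT) already exists.
START TRANSACTION;

UPDATE mytable
JOIN (SELECT y, MIN(x) AS MinX FROM mytable GROUP BY y) AS t1 USING (y)
SET x = x - MinX;

-- While the transaction is still open, look at the TRANSACTIONS section
-- of the report for the locks it currently holds.
SHOW ENGINE INNODB STATUS;

-- Undo the change once the locks have been inspected.
ROLLBACK;
```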

Deadlocks

Your query is vulnerable to a deadlock if two instances run at the same time. If query A obtains read locks and then query B obtains read locks, query A has to wait for query B's read locks to be released before it can obtain write locks. However, query B is not going to release its read locks until it finishes, and it cannot finish unless it obtains write locks. Query A and query B are in a stalemate, hence a deadlock.

Therefore, you may want to take an explicit table lock, both to avoid a huge number of record locks (which use memory and affect performance) and to avoid deadlocks.
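
A minimal sketch of the explicit table lock, assuming a hypothetical table mytable(y, x). Under LOCK TABLES, MySQL does not let one statement refer to a locked table twice under the same name, so the inner SELECT reads the table through an alias (src, a name chosen here) that is locked separately:

```sql
-- Assumes a hypothetical table mytable(y INT, x INT) already exists.
-- Lock the table for writing, and lock the alias used by the inner SELECT for reading.
LOCK TABLES mytable WRITE, mytable AS src READ;

UPDATE mytable
JOIN (SELECT y, MIN(x) AS MinX FROM mytable AS src GROUP BY y) AS t1 USING (y)
SET x = x - MinX;

-- Release the table locks.
UNLOCK TABLES;
```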

An alternative approach is to use SELECT ... FOR UPDATE on your inner SELECT. This starts with write locks on all the rows instead of starting with read locks and escalating them.
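
One way to sketch the same idea is a variant that takes the write locks in a separate locking read inside the same transaction, rather than inside the inner SELECT itself. The table mytable(y, x) is a hypothetical stand-in:

```sql
-- Assumes a hypothetical InnoDB table mytable(y INT, x INT) already exists.
START TRANSACTION;

-- Take exclusive row locks up front, so no shared-to-exclusive escalation is needed later.
SELECT y, x FROM mytable FOR UPDATE;

UPDATE mytable
JOIN (SELECT y, MIN(x) AS MinX FROM mytable GROUP BY y) AS t1 USING (y)
SET x = x - MinX;

-- Committing releases the locks.
COMMIT;
```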

Derived tables

For the inner SELECT, MySQL creates a derived table. A derived table is an actual non-indexed copy of the data that lives in a temporary table created automatically by MySQL (as opposed to a temporary table that you create explicitly and can add indexes to).

Since MySQL uses a derived table, that is the temporary copy of the old values you refer to in your question. In other words, there is no magic here. MySQL does it just as you would do it anywhere else, with a temporary value.
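
Spelled out by hand, the same "store the old values somewhere else first" idea might look like the following. This is an illustration of the concept rather than what MySQL does internally; mytable and tmp_min are hypothetical names:

```sql
-- Assumes a hypothetical table mytable(y INT, x INT) already exists.
-- Step 1: copy the per-group minimums of the OLD values into a separate table.
CREATE TEMPORARY TABLE tmp_min AS
  SELECT y, MIN(x) AS MinX FROM mytable GROUP BY y;

-- Step 2: update from the saved copy, which no longer changes as rows are modified.
UPDATE mytable
JOIN tmp_min USING (y)
SET x = x - MinX;

-- Step 3: discard the copy.
DROP TEMPORARY TABLE tmp_min;
```

Unlike the automatic derived table, an explicit temporary table like this could also be given an index on y if the join turned out to be slow.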

You can see the derived table by running EXPLAIN on your UPDATE statement (supported in MySQL 5.6+).
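
For example, with a hypothetical table mytable(y, x) in place of the one in the question:

```sql
-- Assumes a hypothetical table mytable(y INT, x INT) already exists.
-- EXPLAIN only shows the plan; it does not modify any rows.
EXPLAIN
UPDATE mytable
JOIN (SELECT y, MIN(x) AS MinX FROM mytable GROUP BY y) AS t1 USING (y)
SET x = x - MinX;
```

The inner SELECT should show up as a row with select_type DERIVED, though the exact plan depends on the MySQL version and optimizer settings.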

+3

A proper RDBMS uses statement-level read consistency, which ensures that a statement sees (selects) the data as it was when the statement started. So the scenario you are afraid of will not happen.

Yours faithfully,
Rob

+2

Oracle has this in the 11.2 documentation:

A consistent result set is provided for every query, guaranteeing data consistency, with no action required by the user. An implicit query, such as a query implied by a WHERE clause in an UPDATE statement, is guaranteed a consistent set of results. However, each statement in an implicit query does not see the changes made by the DML statement itself, but sees the data as it existed before the changes were made.

+1

Although it has been noted that you SHOULD NOT update a table based on its own data, you should be able to adjust the MySQL syntax to allow it using

 update Table1, (select T2.y, MIN( T2.x ) as MinX from Table1 T2 group by T2.y ) PreQuery set Table1.x = Table1.x - PreQuery.MinX where Table1.y = PreQuery.y 

I don't know whether the JOIN syntax takes a different route than the comma-list version, but with a complete prequery like this, the prequery has to be applied first, so that its result is computed ONCE and then joined (via the WHERE) to actually perform the update.

0
