Atomically select an existing row or insert a new one in generic SQL, without conflicts

I have a system that has a complex key for interacting with external systems and a quick, small, opaque primary key for internal use. For example, the external key can be a compound value, something like (first name (varchar), last name (varchar), zip code (char)), and the internal key will be an integer ("customer ID").

When I receive an incoming request with an external key, I need to look up the internal key, and the hard part is assigning a new internal key if I do not already have one for this external ID.

Obviously, if only one client is talking to the database at a time, this is easy: SELECT customer_id FROM customers WHERE given_name = 'foo' AND ... , then INSERT INTO customers VALUES (...) if I do not find the value. But if many requests can arrive from external systems at the same time, and several of them can refer to the same previously unseen customer, there is a race condition in which several clients try to INSERT the new row.

If I were modifying an existing row, that would be easy: just SELECT FOR UPDATE first to acquire the appropriate row-level lock before doing the UPDATE. But in this case I have no row to lock, because the row does not exist yet!

So far I have come up with several solutions, but each of them has quite serious problems:

  • Catch the error on INSERT and retry the entire transaction from the top. This is a problem if the transaction involves a dozen customers, especially if the incoming data may mention the same customers in a different order each time. It can get stuck in mutually recursive deadlock cycles, where each retry conflicts with a different customer. You can mitigate this with exponential backoff between retries, but that is a slow and expensive way to deal with conflicts. It also complicates the application code considerably, since everything has to be restartable.
  • Use savepoints. Start a savepoint before the SELECT, catch the error on INSERT, then roll back to the savepoint and SELECT again. Savepoints are not fully portable, and their semantics and capabilities differ slightly and subtly between databases; the biggest difference I have noticed is that sometimes they seem to nest and sometimes they don’t, so it would be nice if I could avoid them. This is only a vague impression, though. Is it inaccurate? Are savepoints standardized, or at least consistent in practice? In addition, savepoints make it difficult to do work in parallel within a single transaction, because you cannot tell exactly how much work you will roll back, although I understand I may just have to live with that.
  • Acquire some global lock, for example a table-level lock using the LOCK statement (Oracle, MySQL, Postgres). This obviously slows these operations down and causes a lot of lock contention, so I would rather avoid it.
  • Acquire a finer-grained but database-specific lock. I am only familiar with the Postgres way of doing this, which is definitely not supported in other databases (the functions even start with "pg_"), so again this is a portability problem. Also, the Postgres way would require me to convert the key to a pair of integers somehow, which it might not fit neatly into. Is there a better way to acquire locks on hypothetical objects?
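For reference, the savepoint option from the list above could look roughly like the sketch below. It uses Python's built-in sqlite3 module only as a stand-in database; the customers schema, the column names, and the get_or_create_customer helper are assumptions made for illustration, not anything from the system described. It relies on a unique constraint over the external key so that a racing INSERT fails rather than creating a duplicate.

```python
import sqlite3

# Hypothetical schema: the UNIQUE constraint on the external key is what
# makes a racing INSERT fail instead of silently creating a duplicate.
SCHEMA = """
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    given_name  TEXT NOT NULL,
    family_name TEXT NOT NULL,
    zip_code    TEXT NOT NULL,
    UNIQUE (given_name, family_name, zip_code)
)
"""

SELECT_ID = (
    "SELECT customer_id FROM customers "
    "WHERE given_name = ? AND family_name = ? AND zip_code = ?"
)


def get_or_create_customer(conn, given_name, family_name, zip_code):
    """Return the internal ID for an external key, inserting a row if needed."""
    key = (given_name, family_name, zip_code)
    conn.execute("SAVEPOINT get_or_create")
    try:
        row = conn.execute(SELECT_ID, key).fetchone()
        if row is not None:
            return row[0]
        try:
            cur = conn.execute(
                "INSERT INTO customers (given_name, family_name, zip_code) "
                "VALUES (?, ?, ?)",
                key,
            )
            return cur.lastrowid
        except sqlite3.IntegrityError:
            # Another writer inserted the same key first. Undo only the work
            # done since the savepoint, then read the row they created.
            conn.execute("ROLLBACK TO SAVEPOINT get_or_create")
            return conn.execute(SELECT_ID, key).fetchone()[0]
    finally:
        # Releasing keeps the savepoint stack clean on every path.
        conn.execute("RELEASE SAVEPOINT get_or_create")


if __name__ == "__main__":
    # isolation_level=None: transactions are controlled explicitly below.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(SCHEMA)
    conn.execute("BEGIN")
    first = get_or_create_customer(conn, "Ada", "Lovelace", "12345")
    again = get_or_create_customer(conn, "Ada", "Lovelace", "12345")
    conn.execute("COMMIT")
    print(first == again)
```

With a single connection the conflict branch is never taken; it only matters when another session commits the same key between the SELECT and the INSERT. Whether the second SELECT can then see that row depends on the isolation level of the database in use.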

It seems to me that this must be a common problem with databases, but I could not find many resources on it, perhaps only because I don’t know the canonical phrasing. Is it possible to do this with some simple extra bit of syntax in any of the tagged databases?

+7
3 answers

I don’t understand why you cannot use INSERT IGNORE, which completes without an error even when the key already exists, and then check whether an insert actually occurred (rows affected). If nothing was inserted, you know the key already exists and you can do the SELECT. In other words, do the INSERT first, then the SELECT.
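A minimal sketch of that insert-first, select-second order, under some assumptions: it runs against SQLite through Python's sqlite3 module, where the equivalent of MySQL's INSERT IGNORE is spelled INSERT OR IGNORE, and the table layout and the insert_then_select helper are made up for the example. The cursor's rowcount plays the role of the "rows affected" check.

```python
import sqlite3

# Hypothetical table; the UNIQUE constraint is what lets the database
# detect (and silently skip) a duplicate external key.
SCHEMA = """
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    given_name  TEXT NOT NULL,
    family_name TEXT NOT NULL,
    zip_code    TEXT NOT NULL,
    UNIQUE (given_name, family_name, zip_code)
)
"""


def insert_then_select(conn, given_name, family_name, zip_code):
    """Return (customer_id, created) by inserting first and selecting second."""
    key = (given_name, family_name, zip_code)
    # SQLite spelling; MySQL would use "INSERT IGNORE INTO ...".
    cur = conn.execute(
        "INSERT OR IGNORE INTO customers (given_name, family_name, zip_code) "
        "VALUES (?, ?, ?)",
        key,
    )
    created = cur.rowcount == 1  # 0 rows affected means the key already existed
    row = conn.execute(
        "SELECT customer_id FROM customers "
        "WHERE given_name = ? AND family_name = ? AND zip_code = ?",
        key,
    ).fetchone()
    return row[0], created


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    print(insert_then_select(conn, "Ada", "Lovelace", "12345"))
    print(insert_then_select(conn, "Ada", "Lovelace", "12345"))
```

Because the INSERT never raises on a duplicate, no retry loop or savepoint is needed around it; the cost is that the syntax is vendor-specific.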

Alternatively, if you use MySQL, use InnoDB, which supports transactions. This will make rollbacks easier.

+3

Perform the "find or create" operation for each customer in autocommit mode, before and outside the main multi-customer transaction.
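One way to read this suggestion, sketched with Python's sqlite3 module as a stand-in: one connection runs in autocommit mode and resolves every customer ID up front, and a second connection then runs the main transaction using only those IDs. The two tables, the orders payload, and the helper names are all hypothetical. The consequence worth noticing is that a customer row created in the first phase stays in place even if the main transaction later rolls back.

```python
import os
import sqlite3
import tempfile

SELECT_ID = (
    "SELECT customer_id FROM customers "
    "WHERE given_name = ? AND family_name = ? AND zip_code = ?"
)


def open_connections(path):
    """Open an autocommit lookup connection and a separate main connection."""
    # isolation_level=None: each statement commits on its own unless we BEGIN.
    lookup_conn = sqlite3.connect(path, isolation_level=None)
    main_conn = sqlite3.connect(path, isolation_level=None)
    lookup_conn.execute(
        "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, "
        "given_name TEXT NOT NULL, family_name TEXT NOT NULL, "
        "zip_code TEXT NOT NULL, UNIQUE (given_name, family_name, zip_code))"
    )
    lookup_conn.execute(
        "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, "
        "customer_id INTEGER NOT NULL, item TEXT NOT NULL)"
    )
    return lookup_conn, main_conn


def find_or_create(lookup_conn, key):
    """Autocommit find-or-create; commits immediately, so it holds no locks."""
    lookup_conn.execute(
        "INSERT OR IGNORE INTO customers (given_name, family_name, zip_code) "
        "VALUES (?, ?, ?)",
        key,
    )
    return lookup_conn.execute(SELECT_ID, key).fetchone()[0]


def process_batch(lookup_conn, main_conn, orders):
    """orders is a list of (external_key, item) pairs."""
    # Phase 1: resolve every customer outside the main transaction.
    ids = [find_or_create(lookup_conn, key) for key, _ in orders]
    # Phase 2: the main transaction only ever sees internal IDs.
    main_conn.execute("BEGIN")
    try:
        for customer_id, (_, item) in zip(ids, orders):
            main_conn.execute(
                "INSERT INTO orders (customer_id, item) VALUES (?, ?)",
                (customer_id, item),
            )
        main_conn.execute("COMMIT")
    except Exception:
        main_conn.execute("ROLLBACK")
        raise
    return ids


if __name__ == "__main__":
    db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
    lookup, main = open_connections(db_path)
    ada = ("Ada", "Lovelace", "12345")
    print(process_batch(lookup, main, [(ada, "book"), (ada, "pen")]))
```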

+1

With regard to creating the opaque primary key, there are a number of options, for example a GUID or (at least with Oracle) a sequence. With regard to keeping the external key unique, apply a unique constraint to those columns. If the insert fails because the key already exists, try the fetch again. You can use an INSERT with WHERE NOT EXISTS or WHERE NOT IN. Use a stored procedure to reduce round trips and improve performance.
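A rough sketch of how the unique constraint, the INSERT with WHERE NOT EXISTS, and the fetch-again fallback fit together, again using Python's sqlite3 module as a stand-in. The schema and the fetch_or_insert helper are assumptions for illustration, and the SELECT-without-FROM form of the conditional insert is SQLite's spelling (other databases may need a dummy table such as Oracle's DUAL). The NOT EXISTS check alone is not guaranteed to be race-free under concurrency, which is why the unique constraint stays as the backstop.

```python
import sqlite3

# Hypothetical schema with the unique constraint on the external key.
SCHEMA = """
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    given_name  TEXT NOT NULL,
    family_name TEXT NOT NULL,
    zip_code    TEXT NOT NULL,
    UNIQUE (given_name, family_name, zip_code)
)
"""

# Insert only when no row with this external key is visible yet.
INSERT_IF_ABSENT = """
INSERT INTO customers (given_name, family_name, zip_code)
SELECT ?, ?, ?
WHERE NOT EXISTS (
    SELECT 1 FROM customers
    WHERE given_name = ? AND family_name = ? AND zip_code = ?
)
"""

SELECT_ID = (
    "SELECT customer_id FROM customers "
    "WHERE given_name = ? AND family_name = ? AND zip_code = ?"
)


def fetch_or_insert(conn, given_name, family_name, zip_code):
    """Return the internal ID, inserting the row first if it is absent."""
    key = (given_name, family_name, zip_code)
    try:
        # The key is bound twice: once for the values, once for the check.
        conn.execute(INSERT_IF_ABSENT, key + key)
    except sqlite3.IntegrityError:
        # A concurrent writer got in between the NOT EXISTS check and the
        # insert; the unique constraint caught it, so just fetch again.
        pass
    return conn.execute(SELECT_ID, key).fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    print(fetch_or_insert(conn, "Ada", "Lovelace", "12345"))
    print(fetch_or_insert(conn, "Ada", "Lovelace", "12345"))
```

In a database that aborts the whole transaction on a constraint violation, the except branch would need to run in autocommit mode or inside a savepoint for the follow-up fetch to work.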

0
