I would recommend an alternative data design. This type of key-and-sequence pattern is very difficult to implement correctly in a relational database, and the disadvantages often outweigh the benefits.
You have quite a few options, but the simplest of them begin by breaking the table into two parts:
    CREATE TABLE DRAWING
    (
        ID INT IDENTITY(1, 1),
        PRIMARY KEY (ID)
    );

    CREATE TABLE DRAWING_REVISION
    (
        ID INT IDENTITY(1, 1),
        DRAWING_ID INT,
        INFO VARCHAR(50),
        PRIMARY KEY (ID),
        CONSTRAINT FK_DRAWING_REVISION_DRAWING FOREIGN KEY (DRAWING_ID) REFERENCES DRAWING(ID)
    );
This lets you represent the data accurately with no additional effort on your part. To add a new revision to a drawing, just add a row to the DRAWING_REVISION table. Because the primary keys use the IDENTITY specification, you never have to look up the next ID yourself.
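As a minimal sketch of that workflow, assuming the two tables defined above (the INFO values are purely illustrative):

```sql
-- Create a drawing and capture the ID the server generated for it.
DECLARE @DRAWING_ID INT;
INSERT DRAWING DEFAULT VALUES;
SET @DRAWING_ID = SCOPE_IDENTITY();

-- Each new revision is just another row; the server assigns DRAWING_REVISION.ID.
INSERT DRAWING_REVISION (DRAWING_ID, INFO) VALUES (@DRAWING_ID, 'Initial layout');
INSERT DRAWING_REVISION (DRAWING_ID, INFO) VALUES (@DRAWING_ID, 'Moved title block');

-- Sorting by ID returns the revisions in the order they were created.
SELECT ID, INFO
FROM DRAWING_REVISION
WHERE DRAWING_ID = @DRAWING_ID
ORDER BY ID;
```

The IDs are ordered but not necessarily consecutive per drawing, which is exactly the limitation the rest of this answer deals with.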
The obvious solution and its drawback
If you need a human-readable revision number, rather than one that only your server will ever look at, there are two ways to do this. Both start by adding REV INT to the data definition for DRAWING_REVISION, along with CONSTRAINT UK_DRAWING_REVISION_DRAWING_ID_REV UNIQUE (DRAWING_ID, REV). The trick, of course, is working out the next revision number for a given drawing.
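For reference, the revised definition might look like the sketch below. It assumes you are creating DRAWING_REVISION from scratch; for a table that already exists you would use the equivalent ALTER TABLE statements instead.

```sql
CREATE TABLE DRAWING_REVISION
(
    ID INT IDENTITY(1, 1),
    DRAWING_ID INT,
    REV INT,                -- human-readable revision number, per drawing
    INFO VARCHAR(50),
    PRIMARY KEY (ID),
    CONSTRAINT FK_DRAWING_REVISION_DRAWING FOREIGN KEY (DRAWING_ID) REFERENCES DRAWING(ID),
    -- No two revisions of the same drawing may share a revision number.
    CONSTRAINT UK_DRAWING_REVISION_DRAWING_ID_REV UNIQUE (DRAWING_ID, REV)
);
```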
If you only ever expect a small number of concurrent users, you can simply SELECT MAX(REV) + 1 FROM DRAWING_REVISION WHERE DRAWING_ID = @DRAWING_ID, either in application code or in an INSTEAD OF INSERT trigger. Under high concurrency, however, or simply with bad timing, users can block each other, because two of them may try to insert the same combination of DRAWING_ID and REV into DRAWING_REVISION.
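Here is a rough sketch of the application-side variant. It assumes the REV column and unique constraint described above already exist and that a drawing with ID 1 is present. The COALESCE is my own addition to cover the first revision, when MAX(REV) is NULL.

```sql
DECLARE @DRAWING_ID INT;
SET @DRAWING_ID = 1;  -- hypothetical drawing

DECLARE @NEXT_REV INT;

-- Step 1: work out the next revision number for this drawing.
SELECT @NEXT_REV = COALESCE(MAX(REV), 0) + 1
FROM DRAWING_REVISION
WHERE DRAWING_ID = @DRAWING_ID;

-- Step 2: insert it. If another session ran step 1 at the same moment,
-- both computed the same @NEXT_REV: one INSERT succeeds and the other
-- fails on UK_DRAWING_REVISION_DRAWING_ID_REV (error 2627).
INSERT DRAWING_REVISION (DRAWING_ID, REV, INFO)
VALUES (@DRAWING_ID, @NEXT_REV, 'Example revision');
```

The gap between step 1 and step 2 is the window in which the conflict can happen.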
A bit of background
There is really only one solution to this problem, although explaining why there is only one requires a bit of background. Consider the following code:

    BEGIN TRAN
    INSERT DRAWING DEFAULT VALUES;
    INSERT DRAWING DEFAULT VALUES;
    SELECT ID FROM DRAWING;
    -- Roll back so the two rows (and their IDENTITY values) are never committed.
    ROLLBACK TRAN
Of course, the output will be different on each subsequent execution. Behind the scenes, SQL Server hands out the IDENTITY values and increments a counter. If you never commit a value, the server makes no attempt to "backfill" the holes in the sequence: values are handed out on a forward-only basis.
This is a feature, not a bug. IDENTITY columns are designed to be unique and ordered, but not necessarily tightly packed. The only way to guarantee tight packing is to serialize all incoming requests, making sure that each one either commits or rolls back before the next one starts. Otherwise, the server might backfill an IDENTITY value that was handed out half an hour ago, only to have the long-running transaction that originally received that value commit a row with a duplicate primary key.
(It is worth noting that when I say "transaction", I do not necessarily mean a T-SQL TRANSACTION, although I would recommend using them. It can be absolutely any procedure, on the application side or the SQL Server side, that takes some amount of time, even if that is just the time it takes to SELECT the next revision number and immediately afterwards INSERT the new DRAWING_REVISION.)
Attempting to backfill values is just serialization in disguise: when two INSERT requests arrive at the same time, it punishes whichever one commits second. That forces the later request to retry (perhaps several times, until it happens not to conflict). Only one submission succeeds at a time, which is serialization, but without the benefit of a queue.
The SELECT MAX(REV) + 1 approach has the same drawback. Naturally, the MAX approach does not attempt to backfill values, but it does force every concurrent request to fight over the same revision number, with the same result.
Why is that bad? Database systems are designed for parallelism and concurrency: that ability is one of the main advantages of a managed database over a flat-file format.
Faking it right
So, after all that long-winded exposition, what can you do to solve the problem? You could cross your fingers and hope that you never see many simultaneous users, but why would you want to rule out widespread use of your own application? After all, you do not want success to be your downfall.
The solution is to do what SQL Server does with IDENTITY columns: hand the numbers out, and then throw them away. You can use something like the following SQL code, or the equivalent application code:
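    ALTER TABLE DRAWING ADD REV INT NOT NULL DEFAULT(0);
    GO

    CREATE PROCEDURE GET_REVISION_NUMBER (@DRAWING_ID INT) AS
    BEGIN
        DECLARE @ATTEMPTS INT;
        SET @ATTEMPTS = 0;
        DECLARE @ATTEMPT_LIMIT INT;
        SET @ATTEMPT_LIMIT = 5;
        DECLARE @CURRENT_REV INT;
        DECLARE @UPDATED INT;

        RETRY:
        SET @CURRENT_REV = (SELECT REV FROM DRAWING WHERE DRAWING.ID = @DRAWING_ID);

        -- Succeeds only if nobody else has bumped REV since we read it.
        UPDATE DRAWING SET REV = @CURRENT_REV + 1
        WHERE DRAWING.ID = @DRAWING_ID AND REV = @CURRENT_REV;

        -- Capture @@ROWCOUNT right away: the next statement, even a SET, resets it.
        SET @UPDATED = @@ROWCOUNT;
        SET @ATTEMPTS = @ATTEMPTS + 1;

        IF (@UPDATED = 0)
        BEGIN
            -- A procedure cannot return NULL as its status, so 0 signals failure
            -- (real revision numbers start at 1).
            IF (@ATTEMPTS >= @ATTEMPT_LIMIT) RETURN 0;
            GOTO RETRY;
        END

        RETURN @CURRENT_REV + 1;
    END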
The @@ROWCOUNT check is very important. This procedure should not be transactional, because you do not want to hide conflicts between simultaneous requests; you want to resolve them. The only way to be sure your update definitely went through is to check whether any rows were updated.
Of course, you may have guessed that this approach is not flawless. The only way to "resolve" conflicts is to try several times before giving up. No home-brewed solution will ever be as good as one hard-coded into the database server software. But it can get pretty close!
The stored procedure does not eliminate conflicts, but it greatly shrinks the window in which one can occur. Instead of "reserving" a revision number for the duration of a pending INSERT transaction, you grab the latest revision number and update the static counter as quickly as possible, then get out of the way of the next call to GET_REVISION_NUMBER. (This is still serialization, of course, but only for the tiny part of the procedure that has to run sequentially. Unlike with many other methods, the rest of the algorithm is free to run in parallel.)
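A caller might use the procedure roughly as follows. This is only a sketch: it assumes the REV column and unique constraint on DRAWING_REVISION described earlier, a hypothetical drawing with ID 1, and that a failed call comes back as 0 (SQL Server does not let a procedure return NULL as its status).

```sql
DECLARE @DRAWING_ID INT;
SET @DRAWING_ID = 1;  -- hypothetical drawing

-- Reserve the number first; this is the only serialized step.
DECLARE @REV INT;
EXEC @REV = GET_REVISION_NUMBER @DRAWING_ID;

IF (@REV IS NULL OR @REV = 0)
BEGIN
    -- Every attempt collided with another caller; let the application decide what to do.
    RAISERROR('Could not reserve a revision number; try again later.', 16, 1);
END
ELSE
BEGIN
    -- The number is ours now. If this INSERT fails or is rolled back,
    -- the number is simply skipped, just like an unused IDENTITY value.
    INSERT DRAWING_REVISION (DRAWING_ID, REV, INFO)
    VALUES (@DRAWING_ID, @REV, 'Example revision');
END
```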
My team used a solution similar to the one described above, and we found that the rate of blocking conflicts dropped by several orders of magnitude. We were able to submit thousands of back-to-back requests from half a dozen machines on the local network before one of them got stuck.
The stuck machine had fallen into a loop, requesting a new number from SQL Server and always getting the failure result back. It could not get a word in edgewise, so to speak. This is similar to the conflict behavior in the SELECT MAX case, but much, much rarer. You trade the guaranteed consecutive numbering of the SELECT MAX approach (and any related approach) for a thousandfold increase in scalability. This trade-off is more or less fundamental: to my knowledge, there is no solution that is both guaranteed-consecutive and non-serialized.
The takeaway
Of course, this whole approach rests on the need for a localized, semi-sequential number. If you can live with less user-friendly revision numbers, you can simply expose DRAWING_REVISION.ID. (Exposing surrogate keys is questionable in its own way, though, if you ask me.)
The real takeaway here is that custom identity columns are harder to implement than they first appear, and any application that might one day need to scale should be very careful about how it obtains new custom identity values.