T-SQL: custom identity based on a custom identity? (Managing database revisions)

Question

I would like to create a custom identity based on another custom identity, or perhaps something similar to an identity column that functions as an auto-incrementing key.

For example, if I have a primary key for a drawing, I would like its revision to be based on the drawing number.

Example

  DRAWING
 ID |  REV |  INFO
 ------ + ------- + ------
 1 |  0 |  "Draw1"
 2 |  0 |  "Draw2"
 2 |  1 |  "Draw2Edit"
 2 |  2 |  "Draw2MoreEdit"
 3 |  0 |  "Draw3"
 4 |  0 |  "Draw4"

If I had to insert a few more records into my table:

 INSERT INTO DRAWING (INFO) VALUES ('Draw5')
 INSERT INTO DRAWING (ID, INFO) VALUES (3, 'Draw3Edit')

My table:

  DRAWING
 ID |  REV |  INFO
 ------ + ------- + ------
 1 |  0 |  "Draw1"
 2 |  0 |  "Draw2"
 2 |  1 |  "Draw2Edit"
 2 |  2 |  "Draw2MoreEdit"
 3 |  0 |  "Draw3"
 3 |  1 |  "Draw3Edit" --NEW ROW
 4 |  0 |  "Draw4"
 5 |  0 |  "Draw5" --NEW ROW

T-SQL:

 CREATE TABLE DRAWING (
     ID INT,
     REV INT,
     INFO VARCHAR(50),
     PRIMARY KEY (ID, REV)
 );

 CREATE TABLE CURRENT_DRAWING (
     ID INT IDENTITY (1,1),
     DRAWING_ID INT,
     DRAWING_REV INT,
     PRIMARY KEY (ID),
     FOREIGN KEY (DRAWING_ID, DRAWING_REV) REFERENCES DRAWING (ID, REV)
         ON UPDATE CASCADE
         ON DELETE CASCADE
 );

I am using SQL Server Management Studio 2005 and working against a SQL Server 2000 database.

I am also open to alternatives. The main goal is for the ID to auto-increment for new drawings. For a new revision of an existing drawing, the ID stays the same and REV increments.

Update:

I think I have it close to what I want:

 DROP TABLE DRAW
 GO

 CREATE TABLE DRAW (
     ID INT DEFAULT(0),
     REV INT DEFAULT(-1),
     INFO VARCHAR(10),
     PRIMARY KEY(ID, REV)
 )
 GO

 CREATE TRIGGER TRIG_DRAW ON DRAW
 FOR INSERT
 AS
 BEGIN
     DECLARE @newId INT, @newRev INT, @insId INT, @insRev INT

     SET TRANSACTION ISOLATION LEVEL READ COMMITTED
     BEGIN TRANSACTION

     SELECT @insId = ID FROM inserted
     SELECT @insRev = REV FROM inserted

     PRINT 'BEGIN TRIG'
     PRINT @insId
     PRINT @insRev
     PRINT @newId
     PRINT @newRev

     --IF ID=0 THEN IT IS A NEW ID
     IF @insId <=0
     BEGIN
         --NEW DRAWING ID=MAX+1 AND REV=0
         SELECT @newId = COALESCE(MAX(ID), 0) + 1 FROM DRAW
         SELECT @newRev = 0
     END
     ELSE --ELSE IT IS A NEW REV
     BEGIN
         --CHECK TO ENSURE ID EXISTS
         IF EXISTS(SELECT * FROM DRAW WHERE ID=@insId AND REV=0)
         BEGIN
             PRINT 'EXISTS'
             SELECT @newId = @insId
             SELECT @newRev = MAX(REV) + 1 FROM DRAW WHERE ID=@insID
         END
         ELSE --ID DOES NOT EXIST THEREFORE NO REVISION
         BEGIN
             RAISERROR 50000 'ID DOES NOT EXIST.'
             ROLLBACK TRANSACTION
             GOTO END_TRIG
         END
     END

     PRINT 'END TRIG'
     PRINT @insId
     PRINT @insRev
     PRINT @newId
     PRINT @newRev

     SELECT * FROM DRAW
     UPDATE DRAW SET ID=@newId , REV=@newRev WHERE ID=@insId

     COMMIT TRANSACTION
     END_TRIG:
 END
 GO

 INSERT INTO DRAW (INFO) VALUES ('DRAW1')
 INSERT INTO DRAW (INFO) VALUES ('DRAW2')
 INSERT INTO DRAW (ID,INFO) VALUES (2,'DRAW2EDIT1') --PROBLEM HERE
 INSERT INTO DRAW (ID,INFO) VALUES (2,'DRAW2EDIT2')
 INSERT INTO DRAW (INFO) VALUES ('DRAW3')
 INSERT INTO DRAW (INFO) VALUES ('DRAW4')
 GO

 --SHOULD THROW
 INSERT INTO DRAW (ID,INFO) VALUES (9,'DRAW9')
 GO

 SELECT * FROM DRAW
 GO

However, I keep getting "Violation of PRIMARY KEY constraint".

I added debug statements, and from their output it does not look like I should be violating my primary key:

  BEGIN TRIG
 0
 -1


 END TRIG
 0
 -1
 1
 0

 (1 row(s) affected)

 (1 row(s) affected)

 (1 row(s) affected)
 BEGIN TRIG
 0
 -1


 END TRIG
 0
 -1
 2
 0

 (2 row(s) affected)

 (1 row(s) affected)

 (1 row(s) affected)
 BEGIN TRIG
 2
 -1


 EXISTS
 END TRIG
 2
 -1
 2
 1

 (3 row(s) affected)
 Msg 2627, Level 14, State 1, Procedure TRIG_DRAW, Line 58
 Violation of PRIMARY KEY constraint 'PK__DRAW__56D3D912'.  Cannot insert duplicate key in object 'DRAW'.
 The statement has been terminated.

It prints

  ID |  REV |  INFO
 ---- + -------- + ------------
 1 |  0 |  Draw1
 2 |  -1 |  DRAW2EDIT1 --This row is being updated to 2 1
 2 |  0 |  Draw2

before it fails with the error, while the row (2, -1) is being updated to (2, 1). That should not violate my primary key.

+6

sql tsql identity sql-server-2000

user295190 Dec 17 '10 at 16:10

2 answers

You can create an insert trigger that sets the value of REV:

 CREATE TRIGGER RevTrigger ON DRAWING
 FOR INSERT
 AS
 WITH ins AS (
     SELECT ID,
            {another-column}, -- needed by the join below
            ROW_NUMBER() OVER (PARTITION BY ID ORDER BY {another-column}) AS sequence
     FROM inserted
     WHERE REV IS NULL -- only update rows where REV is not included
 ),
 draw AS (
     SELECT ID, MAX(REV) AS REV
     FROM DRAWING
     GROUP BY ID
 )
 UPDATE DRAWING
 SET REV = COALESCE(draw.REV + ins.sequence, 0)
 FROM DRAWING
 JOIN ins ON DRAWING.ID = ins.ID AND DRAWING.{another-column} = ins.{another-column}
 JOIN draw ON DRAWING.ID = draw.ID

You did not specify how REV should be assigned if several rows with the same ID are inserted at once. In other words, how should the revisions be numbered if several of them are added at the same time?

This solution assumes that, in that case, there is an additional column that determines the order of the revisions (see {another-column} above). If you have no such column, replace ORDER BY {another-column} in the ROW_NUMBER function with a constant ordering such as ORDER BY (SELECT 0), since SQL Server does not accept a bare integer constant there, and remove the condition AND DRAWING.{another-column} = ins.{another-column}. After that change, all inserted rows with the same ID will receive the same REV.

EDIT

 CREATE TRIGGER RevTrigger ON DRAWING
 FOR INSERT
 AS
 UPDATE DRAWING
 SET REV = COALESCE(draw.REV + 1, 0)
 FROM DRAWING
 JOIN inserted ON DRAWING.ID = inserted.ID
     AND DRAWING.{another-column} = inserted.{another-column}
     AND inserted.REV IS NULL
 JOIN (
     SELECT ID, MAX(REV) AS REV
     FROM DRAWING
     GROUP BY ID
 ) AS draw ON DRAWING.ID = draw.ID

+2

bobs Dec 17 '10 at 17:42

WCWedin · Accepted Answer · 2011-01-13T23:30:42+0000

I would recommend an alternative data design. This type of key-and-sequence pattern is very difficult to implement correctly in a relational database, and the disadvantages often outweigh the benefits.

You have quite a few options, but the simplest of them starts by splitting the table in two:

 CREATE TABLE DRAWING (
     ID INT IDENTITY(1, 1),
     PRIMARY KEY (ID)
 );

 CREATE TABLE DRAWING_REVISION (
     ID INT IDENTITY(1, 1),
     DRAWING_ID INT,
     INFO VARCHAR(50),
     PRIMARY KEY (ID),
     CONSTRAINT FK_DRAWING_REVISION_DRAWING FOREIGN KEY (DRAWING_ID) REFERENCES DRAWING(ID)
 );

This lets you represent the data accurately and correctly, with no additional effort on your part. Simply add a row to the DRAWING_REVISION table whenever you want to add a new revision to a drawing. Because the primary keys use the IDENTITY specification, you never have to look up the next ID yourself.
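As a rough sketch of how the two tables might be used, creating a drawing and then adding revisions to it could look like the following. The variable name @DRAWING_ID and the sample INFO values are only illustrative; SCOPE_IDENTITY() returns the last identity value generated in the current scope.

```sql
-- Create a new drawing and capture its generated ID.
DECLARE @DRAWING_ID INT;
INSERT DRAWING DEFAULT VALUES;
SET @DRAWING_ID = SCOPE_IDENTITY();

-- First revision of the new drawing.
INSERT INTO DRAWING_REVISION (DRAWING_ID, INFO) VALUES (@DRAWING_ID, 'Draw5');

-- A later edit is just another row pointing at the same drawing.
INSERT INTO DRAWING_REVISION (DRAWING_ID, INFO) VALUES (@DRAWING_ID, 'Draw5Edit');

-- Revisions of this drawing in the order they were added.
SELECT ID, INFO
FROM DRAWING_REVISION
WHERE DRAWING_ID = @DRAWING_ID
ORDER BY ID;
```

No revision number is computed anywhere in this sketch; the order of the revisions falls out of the IDENTITY values on DRAWING_REVISION.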

The obvious solution and its drawback

If you need a human-readable revision number, rather than one meant only for your server, there are two ways to do this. Both start by adding REV INT to the definition of DRAWING_REVISION, along with CONSTRAINT UK_DRAWING_REVISION_DRAWING_ID_REV UNIQUE (DRAWING_ID, REV). The trick, of course, is working out the next revision number for a given drawing.

If you expect to only ever have a small number of concurrent users, you can simply SELECT MAX(REV) + 1 FROM DRAWING_REVISION WHERE DRAWING_ID = @DRAWING_ID, either in application code or in an INSTEAD OF INSERT trigger. With high concurrency or bad luck, however, users can end up blocking each other, because they may try to insert the same combination of DRAWING_ID and REV into DRAWING_REVISION.
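A minimal sketch of that obvious approach, assuming DRAWING_REVISION has no rows yet when the column is added; the drawing ID 3 and the INFO value are placeholders taken from the question's example:

```sql
-- Schema change described above: a per-drawing revision number, unique within a drawing.
ALTER TABLE DRAWING_REVISION ADD REV INT;
ALTER TABLE DRAWING_REVISION ADD CONSTRAINT UK_DRAWING_REVISION_DRAWING_ID_REV
    UNIQUE (DRAWING_ID, REV);
GO

DECLARE @DRAWING_ID INT, @NEXT_REV INT;
SET @DRAWING_ID = 3;  -- hypothetical existing drawing

-- COALESCE covers the first revision, where MAX(REV) is NULL.
SELECT @NEXT_REV = COALESCE(MAX(REV) + 1, 0)
FROM DRAWING_REVISION
WHERE DRAWING_ID = @DRAWING_ID;

-- Two sessions can both compute the same @NEXT_REV; the unique constraint
-- then rejects whichever INSERT arrives second.
INSERT INTO DRAWING_REVISION (DRAWING_ID, REV, INFO)
VALUES (@DRAWING_ID, @NEXT_REV, 'Draw3Edit');
```

The gap between the SELECT and the INSERT is exactly where the conflict described above can occur.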

A bit of background

There is really only one solution to this problem, though explaining why requires a bit of background. Consider the following code:

 BEGIN TRAN
 INSERT DRAWING DEFAULT VALUES;
 INSERT DRAWING DEFAULT VALUES;
 SELECT ID FROM DRAWING; -- Output: 1, 2
 ROLLBACK TRAN

 BEGIN TRAN
 INSERT DRAWING DEFAULT VALUES;
 SELECT ID FROM DRAWING; -- Output: 3
 ROLLBACK TRAN

Of course, the output will differ on subsequent runs. Behind the scenes, SQL Server hands out IDENTITY values and increments a counter. If you never commit a value, the server makes no attempt to "back-fill" the holes in the sequence: values are handed out on a forward-only basis.

This is a feature, not a bug. IDENTITY columns are designed to be unique and ordered, but not necessarily tightly packed. The only way to guarantee tight packing is to serialize all incoming requests, making sure that each one either completes or aborts before the next begins; otherwise, the server could try to back-fill an IDENTITY value that was released half an hour ago, only to have a long-running transaction (that is, the original recipient of that IDENTITY value) come back and commit a row with a duplicate primary key.

(It is worth noting that when I say "transaction", I do not necessarily mean a T-SQL TRANSACTION, although I would recommend using them. It could be absolutely any procedure on the application or SQL Server side that takes some amount of time, even if that is only the time it takes to SELECT the next revision number and, immediately afterwards, INSERT the new DRAWING_REVISION.)

This attempt to back-fill values is just serialization in disguise, since, with two simultaneous INSERT requests, it punishes the second request to commit. The latecomer is forced to try again (possibly several times, until it happens not to conflict). Only one submission succeeds at a time: serialization, but without the benefit of a queue.

The SELECT MAX(REV) + 1 approach has the same drawback. Naturally, a MAX approach does not attempt to back-fill values, but it does force every concurrent request to fight over the same revision number, with the same results.
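To make that fight concrete, here is a sketch of the retry loop a caller of the MAX approach ends up writing. It assumes the REV column and unique constraint described earlier; the drawing ID, the INFO value, and the retry budget @ATTEMPT_LIMIT are hypothetical. It uses @@ERROR rather than TRY/CATCH so that it also fits SQL Server 2000, and 2627 is the duplicate-key error number that appears in the question's output.

```sql
DECLARE @DRAWING_ID INT, @NEXT_REV INT, @ERR INT, @ATTEMPTS INT, @ATTEMPT_LIMIT INT;
SET @DRAWING_ID = 3;      -- hypothetical existing drawing
SET @ATTEMPTS = 0;
SET @ATTEMPT_LIMIT = 5;   -- hypothetical retry budget
SET @ERR = 2627;          -- so the loop body runs at least once

WHILE @ERR = 2627 AND @ATTEMPTS < @ATTEMPT_LIMIT
BEGIN
    SELECT @NEXT_REV = COALESCE(MAX(REV) + 1, 0)
    FROM DRAWING_REVISION
    WHERE DRAWING_ID = @DRAWING_ID;

    INSERT INTO DRAWING_REVISION (DRAWING_ID, REV, INFO)
    VALUES (@DRAWING_ID, @NEXT_REV, 'Draw3Edit');

    -- Read @@ERROR immediately; the next statement resets it.
    SET @ERR = @@ERROR;
    SET @ATTEMPTS = @ATTEMPTS + 1;
END

-- @ERR = 0 here means the row went in; 2627 means every attempt lost the race.
```

Every session that loses the race repeats the whole SELECT and INSERT, so under load the sessions effectively take turns.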

Why is that bad? Database systems are designed for parallelism and concurrency: this ability is one of the main advantages of a managed database over a flat file format.

Faking it right

So, after all that long exposition, what can you do to solve the problem? You could cross your fingers and hope that you will never see many simultaneous users, but why would you want to bet against the widespread use of your own application? After all, you do not want success to be your downfall.

The solution is to do what SQL Server does with IDENTITY columns: hand the values out, and then throw them away. You can use something like the following SQL code, or the equivalent application code:

 ALTER TABLE DRAWING ADD REV INT NOT NULL DEFAULT(0);
 GO

 CREATE PROCEDURE GET_REVISION_NUMBER (@DRAWING_ID INT) AS
 BEGIN
     DECLARE @ATTEMPTS INT;
     SET @ATTEMPTS = 0;
     DECLARE @ATTEMPT_LIMIT INT;
     SET @ATTEMPT_LIMIT = 5;
     DECLARE @CURRENT_REV INT;
     DECLARE @ROWS INT;

     LOOP:
     SET @CURRENT_REV = (SELECT REV FROM DRAWING WHERE DRAWING.ID = @DRAWING_ID);

     -- Claim the next number only if nobody has bumped the counter since we read it.
     UPDATE DRAWING SET REV = @CURRENT_REV + 1
     WHERE DRAWING.ID = @DRAWING_ID AND REV = @CURRENT_REV;

     -- Capture @@ROWCOUNT at once: any later statement, including SET, overwrites it.
     SET @ROWS = @@ROWCOUNT;
     SET @ATTEMPTS = @ATTEMPTS + 1;

     IF (@ROWS = 0)
     BEGIN
         -- A procedure cannot return NULL, so 0 signals failure (real numbers start at 1).
         IF (@ATTEMPTS >= @ATTEMPT_LIMIT) RETURN 0;
         GOTO LOOP;
     END

     RETURN @CURRENT_REV + 1;
 END

The @@ROWCOUNT check is very important: this procedure needs to be non-transactional, because you do not want to hide conflicts from concurrent requests; you want to resolve them. The only way to be sure that your update definitely went through is to check whether any rows were updated.

Of course, you may have guessed that this approach is not flawless. The only way to "resolve" conflicts is to try several times before giving up. No home-brewed solution will ever be as good as one hard-coded into the database server software. But it can get pretty close!

The stored procedure does not eliminate conflicts, but it greatly narrows the window in which a conflict can occur. Instead of "reserving" a revision number for a pending INSERT transaction, you obtain the latest revision number and update the static counter as quickly as possible, getting out of the way of the next call to GET_REVISION_NUMBER. (This is serialized, of course, but only for the very tiny part of the procedure that has to run sequentially; unlike in many other methods, the rest of the algorithm is free to run in parallel.)
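A sketch of how a caller might use GET_REVISION_NUMBER, assuming the REV column and unique constraint on DRAWING_REVISION described earlier. The drawing ID and INFO value are placeholders, and the failure check assumes the procedure reports failure as 0 or NULL rather than as a valid number:

```sql
DECLARE @DRAWING_ID INT, @REV INT;
SET @DRAWING_ID = 3;  -- hypothetical existing drawing

-- Claim a number first; this is the only step that contends with other callers.
EXEC @REV = GET_REVISION_NUMBER @DRAWING_ID;

IF @REV IS NULL OR @REV = 0
BEGIN
    -- Assumed failure signal: no number could be claimed within the attempt limit.
    RAISERROR('Could not obtain a revision number for drawing %d.', 16, 1, @DRAWING_ID);
END
ELSE
BEGIN
    -- The claimed number belongs to this caller alone, so the INSERT should not
    -- collide on (DRAWING_ID, REV). If it fails for some other reason, the number
    -- is simply skipped, leaving a gap.
    INSERT INTO DRAWING_REVISION (DRAWING_ID, REV, INFO)
    VALUES (@DRAWING_ID, @REV, 'Draw3Edit');
END
```

The slow part, the INSERT itself, happens after the counter has already moved on, which is why other callers are not held up by it.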

My team used a solution similar to the one described above, and we found that the frequency of blocking conflicts dropped by several orders of magnitude. We were able to send thousands of back-to-back requests from half a dozen machines on the local network before one of them got stuck.

The stuck machine was caught in a loop, requesting a new number from the SQL server and always getting back a failure result. It could not get a word in edgewise, so to speak. This is similar to the conflict behavior in the SELECT MAX case, but much, much rarer. You trade the guaranteed consecutive numbering of the SELECT MAX approach (and any related approach) for a thousand-fold increase in scalability. This trade-off is more or less fundamental: to my knowledge, there is no solution that is both guaranteed consecutive and non-serialized.

The takeaway

Of course, this whole approach hinges on the need for a localized, semi-sequential number. If you can live with less convenient revision numbers, you can simply expose DRAWING_REVISION.ID. (Exposing surrogate keys is questionable in its own way, though, if you ask me.)
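If exposing the surrogate key is acceptable, no counter is needed at all. A small sketch against the two-table design above (the drawing ID 2 is a placeholder):

```sql
DECLARE @DRAWING_ID INT;
SET @DRAWING_ID = 2;  -- hypothetical existing drawing

-- All revisions of one drawing, oldest first. ID is unique and increasing,
-- but the values for a single drawing are not consecutive.
SELECT ID AS REVISION_ID, INFO
FROM DRAWING_REVISION
WHERE DRAWING_ID = @DRAWING_ID
ORDER BY ID;

-- The current (most recent) revision of that drawing.
SELECT TOP 1 ID AS REVISION_ID, INFO
FROM DRAWING_REVISION
WHERE DRAWING_ID = @DRAWING_ID
ORDER BY ID DESC;
```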

The real takeaway here is that custom identity columns are harder to implement than you might first think, and any application that may one day need to scale must be very careful about how it obtains new custom identity values.
