Identity / primary key management on servers

Question

I'm in the middle of developing a new database that should support replication, and I'm stuck in deciding what to choose as the primary key.

In our current database, we use two int columns for the primary key: the first column is an identity, and the second describes which server the row was inserted on. I now want to avoid a two-column primary key and use a single column instead. So far, I see two ways to do this:

Use a GUID for my primary key
This ensures a unique key across any number of servers. What I don't like is that a GUID is 16 bytes, so when it is used as a foreign key in many tables it wastes space. It is also harder to work with when writing queries, and it will be slower to query.
Use int or bigint and manually specify the seed and increment value for each table on each server
For example, with two servers, table X on the first server starts at 1 and on the second server at 2, and each increments by 2. That gives (1, 3, 5, ...) on the first server and (2, 4, 6, ...) on the second. The good thing about this design is that it is easier to use when writing queries, it is fast, and it uses less space for foreign keys. The bad thing is that we never know how many servers there will be, so it is hard to say what the increment value should be. It is also more difficult to manage schema changes on the servers.
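To make the second option concrete, here is a minimal sketch for the two-server case. The constraint name and the Name column are made up for illustration; only the IDENTITY(seed, increment) clause matters.

```sql
-- Run on server 1: generates 1, 3, 5, ...
CREATE TABLE dbo.X
(
    Id   int IDENTITY(1, 2) NOT NULL
         CONSTRAINT PK_X PRIMARY KEY,
    Name nvarchar(100) NOT NULL   -- hypothetical payload column
);

-- Run on server 2 (a different instance): generates 2, 4, 6, ...
CREATE TABLE dbo.X
(
    Id   int IDENTITY(2, 2) NOT NULL
         CONSTRAINT PK_X PRIMARY KEY,
    Name nvarchar(100) NOT NULL   -- hypothetical payload column
);
```

Adding a third server would require an increment of 3 on every server, and the increment of an existing identity column cannot be changed in place, which is exactly the management problem described above.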

What is the best practice for managing keys across multiple servers, and which approach, if any, is best in this kind of situation?

sql-server identity primary-key

Mladen macanović Sep 08 '11 at 12:30

3 answers

Charl · Answer 1 · 2011-12-21T05:23:15+0000

Your question is good, and one that is often asked.

From a maintenance point of view, I would definitely go with GUIDs. They exist for a reason. Somewhere down the line you may run into complex operations for moving and replicating your data, and the other options can then make things more complicated than they need to be.

Here is a very good read on the various options:

http://msdn.microsoft.com/en-us/library/bb726011.aspx

As for the replication part: if it is done correctly, there are no real problems with replication.

Bryan · Answer 2 · 2011-09-08T13:04:32+0000

Update:

Found a simpler, manual method here. It involves using NOT FOR REPLICATION and staggering the seed values, as you mentioned in the comments.
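As a rough illustration of that manual approach (not necessarily what the linked method does), the sketch below gives each server its own starting seed and marks the identity column NOT FOR REPLICATION, so rows inserted by the replication agents keep their original values without advancing the local identity counter. The range size, constraint name, and Name column are assumptions.

```sql
-- Run on server 1: owns the range starting at 1
CREATE TABLE dbo.X
(
    Id   bigint IDENTITY(1, 1) NOT FOR REPLICATION NOT NULL
         CONSTRAINT PK_X PRIMARY KEY,
    Name nvarchar(100) NOT NULL   -- hypothetical payload column
);

-- Run on server 2 (a different instance): owns the range starting at 1,000,000,001
CREATE TABLE dbo.X
(
    Id   bigint IDENTITY(1000000001, 1) NOT FOR REPLICATION NOT NULL
         CONSTRAINT PK_X PRIMARY KEY,
    Name nvarchar(100) NOT NULL   -- hypothetical payload column
);
```

With staggered ranges instead of interleaved increments, a new server only needs a new, unused starting seed; the existing servers are left alone.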

Original:

Your best bet is something like the second option. Assign identity ranges to the replication publisher and subscriber instances, then enable automatic identity range management.

This article discusses options for managing identity columns in replication and enabling identity range management.
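For merge replication, automatic range management is switched on when the article is added. A minimal sketch, assuming a publication named MyPublication already exists and that dbo.X has an identity column; the range sizes and threshold are placeholder values, not recommendations:

```sql
-- Run at the Publisher, in the publication database.
EXEC sp_addmergearticle
    @publication   = N'MyPublication',          -- hypothetical publication name
    @article       = N'X',
    @source_owner  = N'dbo',
    @source_object = N'X',
    @identityrangemanagementoption = N'auto',   -- let replication hand out ranges
    @pub_identity_range = 100000,               -- range size allocated to republishing subscribers (placeholder)
    @identity_range     = 10000,                -- range size for the Publisher and each Subscriber (placeholder)
    @threshold          = 80;                   -- percent of range used before a new one is assigned
```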

Since you do not know how many servers will be in the replication pool, you may need to periodically reconfigure article properties.

Rbjz · Answer 3 · 2011-10-03T10:08:49+0000

I dare to advise against replication in general :) it is certainly more pain than pleasure. If you can afford it, look at the Sync Framework.

Playing with identity values is inflexible, to say the least. Think about adding or moving servers, identity inserts, differing schemas, and so on.

A GUID is fine as the clustered key if you use NEWSEQUENTIALID() as the default value. It is a bit larger (a few bytes), but it solves the problem once and for all :)
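A minimal sketch of that setup; the table, constraint names, and Name column are made up for illustration:

```sql
CREATE TABLE dbo.X
(
    -- NEWSEQUENTIALID() can only be used in a DEFAULT constraint
    -- on a uniqueidentifier column.
    Id   uniqueidentifier NOT NULL
         CONSTRAINT DF_X_Id DEFAULT NEWSEQUENTIALID()
         CONSTRAINT PK_X PRIMARY KEY CLUSTERED,
    Name nvarchar(100) NOT NULL   -- hypothetical payload column
);

-- The key is generated by the default, so inserts omit it.
INSERT INTO dbo.X (Name) VALUES (N'first row');
```

Because the generated values increase on a given machine, new rows tend to land at the end of the clustered index rather than at random positions, which is the main objection to NEWID() as a clustered key.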

The way I would go, though, is to have the clustered key on an int identity that is meaningful only within the database. Then add a GUID column that is meaningful in the synchronization context. While you are at it, add a rowversion column to see which rows are ready to be synchronized.
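Roughly, that layout could look like the sketch below. All names are hypothetical, and the "last synced" value would come from wherever your sync process stores its bookmark.

```sql
CREATE TABLE dbo.X
(
    -- Local surrogate key: small, fast, only meaningful inside this database.
    Id      int IDENTITY(1, 1) NOT NULL
            CONSTRAINT PK_X PRIMARY KEY CLUSTERED,
    -- Global key: identifies the row across servers during synchronization.
    SyncId  uniqueidentifier NOT NULL
            CONSTRAINT DF_X_SyncId DEFAULT NEWID()
            CONSTRAINT UQ_X_SyncId UNIQUE NONCLUSTERED,
    Name    nvarchar(100) NOT NULL,   -- hypothetical payload column
    -- Changes automatically on every insert or update of the row.
    RowVer  rowversion NOT NULL
);

-- Rows changed since the last sync (bookmark value is an assumption).
DECLARE @LastSynced binary(8) = 0x0000000000000000;

SELECT Id, SyncId, Name
FROM dbo.X
WHERE RowVer > @LastSynced;
```

Foreign keys inside the database reference the 4-byte Id, so the 16-byte GUID is stored once per row instead of in every child table.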
