SQL Server - Guid vs. Long

Question


Up until now I have been using the C# "Guid = Guid.NewGuid();" method to generate a unique ID that can be stored as the ID field in some of my SQL Server database tables using Linq to SQL. I have been told that, for indexing reasons, using a GUID is a bad idea and that I should use an auto-incrementing Long instead. Will using a Long speed up my database transactions? If so, how do I go about generating unique IDs of type Long?

Hi,

+4

long-integer c # sql guid sql-server

Goober Jul 23 '09 at 11:41


7 answers

A long (bigint in SQL Server) is 8 bytes and a Guid is 16 bytes, so you halve the number of bytes SQL Server has to compare when doing a lookup.

To generate a long, use IDENTITY(1,1) when you create the field in the database.

So, using either CREATE TABLE or ALTER TABLE:

Field_NAME BIGINT NOT NULL PRIMARY KEY IDENTITY(1,1)

See the comments for notes on Linq to SQL.

+3

kemiller2002 Jul 23 '09 at 11:42


Take a look at this:

Is it better to use a unique identifier (GUID) or bigint for an identity column?

+3

Adriaan stander Jul 23 '09 at 11:50


The "Indexing Queen" - Kim Tripp - basically says it all in her blog posts:

Basically, her best practice is that an optimal clustering key should be:

unique
small
stable (never changing)
ever increasing

A GUID violates "small" and "ever increasing" and is therefore not optimal.

PLUS: your clustering key is added to every single entry in each and every non-clustered index (as the lookup used to actually find the record in the database), so you want it to be as small as possible (INT = 4 bytes vs. GUID = 16 bytes). If you have hundreds of millions of rows and several non-clustered indexes, choosing an INT or BIGINT over a GUID can make a major difference, even just in terms of space.

Marc
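To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The table size and index count are made up for illustration, and the calculation only counts the repeated key bytes, ignoring every other kind of index overhead.

```python
# Bytes per key type, as quoted in this thread.
KEY_BYTES = {"INT": 4, "BIGINT": 8, "GUID": 16}


def clustering_key_overhead(rows, nonclustered_indexes, key_type):
    """Bytes spent repeating the clustering key in non-clustered indexes.

    Every entry of every non-clustered index carries one copy of the key.
    """
    return rows * nonclustered_indexes * KEY_BYTES[key_type]


# Hypothetical table: 100 million rows, 3 non-clustered indexes.
rows, indexes = 100_000_000, 3
for key_type in KEY_BYTES:
    gb = clustering_key_overhead(rows, indexes, key_type) / 1e9
    print(f"{key_type:6} {gb:.1f} GB")
```

Under these assumptions the GUID spends 4.8 GB on key copies alone, against 1.2 GB for INT.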

+3

marc_s Jul 23 '09 at 13:52


You can debate GUID versus identity all day. I prefer to have the database generate the unique value with an identity. If you merge data from multiple databases, add another column (to identify the source database, possibly a tinyint or smallint) and form a composite primary key.

If you do go with an identity, be sure to pick the right data type based on the number of expected keys you will generate:

bigint - 8 Bytes - max positive value: 9,223,372,036,854,775,807
int - 4 Bytes - max positive value: 2,147,483,647

Note that "number of expected keys" is different from the number of rows. If you mainly add and keep rows, you may find that an INT is enough, with over 2 billion unique keys. I'll bet your table won't get that big. However, if you have a high-volume table where you keep adding and removing rows, your row count may be low, but you will go through keys fast. You should do some calculations to see how long it would take to go through the INT's 2 billion keys. If it won't use them up any time soon, go with INT; otherwise double the key size and go with BIGINT.
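The calculation in question is just a division. A small Python sketch, where the insert rate is a made-up figure you would replace with your own measurement:

```python
INT_MAX = 2_147_483_647
BIGINT_MAX = 9_223_372_036_854_775_807
SECONDS_PER_DAY = 86_400


def days_until_exhausted(max_key, keys_per_second):
    """Days until an identity starting at 1 runs out of positive values."""
    return max_key / keys_per_second / SECONDS_PER_DAY


rate = 100  # hypothetical sustained insert rate, in keys per second
print(f"INT:    {days_until_exhausted(INT_MAX, rate):,.0f} days")
print(f"BIGINT: {days_until_exhausted(BIGINT_MAX, rate) / 365:,.0f} years")
```

At a sustained 100 keys per second an INT lasts well under a year, while a BIGINT at the same rate lasts billions of years. Deleted rows do not give their keys back, so use the insert rate here, not the row count.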

+1

KM. Jul 23 '09 at 13:32


Use GUIDs when you need to consider import/export to multiple databases. GUIDs are often easier to use than columns with the IDENTITY attribute when working with a dataset that has multiple child relationships. This is because you can generate GUIDs in code while disconnected from the database, and then submit all the changes at once. When GUIDs are generated properly, they are extremely hard to duplicate by chance. With identity columns, you often have to do an initial insert of the parent row and query for its new identity before adding child data. You then have to update all child records with the new parent identity before committing them to the database. The same goes for grandchildren and so on down the hierarchy. It adds up to a lot of work that seems unnecessary and mundane. You can do something similar to GUIDs by coming up with random integers without the IDENTITY specification, but the chance of a collision increases dramatically as you insert more records over time. (Guid.NewGuid() is like a random Int128, which does not exist yet.)

I use Byte (TinyInt), Int16 (SmallInt), Int32/UInt16 (Int), Int64/UInt32 (BigInt) for small lookup lists that don't change, or for data that doesn't replicate between multiple databases. (Permissions, Application Configuration, Color Names, etc.)

I imagine an index takes just as long to query against regardless of whether you use a GUID or a long. Tables usually have other indexed fields that are larger than 128 bits anyway (user names in a user table, for example). The difference between GUIDs and integers is the size of the index in memory, as well as the time it takes to populate and rebuild indexes. The majority of database transactions are reads. Writes are minimal. Concentrate on optimizing reads from the database first, as slow reads usually come from joined tables that were not optimized properly, improper paging, or missing indexes.

As with anything, the best thing to do is to prove your point. Create a test database with two tables, one with an integer/long primary key and the other with a GUID. Populate each with N million rows. Monitor the performance of each during CRUD operations (create, read, update, delete). You may find that the GUID does have a performance hit, but an insignificant one.

Servers often run on boxes without debugging environments and other applications taking up CPU, memory, and hard drive I/O (especially with RAID). A development environment only gives you an idea of performance.
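A minimal sketch of the disconnected parent/child idea, in Python rather than C#. `uuid.uuid4()` plays roughly the role of `Guid.NewGuid()` here, and the `Order` and `OrderLine` types are hypothetical, invented only for this example.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Order:  # hypothetical parent row
    customer: str
    id: uuid.UUID = field(default_factory=uuid.uuid4)


@dataclass
class OrderLine:  # hypothetical child row
    order_id: uuid.UUID
    product: str
    id: uuid.UUID = field(default_factory=uuid.uuid4)


def build_order(customer, products):
    """Create a parent and its children with no database round trip."""
    order = Order(customer)
    # Children can point at the parent immediately, because its key
    # already exists before anything has been sent to the database.
    lines = [OrderLine(order.id, product) for product in products]
    return order, lines


order, lines = build_order("alice", ["widget", "gadget"])
```

With an identity column the children could not be given `order_id` until the parent had been inserted and its generated key read back.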
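A rough sketch of the shape of such a test, in Python. Big assumption: it uses the in-memory SQLite that ships with Python as a stand-in, so the timings say nothing about SQL Server; the table names are hypothetical and only the create step is covered. Treat it as a template for the real test against your own server.

```python
import sqlite3
import time
import uuid


def time_inserts(n=20_000):
    """Insert n rows into an integer-keyed and a GUID-keyed table.

    Returns (seconds per table, row count per table).
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE by_long (id INTEGER PRIMARY KEY, payload TEXT)")
    # WITHOUT ROWID stores the rows in primary key order.
    con.execute(
        "CREATE TABLE by_guid (id BLOB PRIMARY KEY, payload TEXT) WITHOUT ROWID"
    )

    datasets = {
        "by_long": [(i, "x") for i in range(1, n + 1)],
        "by_guid": [(uuid.uuid4().bytes, "x") for _ in range(n)],
    }
    timings = {}
    for table, rows in datasets.items():
        start = time.perf_counter()
        with con:  # one transaction per table
            con.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)
        timings[table] = time.perf_counter() - start

    counts = {
        table: con.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        for table in datasets
    }
    con.close()
    return timings, counts


timings, counts = time_inserts()
print(timings, counts)
```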

+1

Lewie Jul 24 '09 at 21:23


Consider creating a sequential GUID from a .NET application:

http://dotnet-snippets.de/dns/sequential-guid-SID998.aspx

What are the performance improvements of Sequential Guid over standard Guid?
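For readers who have not seen the idea, here is a minimal Python sketch of one common flavor, often called a "COMB" GUID. This is not the code from the linked snippet. It assumes, as far as I know correctly, that SQL Server compares uniqueidentifier values starting with the last six bytes, so a timestamp is placed there. The function name and layout are illustrative, and the result does not carry standard UUID version bits.

```python
import os
import time
import uuid


def comb_guid(now_ms=None):
    """Random GUID whose last 6 bytes hold a millisecond timestamp."""
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000
    random_part = os.urandom(10)  # 10 random bytes keep values unique
    time_part = (now_ms & 0xFFFFFFFFFFFF).to_bytes(6, "big")  # 48-bit clock
    return uuid.UUID(bytes=random_part + time_part)


first = comb_guid()
second = comb_guid()
print(first, second)
```

Values generated later end in a larger timestamp, so under the ordering assumption above new rows land at or near the end of a clustered index instead of at a random position.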

+1

Boris Modylevsky Jul 26 '09 at 15:44


David · Accepted Answer · 2009-07-23T12:07:14+0000

Both have pros and cons; it depends entirely on how you use them.

Right off the bat, if you need identifiers that can work across several databases, you need GUIDs. There are some tricks with Long (manually assigning each database a different seed/increment), but these don't scale well.

As far as indexing goes, Long will give you much better insert performance if the index is clustered (by default the primary key is clustered, but this can be changed for your table), since the table does not need to be reorganized after every insert.

As far as concurrent inserts are concerned, however, Long (identity) columns will be slower than GUIDs. Identity column generation requires a series of exclusive locks to ensure that only one row gets the next sequential number. In an environment where many users are inserting many rows all the time, this can be a performance hit. GUID generation is faster in this situation.

Storage-wise, a GUID takes up twice the space of a Long (16 bytes vs. 8). However, it depends on the overall size of your row whether 8 bytes will make a noticeable difference in how many records fit in one leaf, and thus in the number of leaves pulled from disk during an average request.
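The seed/increment trick can be sketched like this in Python. The generator only mimics how IDENTITY(seed, increment) hands out values, and the three databases are hypothetical.

```python
from itertools import islice


def identity(seed, increment):
    """Yield values the way IDENTITY(seed, increment) hands them out."""
    value = seed
    while True:
        yield value
        value += increment


def keys_for_database(db_number, total_databases, count):
    """First `count` keys for database `db_number` (1-based)."""
    return list(islice(identity(db_number, total_databases), count))


# Three hypothetical databases sharing one key space.
a = keys_for_database(1, 3, 4)  # [1, 4, 7, 10]
b = keys_for_database(2, 3, 4)  # [2, 5, 8, 11]
c = keys_for_database(3, 3, 4)  # [3, 6, 9, 12]
```

The sequences never overlap, but the increment bakes in the number of databases. Adding a fourth one means changing the increment everywhere, which is the scaling problem.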
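A toy Python simulation of why that is. A sorted list stands in for the clustered index (a big simplification of real pages and splits), and an insert counts as "mid-table" whenever the new key does not belong at the end.

```python
import bisect
import uuid


def mid_table_inserts(keys):
    """Count inserts that land before the current end of the sorted keys."""
    ordered, count = [], 0
    for key in keys:
        pos = bisect.bisect_left(ordered, key)
        if pos < len(ordered):
            count += 1  # a clustered index would have to make room here
        ordered.insert(pos, key)
    return count


n = 1_000
# Identity-style keys: each one is larger than all the keys before it.
sequential = mid_table_inserts(range(1, n + 1))
# Random GUIDs, compared here as raw bytes.
scattered = mid_table_inserts(uuid.uuid4().bytes for _ in range(n))
print(sequential, scattered)
```

Increasing keys always append, so `sequential` is 0, while nearly every random GUID lands somewhere in the middle.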
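A quick Python sketch of that trade-off. The usable page size is an assumption (8096 bytes of an 8 KB page), per-row overhead is ignored, and the two row widths are made up.

```python
PAGE_BYTES = 8096  # assumed usable bytes per 8 KB page; adjust for your server


def rows_per_page(other_columns_bytes, key_bytes, page_bytes=PAGE_BYTES):
    """Whole rows that fit in one page, ignoring per-row overhead."""
    return page_bytes // (other_columns_bytes + key_bytes)


for other in (50, 2000):  # hypothetical narrow and wide rows
    print(other, rows_per_page(other, 8), rows_per_page(other, 16))
```

With 50 bytes of other columns, the Long key fits 139 rows per page against 122 for the GUID. With 2000 bytes of other columns, both fit 4, so the key size makes no difference there.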
