Why do composite primary keys still exist?

Question

I am tasked with migrating a database to a mid-range ERP. The new system uses composite primary keys here and there, and from a pragmatic point of view: why?

Compared to auto-generated IDs, I can only see negative aspects:

Foreign keys become blurry.
Harder migration or database redesign.
Inflexible when the business changes. (My car has no reg. plate...)
The same integrity can be achieved with constraints.

This is a return to the candidate-key design concept, whose point I also fail to see.

Is it a habit or artifact from the floppy-disk days (minimizing space and indexes), or am I missing something?

// edit // I just found a good SO post: Composite primary keys versus unique object ID field //

+57

mysql sql-server database-design

Teson Mar 23 '11 at 13:35

9 answers

Composite keys are required when your primary keys are not surrogate and are inherently, um, composite, that is, they break down into several unrelated parts.

Some examples in the real world:

Many-to-many link tables, in which the primary key consists of the keys of the related entities.
Multi-tenant applications, where tenant_id is part of every entity's primary key and entities may only be linked within the same tenant (enforced by foreign key constraints).
Third-party applications (with primary keys already provided)

Note that logically, all of this can be achieved with a UNIQUE constraint (in addition to a surrogate PRIMARY KEY).

However, there are some implementation-specific things:

Some systems won't let a FOREIGN KEY refer to anything that is not a PRIMARY KEY.
Some systems will only cluster a table on its PRIMARY KEY, so making the composite the PRIMARY KEY will improve the performance of queries joining on the composite.
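
To make the logical equivalence concrete, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is an assumption here (the question is tagged mysql and sql-server), and the table and column names are invented for illustration. It builds the same many-to-many link table twice, once with a composite primary key and once with a surrogate key plus UNIQUE, and checks that both refuse a duplicate pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Variant A: composite primary key made of the two related keys.
    CREATE TABLE student_course_a (
        student_id INTEGER NOT NULL,
        course_id  INTEGER NOT NULL,
        PRIMARY KEY (student_id, course_id)
    );
    -- Variant B: surrogate primary key plus a UNIQUE constraint on the pair.
    CREATE TABLE student_course_b (
        id         INTEGER PRIMARY KEY,
        student_id INTEGER NOT NULL,
        course_id  INTEGER NOT NULL,
        UNIQUE (student_id, course_id)
    );
""")

def rejects_duplicate(table):
    """Insert the same (student, course) pair twice; True if the second insert fails."""
    sql = f"INSERT INTO {table} (student_id, course_id) VALUES (1, 10)"
    conn.execute(sql)
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

results = {t: rejects_duplicate(t) for t in ("student_course_a", "student_course_b")}
print(results)
```

Both variants reject the duplicate, so the choice between them comes down to the implementation-specific points listed above rather than to integrity.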

+56

Quassnoi Mar 23

A composite primary key gives better performance when it is used as a foreign key in other tables, because it reduces table reads; sometimes that can be a lifesaver. With surrogate keys, you have to go back to the referenced table to get the natural key information.

For example (purely an illustration, so let's not debate the database design here), say you have ORDER and ORDER_ITEM tables. If you use ProductId and LineNumber (UPDATE: and, as Pedro mentioned, OrderId or even better OrderNumber) as the composite primary key of ORDER_ITEM, then the cross-reference table for SHIPPING, SHIPPING_ORDERITEM, will contain ProductId. This can significantly improve performance if, for example, you have run out of a product and need to find all the items with that ProductId still to be shipped, without having to join.

With a surrogate key, on the other hand, you have to join, and you can end up with a very inefficient SQL execution plan that has to do bookmark lookups across several indexes.

See more on bookmark lookups, which become a serious problem with surrogate keys.
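
A rough sketch of the difference, again with Python's sqlite3 module as an assumed stand-in for the real database. The snake_case table names, the extra shipping_id column and the sample rows are all made up. It shows only that the composite-key layout answers the question from one table while the surrogate layout needs a join; it does not measure performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Composite-key variant: the child table carries the whole key, product_id included.
    CREATE TABLE order_item (
        order_id    INTEGER NOT NULL,
        product_id  INTEGER NOT NULL,
        line_number INTEGER NOT NULL,
        PRIMARY KEY (order_id, product_id, line_number)
    );
    CREATE TABLE shipping_orderitem (
        shipping_id INTEGER NOT NULL,
        order_id    INTEGER NOT NULL,
        product_id  INTEGER NOT NULL,
        line_number INTEGER NOT NULL,
        PRIMARY KEY (shipping_id, order_id, product_id, line_number),
        FOREIGN KEY (order_id, product_id, line_number)
            REFERENCES order_item (order_id, product_id, line_number)
    );
    -- Surrogate-key variant: the child table only knows an opaque item id.
    CREATE TABLE order_item_s (
        order_item_id INTEGER PRIMARY KEY,
        order_id      INTEGER NOT NULL,
        product_id    INTEGER NOT NULL,
        line_number   INTEGER NOT NULL
    );
    CREATE TABLE shipping_orderitem_s (
        shipping_id   INTEGER NOT NULL,
        order_item_id INTEGER NOT NULL REFERENCES order_item_s,
        PRIMARY KEY (shipping_id, order_item_id)
    );

    INSERT INTO order_item VALUES (1, 500, 1), (1, 501, 2), (2, 500, 1);
    INSERT INTO shipping_orderitem VALUES (90, 1, 500, 1), (90, 1, 501, 2), (91, 2, 500, 1);
    INSERT INTO order_item_s VALUES (1, 1, 500, 1), (2, 1, 501, 2), (3, 2, 500, 1);
    INSERT INTO shipping_orderitem_s VALUES (90, 1), (90, 2), (91, 3);
""")

# Composite key: filter the shipping table directly, no join needed.
direct = conn.execute(
    "SELECT shipping_id FROM shipping_orderitem WHERE product_id = ? ORDER BY shipping_id",
    (500,),
).fetchall()

# Surrogate key: the product is only reachable through a join.
joined = conn.execute(
    """SELECT s.shipping_id
       FROM shipping_orderitem_s AS s
       JOIN order_item_s AS i ON i.order_item_id = s.order_item_id
       WHERE i.product_id = ?
       ORDER BY s.shipping_id""",
    (500,),
).fetchall()

print(direct, joined)
```

Both queries return the same shipments; whether the join-free version is actually faster depends on the engine, the indexes and the data volume.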

+38

Aliostad Mar 23

Natural primary keys are fragile.

Suppose we built a system around a natural PK (CountryCode, PhoneNumber), and a few years later we need to add Extension, or change the PK to a single column: Email. If these PK columns have propagated into all the child tables, this becomes very expensive.

Some years ago there were systems built on the assumption that the social security number is a natural PK; they had to be redesigned to use generated identifiers when SSNs turned out to be non-unique and nullable.

Since we cannot predict the future, we do not know whether some later change will render obsolete a model that used to be perfectly correct and complete.

+8

AK Mar 23

The simplest answer is data integrity. If the data is to be useful and accurate, then keys are presumably necessary. Having an "auto-generated ID" does not remove the need for other keys. The alternative is to not enforce uniqueness and to accept that the data will contain duplicates, almost inevitably contain anomalies, and lead to errors as a result. Why would you want that?

+7

sqlvogel Mar 23

In short, the purpose of composite keys is to use a database to enforce one or more business rules. In other words: protect the integrity of your data.

Ex. You have a list of parts that you buy from suppliers. You could create a table of suppliers and parts, for example:

 SUPPLIER
   SupplierId
   SupplierName

 PART
   PartId
   PartName
   SupplierId

Uh oh. The parts table allows duplicate data. Since you used an auto-generated surrogate key, you are not enforcing the fact that a supplier's part should be entered only once. Instead, you should create the PART table as follows:

 PART
   SupplierId
   SupplierPartId
   PartName

In this example, your parts come from specific suppliers, and you want to enforce the rule "a given supplier may list a given part only once" in the PART table. Hence the composite key, which prevents the accidental duplication of a part.

You can always leave business rules out of your database and enforce them in your application, but by enforcing the rule in the database (with a composite key) you guarantee that the business rule is applied everywhere, especially if you ever decide to give multiple applications access to the data.
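
As a small illustration of the database doing the enforcing, the sketch below recreates the PART layout in SQLite through Python's sqlite3 module. The engine, the snake_case names and the sample supplier and part values are assumptions, not part of the original example. The same supplier part entered twice is rejected, while another supplier may reuse the same part number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (
        supplier_id   INTEGER PRIMARY KEY,
        supplier_name TEXT NOT NULL
    );
    CREATE TABLE part (
        supplier_id      INTEGER NOT NULL REFERENCES supplier (supplier_id),
        supplier_part_id TEXT NOT NULL,
        part_name        TEXT NOT NULL,
        PRIMARY KEY (supplier_id, supplier_part_id)
    );
    INSERT INTO supplier VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO part VALUES (1, 'BOLT-8', 'M8 bolt');
""")

def try_insert(supplier_id, supplier_part_id, part_name):
    """Return True if the row was accepted, False if the composite key rejected it."""
    try:
        conn.execute("INSERT INTO part VALUES (?, ?, ?)",
                     (supplier_id, supplier_part_id, part_name))
        return True
    except sqlite3.IntegrityError:
        return False

same_supplier_again = try_insert(1, "BOLT-8", "M8 bolt (retyped)")  # same key as the existing row
other_supplier = try_insert(2, "BOLT-8", "M8 bolt")                 # same part number, other supplier
part_count = conn.execute("SELECT COUNT(*) FROM part").fetchone()[0]
print(same_supplier_again, other_supplier, part_count)
```

Any application that writes to this table hits the same constraint, which is the point of putting the rule in the key.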

+6

John Mar 23

Short answer: multi-column foreign keys naturally refer to multi-column primary keys. There can still be an auto-generated ID column that is part of the primary key.

Philosophical answer: the primary key is the identity of the row. If some piece of information is an intrinsic part of the row's identity (for example, which customer an article belongs to, in a multi-customer wiki), that information should be part of the primary key.

Example: a system for organizing LAN parties

The system supports several LAN parties, with the same people and organizers taking part in them:

 CREATE TABLE users ( users_id serial PRIMARY KEY, ... );

And there are several parties:

 CREATE TABLE parties ( parties_id serial PRIMARY KEY, ... );

But most of the other stuff needs to carry the information about which party it is associated with:

 CREATE TABLE ticket_types (
     ticket_types_id serial,
     parties_id integer REFERENCES parties,
     name text,
     ....
     PRIMARY KEY(ticket_types_id, parties_id)
 );

... this is because we want to refer to primary keys. The foreign key in the attendances table points to the ticket_types table.

 CREATE TABLE attendances (
     attendances_id serial,
     parties_id integer REFERENCES parties,
     ticket_types_id integer,
     PRIMARY KEY (attendances_id, parties_id),
     -- composite foreign key: the ticket type must belong to the same party
     FOREIGN KEY (ticket_types_id, parties_id) REFERENCES ticket_types
 );
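
What the composite foreign key buys can be sketched with Python's sqlite3 module. SQLite is an assumption here: the answer's DDL uses the PostgreSQL-style serial type, which is replaced by plain integers below, and the sample rows are invented. With the party id inside the key, an attendance for one party cannot use a ticket type that belongs to another party.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
    CREATE TABLE parties (parties_id INTEGER PRIMARY KEY);
    CREATE TABLE ticket_types (
        ticket_types_id INTEGER NOT NULL,
        parties_id      INTEGER NOT NULL REFERENCES parties,
        name            TEXT,
        PRIMARY KEY (ticket_types_id, parties_id)
    );
    CREATE TABLE attendances (
        attendances_id  INTEGER NOT NULL,
        parties_id      INTEGER NOT NULL REFERENCES parties,
        ticket_types_id INTEGER NOT NULL,
        PRIMARY KEY (attendances_id, parties_id),
        FOREIGN KEY (ticket_types_id, parties_id)
            REFERENCES ticket_types (ticket_types_id, parties_id)
    );
    INSERT INTO parties VALUES (1), (2);
    INSERT INTO ticket_types VALUES (10, 1, 'weekend pass');  -- exists for party 1 only
""")

def can_attend(attendances_id, parties_id, ticket_types_id):
    """Return True if the attendance row is accepted by the foreign keys."""
    try:
        conn.execute("INSERT INTO attendances VALUES (?, ?, ?)",
                     (attendances_id, parties_id, ticket_types_id))
        return True
    except sqlite3.IntegrityError:
        return False

same_party = can_attend(1, 1, 10)   # ticket type 10 belongs to party 1
cross_party = can_attend(2, 2, 10)  # no ticket type 10 exists for party 2
print(same_party, cross_party)
```

With a single-column key on ticket_types, the second insert would be accepted and the mismatch would have to be caught by application code or a trigger.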

+4

jkj Mar 23

Just as functions encapsulate a set of instructions, or database views abstract base-table relationships, surrogate keys abstract the meaning of the entity they are placed on.

If, for example, you have a table of vehicle data with a surrogate key VehicleId, you have abstracted what it means to be a vehicle in terms of the data. When you specify VehicleId = 1, you are certainly talking about some vehicle, but do we know whether it is a 2008 Chevy Impala or a 1991 Ford F-150? No. Can the underlying data for vehicle number 1 change at any time? Yes.

+3

ses011 Mar 23 '11 at 19:57

While I prefer surrogate keys, I use composite keys in several cases. The composite key may consist wholly or partly of surrogate key fields.

Many-to-many join tables. A unique key on the key pair is generally required anyway. In some cases, additional columns may be included in the key.
Weak child tables. Things like order lines do not stand on their own. In this case, I use the primary key of the parent (orders) table as part of the composite key.

When several weak tables hang off an entity, it may be possible to drop a table from the set of joins when querying child data. With grandchild tables, you can join the grandparent to the grandchild without involving the table in the middle.
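
The grandparent-to-grandchild shortcut can be sketched as follows, using Python's sqlite3 module as an assumed engine. The line_deliveries table, its columns and the sample rows are hypothetical. Because the grandchild's composite key starts with order_id, it can be joined to orders directly, without touching order_lines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer TEXT NOT NULL
    );
    -- Weak child: its key includes the parent's key.
    CREATE TABLE order_lines (
        order_id INTEGER NOT NULL REFERENCES orders,
        line_no  INTEGER NOT NULL,
        PRIMARY KEY (order_id, line_no)
    );
    -- Grandchild: its key includes the whole key of the child.
    CREATE TABLE line_deliveries (
        order_id    INTEGER NOT NULL,
        line_no     INTEGER NOT NULL,
        delivery_no INTEGER NOT NULL,
        quantity    INTEGER NOT NULL,
        PRIMARY KEY (order_id, line_no, delivery_no),
        FOREIGN KEY (order_id, line_no) REFERENCES order_lines (order_id, line_no)
    );
    INSERT INTO orders VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO order_lines VALUES (1, 1), (1, 2), (2, 1);
    INSERT INTO line_deliveries VALUES (1, 1, 1, 5), (1, 2, 1, 3), (2, 1, 1, 7);
""")

# Delivered quantity per customer: orders joins straight to line_deliveries.
rows = conn.execute("""
    SELECT o.customer, SUM(d.quantity)
    FROM orders AS o
    JOIN line_deliveries AS d ON d.order_id = o.order_id
    GROUP BY o.customer
    ORDER BY o.customer
""").fetchall()
print(rows)
```

With surrogate keys on every table, line_deliveries would hold only an order-line id, and the same query would have to go through order_lines to reach orders.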

+2

BillThor Mar 23 '11 at 17:00

HLGEM · Accepted Answer · 2011-03-23 13:53

Personally, I prefer to use surrogate keys. However, for joining tables that consist only of the IDs from two other tables (to create many-to-many relationships), composite keys are the way to go, so taking them away would make things more difficult.

There is a school of thought that surrogate keys are always bad, and that if you cannot guarantee the uniqueness of a record through natural keys, you have a bad design. I strongly disagree (unless you store an SSN or some other unique value, I cannot come up with a natural key for a person table, for example). But many people consider it necessary for proper normalization.

Sometimes having a composite key reduces the need to join to another table. Sometimes it doesn't. So there are times when a composite key can improve performance, and times when it can hurt it. If the key is relatively stable, you may be fine, with faster select queries. However, if it is something that can change, like a company name, you may find yourself in a world of hurt when company A changes its name and you have to update a million related records.
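
The renaming problem can be reproduced on a small scale with Python's sqlite3 module. The engine, the table names and the row count of 1000 are assumptions for illustration. Here the company name is the natural key and the child rows reference it with ON UPDATE CASCADE, so one rename rewrites every related row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for the cascade to fire in SQLite
conn.executescript("""
    CREATE TABLE company (name TEXT PRIMARY KEY);
    CREATE TABLE invoice (
        invoice_id   INTEGER PRIMARY KEY,
        company_name TEXT NOT NULL REFERENCES company (name) ON UPDATE CASCADE
    );
    INSERT INTO company VALUES ('Company A');
""")
conn.executemany("INSERT INTO invoice (company_name) VALUES (?)",
                 [("Company A",)] * 1000)

# One logical change to the parent row...
conn.execute("UPDATE company SET name = 'Company B' WHERE name = 'Company A'")

# ...has to be propagated into every child row that carried the old key value.
rewritten = conn.execute(
    "SELECT COUNT(*) FROM invoice WHERE company_name = 'Company B'").fetchone()[0]
stale = conn.execute(
    "SELECT COUNT(*) FROM invoice WHERE company_name = 'Company A'").fetchone()[0]
print(rewritten, stale)
```

With a surrogate company id in the child table, the same rename would touch a single row in company and leave the invoices alone.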

There is no one-size-fits-all in database design. There are times when composite keys are helpful and times when they are horrible. There are times when surrogate keys are helpful and times when they are not.
