Is string or int preferable for foreign keys?

I have a user table with userid and username columns, and both of them are unique.

Between userid and username , what would be better to use as a foreign key and why?
My boss wants to use a string, is that normal?

+9
sql mysql foreign-keys ddl
source share
4 answers

It looks like you have both a surrogate key ( int userId ) and a natural key ( char or varchar username ). Any column can be used as the primary key for the table, and in any case, you can still ensure the uniqueness of another key.

There are many discussions about the trade-offs between natural and surrogate keys - you will need to decide what works for you and what is the β€œstandard” in your organization.

Here are some considerations when choosing a method:

Case for using surrogate keys (e.g. UserId INT AUTO_INCREMENT)

If you use a surrogate (for example, UserId INT AUTO_INCREMENT ) as the primary key, then all tables that reference the MyUsers table should then use UserId as a foreign key.

However, you can still ensure that the user's username column is unique by using an additional unique index , for example:

 CREATE TABLE 'MyUsers' ( 'userId' int NOT NULL AUTO_INCREMENT, 'username' varchar(100) NOT NULL, ... other columns PRIMARY KEY('userId'), UNIQUE KEY UQ_UserName ('username') 

According to @Dagon, using a narrow primary key (like int ) has performance and storage advantages over using a wider (and variable length) value like varchar . This advantage also affects other tables that reference MyUsers , as the foreign key for userid will narrow.

Another advantage of the surrogate integer key is that the username can be easily changed without affecting the tables that reference MyUsers . If username used as a natural key, then the tables were associated with MyUsers using username , which makes it more inconvenient to change the username (since otherwise the foreign key relationship would be violated). If you want to update user names for tables using username as a foreign key, you will need to use a method similar to ON UPDATE CASCADE to maintain data integrity.

Natural Keys use case (i.e. username)

On the other hand, to use surrogate keys for other tables that reference MyUsers through a surrogate key, you always need to join with the MyUsers table to get the username. One of the potential benefits of natural keys is that if a query requires only the " Username column from the table MyUsers to" MyUsers , it does not need to join MyUsers to get the username, which saves some overhead.

Further references to natural versus surrogate debate and compromise here and here

+23
source share

int will index faster, it may or may not be a problem, it's hard to say based on what you provided

+3
source share

The int value is 4 bytes, a string can contain as many bytes as you need. Because of this, int will always work better. Unless, of course, if you stick to usernames less than 4 characters long :)

In addition, you should never use a column as PK / FK if the data inside the column itself can change. Users usually change their usernames, and even if this function does not exist in your application right now, it will be in a few years. When that day comes, you may have 1000 tables that reference this user table, and then you will have to update all 1000 tables in the transaction, and this is just bad.

+3
source share

It depends on the foreign key: if your company controls it, I recommend using Int if there is an ID field for it. However, sometimes the identifier field is not in the table, since another key makes sense as an alternative unique key. Thus, the ID field may be a surrogate key in this case.

Rule of thumb. The foreign key data type must match the primary key data type.

There is an exception: what about foreign keys that do not belong to your company? What about foreign keys to databases and APIs that you cannot control? These identifiers should always be IMO strings.

To convince you, I ask the following questions:

Do you do math? Are you increasing it? Do you have control over this? The APIs are notorious for changes, even data types MAY be changed in someone else's database ... so how much will this bother you when the int ID becomes hex?

0
source share

All Articles