Naming identifier columns in database tables

I was interested in hearing people's opinions on naming identifier columns in database tables.

If I have a table called Invoices with an identity column as its primary key, I would name that column InvoiceID so that it does not conflict with columns in other tables and it is obvious what it is.

Where I currently work, every ID column is simply called ID.

So they would write the following:

Select i.ID , il.ID From Invoices i Left Join InvoiceLines il on i.ID = il.InvoiceID 

Now I see a few problems here:
1. You need to alias the columns in the select list.
2. ID = InvoiceID does not fit in my brain.
3. If you did not alias the tables and just referred to InvoiceID, is it obvious which table it is in?
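
A minimal sketch of problem 1, assuming Python's built-in sqlite3 module and a made-up two-table schema (the sample rows are invented for illustration). It only shows that the unaliased select returns two result columns with the same name, while aliasing tells them apart:

```python
import sqlite3

# Hypothetical schema: both tables name their key column plainly "ID".
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Invoices (ID INTEGER PRIMARY KEY);
    CREATE TABLE InvoiceLines (ID INTEGER PRIMARY KEY, InvoiceID INTEGER);
    INSERT INTO Invoices (ID) VALUES (1);
    INSERT INTO InvoiceLines (ID, InvoiceID) VALUES (10, 1);
""")

# Without column aliases, both result columns carry the same name.
cur = conn.execute(
    "SELECT i.ID, il.ID FROM Invoices i "
    "LEFT JOIN InvoiceLines il ON i.ID = il.InvoiceID"
)
unaliased_names = [d[0] for d in cur.description]

# With aliases, the result columns can be told apart by name.
cur = conn.execute(
    "SELECT i.ID AS InvoiceID, il.ID AS InvoiceLineID FROM Invoices i "
    "LEFT JOIN InvoiceLines il ON i.ID = il.InvoiceID"
)
aliased_names = [d[0] for d in cur.description]
```

Any client code that reads columns by name would have to rely on the aliased form.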

What do other people think on this subject?

+73
sql naming-conventions
Oct 16 '08 at 13:34
24 answers

ID is a SQL antipattern. See http://www.amazon.com/s/ref=nb_sb_ss_i_1_5?url=search-alias%3Dstripbooks&field-keywords=sql+antipatterns&sprefix=sql+a

If you have many tables with ID as the id column, you are making reporting that much more difficult. It obscures meaning, makes complex queries harder to read, and forces you to use aliases to tell the columns apart in the report itself.

Also, if someone is foolish enough to use a natural join in a database where it is available, you will join to the wrong records.

If you would like to use the USING syntax that some databases allow, you cannot if you use ID.
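
To illustrate the natural join point, here is a rough sketch using Python's built-in sqlite3 module (SQLite happens to support NATURAL JOIN). The tables and rows are invented for the example. Because the only column name the two tables share is id, the natural join pairs rows on id rather than on the foreign key:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE invoice_lines (id INTEGER PRIMARY KEY, invoice_id INTEGER, item TEXT);
    INSERT INTO invoices VALUES (1, 'Acme'), (2, 'Globex');
    -- Line 1 belongs to invoice 2, line 2 belongs to invoice 1.
    INSERT INTO invoice_lines VALUES (1, 2, 'widget'), (2, 1, 'gadget');
""")

# NATURAL JOIN matches on every column name the tables share: here only "id".
natural_rows = conn.execute(
    "SELECT customer, item FROM invoices NATURAL JOIN invoice_lines ORDER BY customer"
).fetchall()

# The intended join follows the foreign key instead.
intended_rows = conn.execute(
    "SELECT customer, item FROM invoices i "
    "JOIN invoice_lines il ON il.invoice_id = i.id ORDER BY customer"
).fetchall()
```

Both queries run without complaint; only the second one pairs each line with its own invoice.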

If you use ID, you can easily end up with a mistaken join if you copy the join syntax (don’t tell me that no one ever does this!) and forget to change the alias in the join condition.

So now you have

 select t1.field1, t2.field2, t3.field3 from table1 t1 join table2 t2 on t1.id = t2.table1id join table3 t3 on t1.id = t3.table2id 

when you meant

 select t1.field1, t2.field2, t3.field3 from table1 t1 join table2 t2 on t1.id = t2.table1id join table3 t3 on t2.id = t3.table2id 

If you use tablenameID as the id field, such a random error is much less likely and much easier to find.
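
A small sketch of the two queries above, assuming Python's built-in sqlite3 module and invented sample rows. The point is only that the copy-pasted version executes without any error and quietly returns a different result:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, field1 TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, table1id INTEGER, field2 TEXT);
    CREATE TABLE table3 (id INTEGER PRIMARY KEY, table2id INTEGER, field3 TEXT);
    INSERT INTO table1 VALUES (1, 'a');
    INSERT INTO table2 VALUES (5, 1, 'b');
    INSERT INTO table3 VALUES (9, 5, 'c');
""")

# Shared first part of both queries.
SELECT = ("SELECT t1.field1, t2.field2, t3.field3 FROM table1 t1 "
          "JOIN table2 t2 ON t1.id = t2.table1id ")

# Copy-pasted join condition with the alias left unchanged (t1 instead of t2).
wrong_rows = conn.execute(SELECT + "JOIN table3 t3 ON t1.id = t3.table2id").fetchall()

# The intended join condition.
right_rows = conn.execute(SELECT + "JOIN table3 t3 ON t2.id = t3.table2id").fetchall()
```

With table-qualified key names the faulty condition would read something like t1.table1id = t3.table2id, where the mismatch is visible in the text itself.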

+12
Sep 21 '11 at 17:41

I have always preferred ID to TableName + ID for the id column, and then TableName + ID for a foreign key. That way all tables have the same name for the id field and there is no redundant description. This seems simpler to me.

As for joining tables and not knowing which ID field belongs to which table, in my opinion the query should be written to handle this. Where I work, we always prefix the fields we use in a statement with the table name or table alias.

+113
Oct 16 '08 at 13:46

There has been a fight about this very thing at my company recently. The advent of LINQ has made the redundant tablename + ID pattern look even sillier in my eyes. I think most reasonable people will agree that if you hand-write your SQL in such a way that you must specify table names to differentiate FKs, then using just ID not only saves typing but also adds clarity to your SQL, because you can clearly see which is the PK and which is the FK.

E.g. LEFT JOIN Customer ON Employee.ID = Customer.EmployeeID

tells me not only that the two are linked, but also which is the PK and which is the FK,

whereas in the old style you are forced to either look or hope that they were named well.

+37
Oct. 16 '08 at 15:30

We use InvoiceID, not ID. This makes queries more readable: when you see only ID, it could mean anything, especially when you alias the table to i.

+28
Oct 16 '08 at 13:36

I agree with Keven and several other people here that the PK for a table should simply be Id, and foreign keys should be named OtherTable + Id.

However, I want to add one reason that has recently given this argument more weight.

In my current position, we are using Entity Framework with POCO generation. Using the standard naming convention of Id for the PK allows a base POCO class, with validation and the like, to be inherited by tables that share a set of common column names. Using Tablename + Id as the PK for each of these tables destroys the ability to use a base class for them.
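
As a loose analogue of that idea (written in Python rather than C#, so this is not Entity Framework code), the sketch below uses dataclasses. EntityBase, Invoice, Customer and the validation rules are all hypothetical names made up for illustration:

```python
from dataclasses import dataclass


@dataclass
class EntityBase:
    """Hypothetical base class for tables that share the columns id and name."""
    id: int
    name: str

    def validate(self):
        """Return a list of validation errors shared by every entity."""
        errors = []
        if self.id <= 0:
            errors.append("id must be positive")
        if not self.name.strip():
            errors.append("name must not be empty")
        return errors


@dataclass
class Invoice(EntityBase):
    total: float = 0.0


@dataclass
class Customer(EntityBase):
    email: str = ""


# Both entity classes inherit the same validation because the key is always "id".
invoice_errors = Invoice(id=1, name="INV-1", total=9.5).validate()
customer_errors = Customer(id=0, name=" ").validate()
```

If the key were named invoice_id in one class and customer_id in the other, the shared validate method could not refer to it by a single attribute name.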

Just food for thought.

+11
Jun 07 2018-12-12T00:

This is not really important; you are likely to run into similar problems with any naming convention.

But it’s important to be consistent, so you don’t need to look at table definitions every time you write a query.

+10
Oct 16 '08 at 13:40

My preference is also ID for the primary key and TableNameID for the foreign key. I also like to have a column "name" in most tables, where I store a user-readable identifier (i.e. name :-)) of the record. This structure offers great flexibility in the application itself: I can handle tables en masse, in the same way. This is a very powerful thing. Usually OO software is built on top of the database, but the OO toolset cannot be applied because the db itself does not allow it. Having the columns id and name is still not very good, but it is a step.
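
One way to picture "handling tables en masse" is the sketch below, which assumes Python's built-in sqlite3 module. The tables, rows, the KNOWN_TABLES whitelist and the lookup_names helper are all hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO customers VALUES (1, 'Acme');
    INSERT INTO products VALUES (1, 'Widget'), (2, 'Gadget');
""")

# Table names cannot be bound as query parameters, so check them against a whitelist.
KNOWN_TABLES = {"customers", "products"}


def lookup_names(conn, table):
    """Return {id: name} for any table that follows the id/name convention."""
    if table not in KNOWN_TABLES:
        raise ValueError(f"unknown table: {table}")
    return dict(conn.execute(f"SELECT id, name FROM {table} ORDER BY id"))


product_names = lookup_names(conn, "products")
customer_names = lookup_names(conn, "customers")
```

The same function serves every table only because each one exposes columns literally called id and name.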

Select i.ID , il.ID From Invoices i Left Join InvoiceLines il on i.ID = il.InvoiceID

Why can't I do this?

 Select Invoices.ID , InvoiceLines.ID From Invoices Left Join InvoiceLines on Invoices.ID = InvoiceLines.InvoiceID 

In my opinion, this is very readable and simple. Naming aliases i and il is a poor choice in general.

+9
Sep 21 '11 at 15:34

I just started working in a place that uses only "ID" (in the main tables, referenced by TableNameID in foreign keys), and have already found two production problems directly caused by it.

In one case, the query used "... where ID in (SELECT ID FROM OtherTable ..." instead of "... where ID in (SELECT TransID FROM OtherTable ...".

Can anyone honestly say that this would not have been much easier to spot if full, consistent names had been used, where the incorrect statement would read "... where TransID in (SELECT OtherTableID FROM OtherTable ..."? I don’t think so.
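
A hedged reconstruction of that first bug, using Python's built-in sqlite3 module. The table name Trans and all the rows are assumptions, since the answer only quotes query fragments. Both statements are valid SQL, so nothing fails; they just return different rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Trans (ID INTEGER PRIMARY KEY, Amount INTEGER);
    CREATE TABLE OtherTable (ID INTEGER PRIMARY KEY, TransID INTEGER);
    INSERT INTO Trans VALUES (1, 100), (2, 200), (3, 300);
    -- OtherTable rows 1 and 2 both point at transaction 3.
    INSERT INTO OtherTable VALUES (1, 3), (2, 3);
""")

# Buggy: compares transaction IDs against OtherTable's own IDs.
buggy = conn.execute(
    "SELECT ID FROM Trans WHERE ID IN (SELECT ID FROM OtherTable) ORDER BY ID"
).fetchall()

# Intended: compares transaction IDs against the foreign key column.
intended = conn.execute(
    "SELECT ID FROM Trans WHERE ID IN (SELECT TransID FROM OtherTable) ORDER BY ID"
).fetchall()
```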

Another problem arises when refactoring code. If you switch to a temp table where the query previously ran against a main table, the old code reads "... dbo.MyFunction (t.ID) ...", and if that is not changed while "t" now refers to the temp table instead of the main table, you do not even get an error, just erroneous results.

If generating unnecessary errors is the goal (maybe some people don’t have enough work?), then this kind of naming convention is great. Otherwise, consistent naming is the way to go.

+7
Nov 16 '10 at 22:00

For the sake of simplicity, most people name the table's identifier column ID. If another table holds a foreign key reference to it, they explicitly call that column InvoiceID (to use your example). In joins you are aliasing the table anyway, so the explicit inv.ID is still simpler than inv.InvoiceID.

+6
Oct. 16 '08 at 13:38

Coming at this from the perspective of a formal data dictionary, I would name the data element invoice_ID. Generally, a data element name will be unique in the data dictionary and ideally will have the same name throughout, though sometimes additional qualifying terms may be required based on context, e.g. a data element named employee_ID could be used twice in the org chart and therefore be qualified as supervisor_employee_ID and subordinate_employee_ID respectively.

Obviously, naming conventions are subjective and a matter of style. I find the ISO/IEC 11179 guidelines to be a useful starting point.

For the DBMS, I see tables as collections of entities (except those that only ever contain one row, e.g. a config table, a table of constants, etc.), e.g. the table where my employee_ID is the key would be named Personnel. So straight away the TableNameID convention does not work for me.

I have seen the TableName.ID=PK TableNameID=FK style used on large data models and have to say I find it slightly confusing: I much prefer an identifier's name to be the same throughout, i.e. not to change name based on which table it happens to appear in. Something to note: the aforementioned style seems to be used in shops that add an IDENTITY (auto-increment) column to every table while shunning natural and compound keys in foreign keys. Those shops tend not to have formal data dictionaries, nor do they build from data models. Again, this is merely a question of style and one to which I do not personally subscribe. So ultimately, it is not for me.

All that said, I can see a case for sometimes dropping the qualifier from the column name when the table name provides the context for doing so, e.g. the element named employee_last_name may become simply last_name in the Personnel table. The rationale here is that the domain is "people's last names" and is more likely to be UNIONed with last_name columns from other tables than to be used as a foreign key in another table, but then again... I might just change my mind; sometimes you can never tell. That is the thing: data modelling is part art, part science.

+4
Oct 16 '08 at 19:32

I personally prefer (as has been stated above) Table.ID for the PK and TableID for the FK. Even (please don't shoot me) Microsoft Access recommends this.

HOWEVER, I also know that some generating tools favor TableID for the PK, because they tend to link every column name that contains 'ID' in the word, INCLUDING ID!!!

Even the query designer in Microsoft SQL Server does this (and for each query you create, you end up removing all the unnecessary auto-created relationships on all tables on column ID).

THUS, as much as my internal OCD hates it, I roll with the TableID convention. Remember that it is called a Data BASE, as it will be the base for hopefully many, many applications to come. And all technologies should benefit from a well-normalized schema with clear descriptions.

It goes without saying that I do draw the line when people start using TableName, TableDescription and such. In my opinion, conventions should do the following:

  • Table Name: Pluralized. Ex. Employees
  • Table alias: The full name of the table, singular. Example:

     SELECT Employee.*, eMail.Address FROM Employees AS Employee LEFT JOIN eMails as eMail on Employee.eMailID = eMail.eMailID -- I would sure like it to just have the eMail.ID here.... but oh well 

[Update]

In addition, there are several valid posts in this thread about duplicated columns due to the "kind of relationship" or role. For example, if a Store has an EmployeeID, that tells me squat. So sometimes I do something like Store.EmployeeID_Manager. Sure, it is a bit longer, but at least people will not go crazy trying to find a table named ManagerID, or wondering what EmployeeID is doing there. When querying, I would simplify it as: SELECT EmployeeID_Manager as ManagerID FROM Store
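
A short sketch of the role-suffixed foreign key, assuming Python's built-in sqlite3 module. The Employee table layout, the StoreID column and the sample rows are guesses made for the example; only Store.EmployeeID_Manager and the ManagerID alias come from the answer:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (EmployeeID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Store (
        StoreID INTEGER PRIMARY KEY,
        -- The suffix records the role this employee plays for the store.
        EmployeeID_Manager INTEGER REFERENCES Employee (EmployeeID)
    );
    INSERT INTO Employee VALUES (7, 'Dana');
    INSERT INTO Store VALUES (1, 7);
""")

# The column keeps its descriptive name in the schema and a short alias in the result.
cur = conn.execute(
    "SELECT Store.StoreID, Store.EmployeeID_Manager AS ManagerID, Employee.Name "
    "FROM Store JOIN Employee ON Employee.EmployeeID = Store.EmployeeID_Manager"
)
columns = [d[0] for d in cur.description]
rows = cur.fetchall()
```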

+4
Aug 14 '13 at 21:45

I think you can use anything for the "ID" as long as you are consistent. Including the table name is important. I would suggest using a modeling tool such as Erwin to enforce the naming conventions and standards, so that when writing queries it is easy to understand the relationships that may exist between tables.

What I mean by the first statement is that instead of ID you can use something else, such as "recno". So this table would have a PK of invoice_recno, and so on.

Cheers, Ben

+2
Oct. 16 '08 at 15:12

My vote is for InvoiceID for the table ID. I also use the same naming convention when it is used as a foreign key, and use intelligent alias names in the queries.

  Select Invoice.InvoiceID, Lines.InvoiceLine, Customer.OrgName From Invoices Invoice Join InvoiceLines Lines on Lines.InvoiceID = Invoice.InvoiceID Join Customers Customer on Customer.CustomerID = Invoice.CustomerID 

Sure, it is longer than some other examples. But smile. This is for posterity, and someday some poor junior coder is going to have to alter your masterpiece. There is no ambiguity in this example, and as additional tables get added to the query, you will be grateful for the verbosity.

+2
Oct. 16 '08 at 15:32

For the column name in the database, I use "InvoiceID".

If I copy fields to an unnamed structure via LINQ, I can name it “ID” if this is the only identifier in the structure.

If the column will NOT be used in a foreign key, so that it is only used to uniquely identify a row for editing or deleting, I will name it "PK".

+1
Oct. 16 '08 at 13:43

If you give each key a unique name, e.g. "invoices.invoice_id" instead of "invoices.id", then you can use the "natural join" and "using" operators with no worries. E.g.

 SELECT * FROM invoices NATURAL JOIN invoice_lines
 SELECT * FROM invoices JOIN invoice_lines USING (invoice_id)

instead of

 SELECT * from invoices JOIN invoice_lines ON invoices.id = invoice_lines.invoice_id 

SQL is verbose enough without making it more verbose.
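
For anyone who wants to try it, here is a minimal sketch with Python's built-in sqlite3 module (SQLite supports both NATURAL JOIN and USING). The column invoice_line_id and the sample rows are assumptions, not part of the answer. All three forms return the same rows as long as invoice_id is the only column name the two tables share:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices (invoice_id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE invoice_lines (
        invoice_line_id INTEGER PRIMARY KEY,
        invoice_id INTEGER,
        item TEXT
    );
    INSERT INTO invoices VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO invoice_lines VALUES (1, 2, 'widget'), (2, 1, 'gadget');
""")

# The only shared column name is invoice_id, so NATURAL JOIN matches on it.
natural_rows = conn.execute(
    "SELECT customer, item FROM invoices NATURAL JOIN invoice_lines ORDER BY customer"
).fetchall()

# USING names the shared join column explicitly.
using_rows = conn.execute(
    "SELECT customer, item FROM invoices "
    "JOIN invoice_lines USING (invoice_id) ORDER BY customer"
).fetchall()

# The fully spelled-out ON form, for comparison.
on_rows = conn.execute(
    "SELECT customer, item FROM invoices "
    "JOIN invoice_lines ON invoices.invoice_id = invoice_lines.invoice_id "
    "ORDER BY customer"
).fetchall()
```

Note that NATURAL JOIN would silently change meaning if a second shared column name (say, a name column) were added to both tables later.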

+1
Oct 16 '08 at 14:33

What I do to keep things consistent for myself (where a table has a single-column primary key used as the ID) is to name the primary key of the table Table_pk. Anywhere I have a foreign key pointing to that table's primary key, I call the column PrimaryKeyTable_fk. That way I know that if I have a Customer_pk in my Customer table and a Customer_fk in my Order table, the Order table is referring to an entry in the Customer table.

To me, this makes sense especially for joins, where I think it reads easier.

 SELECT * FROM Customer AS c INNER JOIN Order AS o ON c.Customer_pk = o.Customer_fk 
+1
Oct 17 '08 at 18:49

FWIW, our new standard (which changes, uh, I mean "evolves", with every new project) is:

  • Lower case database field names
  • Upper case table names
  • Use underscores to separate words in the field name; convert these to Pascal case in code
  • pk_ prefix means primary key
  • _id suffix means an integer, auto-increment ID
  • fk_ prefix means foreign key (no suffix necessary)
  • _VW suffix for views
  • is_ prefix for booleans

So, a table named NAMES might have the fields pk_name_id, first_name, last_name, is_alive, and fk_company, and a view called LIVING_CUSTOMERS_VW, defined as:

 SELECT first_name, last_name
 FROM CONTACT.NAMES
 WHERE (is_alive = 'True')
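
A rough sketch of this convention in practice, assuming Python's built-in sqlite3 module. The CONTACT schema qualifier is dropped because an in-memory SQLite database has no such schema, the sample rows are invented, and to_pascal_case is a hypothetical helper (the answer does not say how the conversion is done or whether prefixes are kept):

```python
import sqlite3


def to_pascal_case(field_name):
    """Convert an underscore-separated field name to Pascal case."""
    return "".join(word.capitalize() for word in field_name.split("_") if word)


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE NAMES (
        pk_name_id INTEGER PRIMARY KEY AUTOINCREMENT,
        first_name TEXT,
        last_name TEXT,
        is_alive TEXT,
        fk_company INTEGER
    );
    CREATE VIEW LIVING_CUSTOMERS_VW AS
        SELECT first_name, last_name FROM NAMES WHERE is_alive = 'True';
    INSERT INTO NAMES (first_name, last_name, is_alive, fk_company)
        VALUES ('Ann', 'Lee', 'True', 1), ('Bob', 'Ray', 'False', 1);
""")

# Read the view and derive the code-side names from the database field names.
cur = conn.execute("SELECT * FROM LIVING_CUSTOMERS_VW")
code_names = [to_pascal_case(d[0]) for d in cur.description]
living = cur.fetchall()
```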

As others have said, though, just about any scheme will work as long as it is consistent and does not unnecessarily obfuscate your meaning.

+1
Oct 17 '08 at 19:16

I hate the plain ID name. I strongly prefer to always use invoice_id or a variant thereof. I always know which table is the authoritative table for the ID when I need to, but this confuses me:

 SELECT * from Invoice inv, InvoiceLine inv_l where inv_l.InvoiceID = inv.ID
 SELECT * from Invoice inv, InvoiceLine inv_l where inv_l.ID = inv.InvoiceLineID
 SELECT * from Invoice inv, InvoiceLine inv_l where inv_l.ID = inv.InvoiceID
 SELECT * from Invoice inv, InvoiceLine inv_l where inv_l.InvoiceLineID = inv.ID

What is worst of all is the mix you mention, which is totally confusing. I had to work with a database where it was almost always foo_id, except for one of the most used IDs. That was total hell.

0
Oct. 16 '08 at 13:38

I definitely agree to include the table name in the identifier name field, precisely for the reasons you give. This is usually the only field where I would include the table name.

0
Oct. 16 '08 at 13:39

I prefer DomainName || 'ID' (i.e. DomainName + ID).

DomainName is often, but not always, the same as TableName.

The problem with ID all by itself is that it does not scale upwards. Once you have about 200 tables, each with a first column named ID, the data begins to look all alike. If you always qualify ID with the table name, that helps a little, but not that much.

DomainName and ID can be used to name foreign keys as well as primary keys. When foreign keys are named after the column that they reference, that can be of mnemonic assistance. Formally, tying the name of a foreign key to the key it references is not necessary, since the referential integrity constraint will establish the reference. But it is awfully handy when it comes to reading queries and updates.

Occasionally, DomainName || 'ID' cannot be used because there would be two columns in the same table with the same name. Example: Employees.EmployeeID and Employees.SupervisorID. In those cases, I use RoleName || 'ID', as in the example.
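
The Employees example can be sketched as a self-referencing table, here with Python's built-in sqlite3 module. The Name column and the two sample rows are assumptions added for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (
        EmployeeID INTEGER PRIMARY KEY,
        -- Named after the role, because EmployeeID is already taken in this table.
        SupervisorID INTEGER REFERENCES Employees (EmployeeID),
        Name TEXT
    );
    INSERT INTO Employees VALUES (1, NULL, 'Pat'), (2, 1, 'Sam');
""")

# Self-join: each employee paired with their supervisor.
pairs = conn.execute(
    "SELECT e.Name, s.Name FROM Employees e "
    "JOIN Employees s ON s.EmployeeID = e.SupervisorID"
).fetchall()
```

The role name makes the join condition read naturally even though both sides come from the same table.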

Last, but not least, I use natural keys instead of synthetic keys. There are situations when natural keys are unavailable or unreliable, but there are many situations where a natural key is the right choice. In those cases, I let the natural key take the name that it naturally would have. This name often does not even have the letters "ID" in it. Example: OrderNo, where No is the abbreviation for "Number".

0
Oct 17 '08 at 10:57

For each table, I choose a three-letter shorthand (e.g. Employees => Emp)

Thus, the autonumber numeric primary key becomes nkEmp .

It is short, unique in the entire database, and I definitely know its properties at a glance.

I keep the same names in SQL and all the languages that I use (mostly C#, Javascript, VB6).

0
Oct 17 '08 at 13:58

See the Interakt naming convention site for a well-thought-out system for naming tables and columns. The method uses a suffix for each table (_prd for the product table or _ctg for the category table) and appends it to every column in that table. Thus, the identifier column for the product table would be id_prd and is therefore unique in the database.

They go one step further to help with understanding foreign keys: the foreign key in the product table that refers to the category table would be idctg_prd, so it is obvious which table it belongs to (_prd suffix) and which table it refers to (category).


0
17 . '08 14:36

. /

0
14 . '11 0:53

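  • Each table gets a short (3-4 character) abbreviation, e.g. Invoice - inv, InvoiceLines - invl
  • Columns are prefixed with that abbreviation, e.g. inv_id, invl_id
  • Foreign keys combine both prefixes, e.g. invl_inv_id.

For example,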

 SELECT * FROM Invoice LEFT JOIN InvoiceLines ON inv_id = invl_inv_id 
-2
16 . '08 13:39


