Database design: efficiency of a one-to-many relationship limited to 1-3

This is in MySQL, but the question is really about database design. If you have a one-to-many relationship, such as a bank customer to bank accounts, you would usually have a table that records the information about each bank account, with a foreign key that tracks the relationship between the account and the customer. This follows third normal form and is the generally accepted way of doing it.

Now let's say that you are going to limit the user to only three accounts. The current database implementation would support this, and nothing would have to change. But another way to do it would be to have 3 columns in the customer table that hold the ids of the 3 corresponding accounts. By the way, this violates the 1st normal form of db design.

The question is: what would be the advantages and disadvantages of modeling the customer-account relationship this way compared with the traditional one?

Update

Unfortunately, I am not responsible for the db design. When I saw a similar relationship implemented in our db, I asked my boss, the db designer, why he decided to do it that way. I did not really get a satisfying answer, or one that, as far as I could tell, had logical reasons behind it. "This is a very common relationship when you work with databases, and that's exactly how you do it." I asked for more clarification... It got me nowhere and made him defensive.

Thank you very much for the answers. I could not find a single book that even talked about doing something like this. I found that many books tell you how to do it right, but not many give an example of doing it wrong and then explain why it sucked.

+4
8 answers

The biggest problem is that your queries become more complex. Let's say you want to find all accounts with a balance of more than $10,000, along with their owners. In a normalized database, it would be something like:

select firstname, lastname, accountnumber, balance
from account
join customeraccount using (accountnumber)
join customer using (customernumber)
where balance > 10000

But with three account number columns, it becomes

select firstname, lastname, accountnumber, balance
from account
join customer on customer.accountnumber1 = account.accountnumber
  or customer.accountnumber2 = account.accountnumber
  or customer.accountnumber3 = account.accountnumber
where balance > 10000

Every query that joins account to customer now becomes more complex in the same way.

Sooner or later, someone will write a query that fails to check accountnumber3, or will write the three tests by copy and paste and, after pasting accountnumber1 twice, forget to change one of them. This is not a mistake that jumps out when you read the query. If you botch one of the three comparisons, the program will work for all customers who have only two accounts but fail for customers who have three. This is the kind of problem that can easily slip through testing.

Now you need to think about how joins work when the same customer has multiple accounts. If a customer has two qualifying accounts in some query, do you want him to show up once or twice?
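To make the copy-and-paste failure concrete, here is a minimal sketch. It assumes a cut-down version of the three-column schema (only `lastname` and the three slots on `customer`), uses made-up data, and runs on SQLite through Python's `sqlite3` module as a stand-in for MySQL. The buggy query is derived from the correct one by repeating `accountnumber1` in place of `accountnumber3`.

```python
import sqlite3

# Hypothetical, cut-down three-column schema; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (accountnumber INTEGER PRIMARY KEY, balance INTEGER);
CREATE TABLE customer (
    customernumber INTEGER PRIMARY KEY,
    lastname       TEXT,
    accountnumber1 INTEGER,
    accountnumber2 INTEGER,
    accountnumber3 INTEGER
);
INSERT INTO account VALUES (1, 20000), (2, 20000), (3, 20000), (4, 20000), (5, 20000);
INSERT INTO customer VALUES (1, 'Two', 1, 2, NULL), (2, 'Three', 3, 4, 5);
""")

CORRECT = """
SELECT lastname, accountnumber
FROM account JOIN customer
  ON customer.accountnumber1 = account.accountnumber
  OR customer.accountnumber2 = account.accountnumber
  OR customer.accountnumber3 = account.accountnumber
WHERE balance > 10000
ORDER BY accountnumber
"""
# The copy-and-paste slip: accountnumber1 is tested twice, accountnumber3 never.
BUGGY = CORRECT.replace("customer.accountnumber3", "customer.accountnumber1")

correct_rows = conn.execute(CORRECT).fetchall()
buggy_rows = conn.execute(BUGGY).fetchall()

# Rows the buggy query silently drops.
missing = sorted(set(correct_rows) - set(buggy_rows))
print(missing)
```

With this data the customer with two accounts gets identical results from both queries, so a test suite that only has one- and two-account customers would not notice the difference.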

You will probably need an index on each account number field in customer. Now you need three indexes instead of one. More overhead for the database.

Are you sure the maximum will never change? Because if it ever does, every one of those queries that checks three slots must be changed to check four slots. That can be a ton of work.

What do you get in exchange for all this pain? Automatic enforcement of the max-3 limit. One less table. You might get better performance on some queries because there is one fewer table to join. On the other hand, you might not get better performance, depending on many details of the internal operation of the database engine and the actual data.

All that said, I would say it is almost certainly not worth doing. Stick with the normalized database.

I speak from experience. I did something very similar to this once. We had a database in which we had to record three types of "managers" for each book published by our organization: No. 1, responsible for budgeting and administration; No. 2, responsible for distribution; and No. 3, responsible for content (i.e. the editor). Since the three were different, I created three separate fields. A huge mistake. It would have been much better to create a book-manager table with a type code and enforce only one of each type with triggers or code. (Experience lets you make good decisions. Experience comes from making bad decisions.)
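A rough sketch of what such a book-manager table could look like follows. The table name, column names, and type codes are all invented for illustration, and the "only one of each type" rule is enforced here with a UNIQUE constraint rather than the triggers or application code mentioned above. SQLite via Python's `sqlite3` stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical names throughout; one row per (book, manager type).
CREATE TABLE book_manager (
    book_id      INTEGER NOT NULL,
    manager_type TEXT    NOT NULL
        CHECK (manager_type IN ('budget', 'distribution', 'content')),
    manager_name TEXT    NOT NULL,
    UNIQUE (book_id, manager_type)   -- at most one manager of each type per book
);
INSERT INTO book_manager VALUES (1, 'budget', 'Alice'), (1, 'content', 'Bob');
""")

# A second 'content' manager for the same book is rejected by the constraint.
try:
    conn.execute("INSERT INTO book_manager VALUES (1, 'content', 'Carol')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# One predicate finds a person in any role; no per-column OR chain is needed.
roles = conn.execute(
    "SELECT manager_type FROM book_manager WHERE manager_name = 'Bob'"
).fetchall()
print(duplicate_rejected, roles)
```

Adding a fourth manager type in this layout would mean touching the CHECK list, not adding a column and revisiting every query.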

+5

Well, first of all, you will have many empty fields in the records of customers with fewer than three accounts.

Allowing a fourth account or more will require adding columns to the table, which will result in even more empty fields in each record.

Secondly, it is easier to query the data (for example, the total number of accounts) if the accounts are stored in a separate table.

The reason we use separate tables for 1-to-N relationships is that it saves you headaches like these down the line.
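As a small illustration of the aggregate point, the sketch below counts accounts in both layouts. The table `customer_wide` and the sample data are invented for the example, and SQLite via Python's `sqlite3` stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized layout: each account row points at its customer.
CREATE TABLE account (
    accountnumber  INTEGER PRIMARY KEY,
    customernumber INTEGER NOT NULL
);
INSERT INTO account VALUES (10, 1), (11, 1), (12, 2);

-- Three-column layout (hypothetical table name): empty slots are NULL.
CREATE TABLE customer_wide (
    customernumber INTEGER PRIMARY KEY,
    accountnumber1 INTEGER,
    accountnumber2 INTEGER,
    accountnumber3 INTEGER
);
INSERT INTO customer_wide VALUES (1, 10, 11, NULL), (2, 12, NULL, NULL);
""")

# Normalized: a plain COUNT, and GROUP BY for the per-customer figure.
normalized_total = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
per_customer = conn.execute(
    "SELECT customernumber, COUNT(*) FROM account "
    "GROUP BY customernumber ORDER BY customernumber"
).fetchall()

# Three columns: every slot has to be named, and named again if a slot is added.
wide_total = conn.execute(
    "SELECT COUNT(accountnumber1) + COUNT(accountnumber2) + COUNT(accountnumber3) "
    "FROM customer_wide"
).fetchone()[0]
print(normalized_total, per_customer, wide_total)
```

Both layouts give the same total here, but only the wide query has to change when the number of slots does.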

+5

Benefits:

  • Faster than the normalized form (but by how much?)
  • Simpler queries for basic operations (no joins)
  • The maximum limit is a little easier to enforce

Disadvantages:

  • Extensibility
  • Added business logic (what if a customer closes their first account? Shift the others?)
  • Wasted space (significant if the average user does not have 3 accounts)
  • Aggregates are harder to obtain (e.g., the exact total number of accounts)
  • You cannot claim that your database is normalized

Both options are valid depending on requirements. If possible, benchmark the two to see how large the performance gain is, and calculate the storage difference, to decide whether it is worth it for your deployment.

However, I would prefer to use a trigger to enforce the account limit, since that gives the easiest maintainability, does not waste disk space, and future developers will not wonder why I could not even get 1NF right.
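A minimal sketch of the trigger approach is below. It is written for SQLite (run through Python's `sqlite3`) so that it is self-contained; the trigger name and schema are assumptions, and MySQL's trigger syntax and error-raising mechanism differ, so treat it as an outline rather than something to paste into MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (
    accountnumber  INTEGER PRIMARY KEY,
    customernumber INTEGER NOT NULL
);

-- Hypothetical trigger: refuse the insert once the customer has 3 accounts.
CREATE TRIGGER limit_accounts
BEFORE INSERT ON account
WHEN (SELECT COUNT(*) FROM account
      WHERE customernumber = NEW.customernumber) >= 3
BEGIN
    SELECT RAISE(ABORT, 'customer already has 3 accounts');
END;
""")

# Try to open four accounts for customer 1 and record which inserts succeed.
results = []
for _ in range(4):
    try:
        conn.execute("INSERT INTO account (customernumber) VALUES (1)")
        results.append(True)
    except sqlite3.DatabaseError:
        results.append(False)

stored = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
print(results, stored)
```

The limit lives in exactly one place, so raising it to four means editing the number in the trigger and nothing else.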

+3

Advantage: higher performance, no join to a relationship table

Disadvantage: it violates the 1st normal form (but that form can be violated to improve performance)

It is up to you ;)

+1

Now let's say that you are going to limit the user to only 3 accounts.

This involves a magic number that can change.

Do not believe anyone who says that there is a "limit". Today's absolute maximum will be tomorrow's minimum.

Do not take part in enforcing this nonsense in the database. All "limits" are nothing more than "typical values for the current application" and will change.

Use normal 1-to-many with normal foreign keys and ignore magic number "3".

+1

The three-column approach suffers if the business rule changes (i.e., now users can have FOUR accounts, ah ah ah...). This will require an ALTER TABLE statement rather than an INSERT, and all supporting logic needs to be reviewed to accommodate the new column(s), which is very costly in terms of development. In addition, databases are constrained when 3NF is not enforced.
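Roughly, the difference looks like this when a customer gets a fourth account. The table `customer_wide`, the account numbers, and the sample rows are invented for the example; SQLite via Python's `sqlite3` stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (
    accountnumber  INTEGER PRIMARY KEY,
    customernumber INTEGER NOT NULL
);
INSERT INTO account VALUES (101, 1), (102, 1), (103, 1);

-- Hypothetical three-column layout holding the same three accounts.
CREATE TABLE customer_wide (
    customernumber INTEGER PRIMARY KEY,
    accountnumber1 INTEGER,
    accountnumber2 INTEGER,
    accountnumber3 INTEGER
);
INSERT INTO customer_wide VALUES (1, 101, 102, 103);
""")

# Normalized: a fourth account is just another row.
conn.execute("INSERT INTO account VALUES (104, 1)")
normalized_count = conn.execute(
    "SELECT COUNT(*) FROM account WHERE customernumber = 1"
).fetchone()[0]

# Three columns: the schema itself has to change before the data can.
conn.execute("ALTER TABLE customer_wide ADD COLUMN accountnumber4 INTEGER")
conn.execute("UPDATE customer_wide SET accountnumber4 = 104 WHERE customernumber = 1")
wide_columns = [row[1] for row in conn.execute("PRAGMA table_info(customer_wide)")]
print(normalized_count, wide_columns)
```

The ALTER TABLE is only the visible part; every query that lists the three slots by name still has to be found and extended by hand.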

+1

Possible disadvantages of splitting account numbers into multiple columns:

You would need to repeat the query logic, and any other code that refers to an account number, for each of the columns.

You would need to use some placeholder in place of an account number whenever a customer has fewer than three.

This is not automatically a violation of the first normal form, because 1NF is nothing more than the definition of a relation itself. It is just not a very practical design (it goes against DRY).

0

It sounds like you are all drowning in a glass of water! Why not just add a count column for the account detail table, and when you try to add a new account row, check the counter?

0

Source: https://habr.com/ru/post/1312893/

