What is the best database schema to support values that only apply to specific rows?

I have a db table named Calendar with fields

  • Id (PK)
  • Name
  • Description
  • CalendarTypeId (FK to CalendarType table)

I have another table called CalendarType with fields

  • Id (PK)
  • Name
  • Description

The problem is that I need to store an additional field for each calendar whose calendar type is 2 (this field is irrelevant for every other calendar type).

Should I just add a new field to the Calendar table and ignore it for all calendars that have a different CalendarTypeId, or is there a better way to organize the schema to support this need?

+12
database-design database-schema
Dec 31 '11 at 1:23
5 answers

OK, this is an ER model of what you have (cardinalities omitted):

Now let's focus on Calendar and SubCalendar. Clearly, you have a hierarchy. But how do hierarchies turn into tables? There are three general ways to do this:

1) Kill the parent and keep the children. In this case, you remove the parent entity and push all of its attributes down into each child. In your example there is only one child, so all of the parent's attributes go to it alone.

Advantages: No null values, since each table holds exactly what it needs. No joins are required. If you run queries against only one type of child, this scheme works well, because each table stores a single type and you never need to filter by type.

Disadvantages: This scheme does not suit overlapping children. In other words, if a parent row can have more than one child, pushing the parent's attributes into each child duplicates the parent data for every child, so do not use this strategy in that case. In addition, if you have many children with very few records each, you end up with many small tables, which can become harder to manage.
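A minimal sketch of option 1, using Python's built-in sqlite3 module purely for illustration. SubCalendar and SubAttribute follow the names used in this answer; OtherCalendar is a hypothetical second child, and the sample rows are invented to show the trade-off.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- No parent Calendar table: each child repeats the parent's columns.
    CREATE TABLE SubCalendar (
        Id           INTEGER PRIMARY KEY,
        Name         TEXT NOT NULL,
        Description  TEXT,
        SubAttribute TEXT NOT NULL   -- child-specific, so it never has to be null
    );
    CREATE TABLE OtherCalendar (     -- hypothetical second child, for illustration
        Id          INTEGER PRIMARY KEY,
        Name        TEXT NOT NULL,
        Description TEXT
    );
""")
conn.execute(
    "INSERT INTO SubCalendar (Name, Description, SubAttribute) VALUES (?, ?, ?)",
    ("Fiscal", "Fiscal year", "starts in April"),
)
conn.execute(
    "INSERT INTO OtherCalendar (Name, Description) VALUES (?, ?)",
    ("Holidays", "Public holidays"),
)

# Querying one kind of child needs neither a join nor a type filter.
sub_rows = conn.execute("SELECT Name, SubAttribute FROM SubCalendar").fetchall()

# Listing every calendar, however, means combining the child tables.
all_names = [
    row[0]
    for row in conn.execute(
        "SELECT Name FROM SubCalendar UNION ALL "
        "SELECT Name FROM OtherCalendar ORDER BY Name"
    )
]
```

The UNION ALL at the end is the price of this layout: any query that spans calendar types has to name every child table.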

2) Kill the children and keep the parent. In this case, you remove all the children and move all of their attributes up into the parent. Since the parent is now a mixture of itself and all of its children, it needs a way to tell which type of child each row belongs to. This is achieved by adding a new attribute to the parent that identifies the type of each row (whatever its data type).

Advantages: There is only one table for all children, so it is easy to manage. No joins are required. It can be useful if most queries against this table need results from more than one type of child.

Disadvantages: Again, if a parent row can be related to more than one child, its data will be duplicated, since there will be one row for each child, so this solution has the same limitation. In addition, a new column must be added as metadata. The table will hold more rows. Null values must be stored in every column that belongs to one child but not to the parent or to the other children.
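Option 2 might look roughly like this for the tables in the question. This is only a sketch with Python's built-in sqlite3 module; the SubAttribute column name, the type names "Standard" and "Special", and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CalendarType (
        Id          INTEGER PRIMARY KEY,
        Name        TEXT NOT NULL,
        Description TEXT
    );
    CREATE TABLE Calendar (
        Id             INTEGER PRIMARY KEY,
        Name           TEXT NOT NULL,
        Description    TEXT,
        -- The discriminator: tells which kind of calendar each row is.
        CalendarTypeId INTEGER NOT NULL REFERENCES CalendarType(Id),
        -- Only meaningful for type 2; NULL for every other type.
        SubAttribute   TEXT
    );
""")
conn.executemany(
    "INSERT INTO CalendarType (Id, Name) VALUES (?, ?)",
    [(1, "Standard"), (2, "Special")],  # hypothetical type names
)
conn.executemany(
    "INSERT INTO Calendar (Name, CalendarTypeId, SubAttribute) VALUES (?, ?, ?)",
    [("Holidays", 1, None), ("Fiscal", 2, "starts in April")],
)

# One table answers queries across all types, with no join...
all_rows = conn.execute(
    "SELECT Name, SubAttribute FROM Calendar ORDER BY Id"
).fetchall()

# ...but a single type has to be filtered out by the discriminator.
type2_rows = conn.execute(
    "SELECT Name, SubAttribute FROM Calendar WHERE CalendarTypeId = 2"
).fetchall()
```

The None in the first row is the null value the disadvantages paragraph warns about: every row of another type carries an empty SubAttribute.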

3) Keep everything: The least bloody solution is not to kill anything :) In this case, the hierarchy is replaced by a relationship between the parent and each of its children. A child then reaches the parent's data by joining to the parent table through a foreign key.

Advantages: No data duplication and no null values. Each entity holds only the minimum amount of data, and the rest can be obtained by joining to the parent table. A parent row can be associated with several children without duplicating data. If you run many queries that can be satisfied by a single table (usually the parent), this is a good option. It is also easy to extend to more calendar types: if you need a new calendar that requires new fields, you add a new table without changing the existing ones.

Disadvantages: This option needs the most tables (one more than option 1). Each child needs a join to reach the parent's data, which degrades performance as the data set grows. In addition, foreign keys are required to join the tables. If most queries need data from both parent and children, this scheme will be the worst in terms of performance.
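Here is a rough sketch of option 3, again with Python's built-in sqlite3 module. The CalendarId key column and the sample rows are assumptions, and the CalendarType table is left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
    CREATE TABLE Calendar (
        Id          INTEGER PRIMARY KEY,
        Name        TEXT NOT NULL,
        Description TEXT
    );
    -- The child holds only its own data plus a key back to the parent.
    CREATE TABLE SubCalendar (
        CalendarId   INTEGER PRIMARY KEY REFERENCES Calendar(Id),
        SubAttribute TEXT NOT NULL
    );
""")
conn.executemany(
    "INSERT INTO Calendar (Id, Name) VALUES (?, ?)",
    [(1, "Holidays"), (2, "Fiscal")],
)
conn.execute(
    "INSERT INTO SubCalendar (CalendarId, SubAttribute) VALUES (?, ?)",
    (2, "starts in April"),
)

# Parent-only queries touch a single table.
names = [r[0] for r in conn.execute("SELECT Name FROM Calendar ORDER BY Id")]

# Child data needs a join back to the parent.
joined = conn.execute(
    "SELECT c.Name, s.SubAttribute "
    "FROM Calendar AS c JOIN SubCalendar AS s ON s.CalendarId = c.Id"
).fetchall()

# A child row without a matching parent is rejected by the foreign key.
try:
    conn.execute(
        "INSERT INTO SubCalendar (CalendarId, SubAttribute) VALUES (?, ?)",
        (99, "orphan"),
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Supporting another calendar type with its own fields would mean adding another child table next to SubCalendar, leaving Calendar untouched.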

Now, you asked which database schema is best. I think it is now clear that this depends on the requirements, the types of queries that will be run, how the data is structured, and so on.

However, I can analyze this a little further. You said that you have a Calendar table and that some calendars need more data. So we can say that we have two types of calendars, the parent and the child. We might therefore think that solution 2 is a good option, because you would have two kinds of rows, one for each type, but we would be wrong: in this case, each child includes the parent. Now, if we can assume that SubAttribute is always non-null for the child and null for the parent, we can even remove CalendarType, which actually leads to solution 1.

Finally, as a rule of thumb (mainly because most real-life queries involve many joins), if you want to focus on performance you should go with solution 1; if you want to focus on normalization, you should go with solution 3.

I hope this clears up some doubts, and perhaps raises others :)

+15
Feb 27 2018-12-12T00:

I would probably just add it to the Calendar table. I call this db table overloading. When data storage was expensive, it was a crime. Now it is called solving the problem the simple way and moving on. Never over-engineer until you need to.

However, you did not say explicitly whether the value of the additional field varies between Calendar instances with a type id of 2. Sometimes my type tables have subtype fields and so on, but I am assuming this is a case where Calendar instances of type 2 WILL have different values in the additional field.

+4
Dec 31 '11 at 1:37

Maybe I am looking at this too simply, but if you stick to the “use before reuse” model, then the right thing to do is simply add a nullable column to your Calendar table and add a check constraint tied to the calendar type so that the column cannot be null if calendar type = 2.

This is straightforward, and most importantly, easy to verify.
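As a rough sketch of that idea, here it is with Python's built-in sqlite3 module. The ExtraField column name, the try_insert helper, and the sample rows are hypothetical, and the foreign key to CalendarType is omitted for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Calendar (
        Id             INTEGER PRIMARY KEY,
        Name           TEXT NOT NULL,
        Description    TEXT,
        CalendarTypeId INTEGER NOT NULL,
        ExtraField     TEXT,  -- nullable: ignored by every type except 2
        -- Type 2 rows must supply ExtraField; other types may leave it null.
        CHECK (CalendarTypeId <> 2 OR ExtraField IS NOT NULL)
    )
""")

def try_insert(name, type_id, extra):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO Calendar (Name, CalendarTypeId, ExtraField) "
                "VALUES (?, ?, ?)",
                (name, type_id, extra),
            )
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert("Holidays", 1, None),             # other type, no extra value
    try_insert("Fiscal", 2, "starts in April"),  # type 2 with a value
    try_insert("Broken", 2, None),               # type 2 without a value
]
```

Only the last insert should be refused, which is what makes the rule easy to test: the database enforces it regardless of which application writes the row.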

I may take some flak for this answer (it is probably not the most efficient), but it depends entirely on the scale of your solution. The reality is that these requirements may well change in the next couple of months, and you do not want to paint yourself into a corner by choosing the “right” way when you do not yet know what it is. It is possible that by the time you reach your 10th calendar type, a pattern will emerge that really tells you the best (or most conventional) way to do this. For now, keep it simple, easy to test, and easy to change later.

+3
Mar 04 '12 at 4:14

You can use a single table inheritance pattern that is close to your suggestion,

http://martinfowler.com/eaaCatalog/singleTableInheritance.html

or

http://martinfowler.com/eaaCatalog/classTableInheritance.html

if you want to specialize some tables to match the types (Calendar and CalendarType2) that you are trying to represent in your database.

+2
Dec 31 '11 at 1:31

Leora

I would recommend that you use the Calendar table and leave null the additional fields that other calendar types do not need. As requirements change, you can add further attributes to the Calendar table in the same way.

I would also recommend having a base calendar class in your model, with subclasses mapped by the CalendarTypeId field, and using the specific calendar subclasses in your application as needed. Most ORMs support this type of mapping and also let you make each subclass differ from the others where necessary.
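To show the shape of that mapping without tying it to any particular ORM, here is a hand-rolled sketch in plain Python. Type2Calendar, extra_field, SUBCLASS_BY_TYPE_ID and calendar_from_row are all hypothetical names; a real ORM would normally do this dispatch for you from its own mapping configuration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Calendar:
    """Base class: the columns every calendar row has."""
    id: int
    name: str
    description: Optional[str] = None

@dataclass
class Type2Calendar(Calendar):
    """Subclass for CalendarTypeId = 2, which carries the additional field."""
    extra_field: Optional[str] = None

# Discriminator value -> subclass; types not listed fall back to the base class.
SUBCLASS_BY_TYPE_ID = {2: Type2Calendar}

def calendar_from_row(row: dict) -> Calendar:
    """Build the right subclass for a row, keyed on its CalendarTypeId."""
    cls = SUBCLASS_BY_TYPE_ID.get(row["CalendarTypeId"], Calendar)
    kwargs = {
        "id": row["Id"],
        "name": row["Name"],
        "description": row.get("Description"),
    }
    if cls is Type2Calendar:
        kwargs["extra_field"] = row.get("ExtraField")
    return cls(**kwargs)

plain = calendar_from_row({"Id": 1, "Name": "Holidays", "CalendarTypeId": 1})
special = calendar_from_row(
    {"Id": 2, "Name": "Fiscal", "CalendarTypeId": 2, "ExtraField": "starts in April"}
)
```

Application code can then treat every calendar as a Calendar and only reach for extra_field when it actually holds a Type2Calendar.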

Stephen

+1
Feb 27 '12 at 8:16
