Normalize the table to the 3rd normal form

These questions are obviously homework. I cannot understand my professor and I do not know what he said during the election. I need to do step-by-step instructions to first normalize the next table to 1NF, then 2NF, then 3NF.

enter image description here

I appreciate any help and guidance.

+7
source share
4 answers

Well, I hope that I remember everything correctly, let it begin ...

rules

To make them very short (and not very accurate, just to give you the first idea that this is all):

  • NF1 : the table cell must not contain more than one value.
  • NF2 : NF1 plus all columns of a non-primary key must depend on all columns of the primary key.
  • NF3 : NF2 plus non-primary key columns can be independent of each other.

Instructions

  • NF1 : Find table cells containing more than one value, put them in separate columns.
  • NF2 : Find the columns based on the smaller number of all the primary key columns, put them in another table that contains only the primary key columns on which they really depend.
  • NF3 : Find columns that depend on other non-primary key columns, in addition to the primary key dependency. Place the dependent columns in another table.

Examples

Nf1

the state column has values ​​like WA, Washington. NF1 is violated because these are two meanings, an abbreviation and a name.

Decision. To execute NF1, create two columns, STATE_ABBREVIATION and STATE_NAME .

NF2

Imagine you have a table with these four columns, expressing the international names of car models:

  • COUNTRY_ID (numeric, primary key)
  • CAR_MODEL_ID (numeric, primary key)
  • COUNTRY_NAME (varchar)
  • CAR_MODEL_NAME (varchar)

The table may contain the following two rows of data:

  • Line 1: COUNTRY_ID = 1, CAR_MODEL_ID = 5, COUNTRY_NAME = USA, CAR_MODEL_NAME = Fox
  • Line 2: COUNTRY_ID = 2, CAR_MODEL_ID = 5, COUNTRY_NAME = Germany, CAR_MODEL_NAME = Polo

This suggests that the Fox model is called the Fox in the US, but the same car model is called the Polo in Germany (don't remember if this is true).

NF2 is violated because the country name does not depend on the vehicle model identifier and country identifier, but only on the country identifier.

Decision. To execute NF2, move COUNTRY_NAME to a separate table, COUNTRY, with columns COUNTRY_ID (primary key) and COUNTRY_NAME . To get a set of results, including the name of the country, you will need to connect two tables using JOIN.

Nf3

Let's say you have a table with these columns expressing the climatic conditions of the states:

  • STATE_ID (varchar, primary key)
  • CLIME_ID (foreign key, identifier of the climate zone, such as "desert", "rainforest", etc.).
  • IS_MOSTLY_DRY (bool)

NF3 is violated because IS_MOSTLY_DRY depends only on CLIME_ID (even if we suppose that), but not on STATE_ID (primary key).

Solution: execute NF3, put the MOSTLY_DRY column in the climate zone table.


Here are some considerations regarding the actual table in the exercise:

I apply the above NF rules without challenging the primary key columns. But in reality they do not make sense, as we will see later.

  • NF1 is not violated, each cell contains only one value.
  • NF2 is violated by EMP_NM and all phone numbers, since all of these columns are independent of the full primary key. All of them depend on EMP_ID (PK), but not on DEPT_CD (PK). I assume that the phone numbers remain the same when an employee moves to another department.
  • NF2 is also violated by DEPT_NM, because DEPT_NM is independent of the full primary key. It depends on DEPT_CD, but not on EMP_ID.
  • NF2 is also violated by all skill columns because they are not departments, but only specific employees.
  • NF3 is violated by SKILL_NM because the skill name depends only on the skill code, which is not even part of the composite primary key.
  • SKILL_YRS violates NF3 because it depends on the primary key element (EMP_ID) and the primary key member (SKILL_CD). Therefore, it partially depends on the attribute of the non-primary key.

So, if you delete all columns that violate NF2 or NF3, only the primary key remains (EMP_ID and DEPT_CD). This remainder violates these business rules: this structure will allow the employee to work in several departments at the same time.

Let's look at it from afar. Your data model relates to employees, departments, skills, and the relationships between these entities. If you normalize this, you will get one table for employees (containing DEPT_CD as a foreign key), one for departments, one for skills and another for the relationship between employees and skills, with a “skill” of years for each tuple EMP_ID and SKILL_CD (mine the teacher would call the latter "associative entity").

+18
source

Look at the first two rows in your table, and see which columns are labeled “PK” in this table, and assuming that “PK” means “Primary Key”, looking at the values ​​that appear for these two columns in these two rows ,
I would recommend your professor to leave training in the database and not return until he received the proper education on this issue.

This exercise cannot be taken seriously, because the problem statement itself contains hopelessly conflicting information.

(Note that as a result, there simply isn’t such a thing as a “good” or “right” answer to this question !!!)

+2
source

Another simplified answer.

In the 3NF relational table, each keyless value is determined by a key, an integer key, and nothing but a key (so help me Codd;)).

1NF: key. This means that if you specify a key value and a named column, there will be at most one value at the intersection of the row and column. Ambiguous, like a series of values ​​separated by commas, is forbidden because you cannot directly go to the value using the key name and acolumn name.

2NF: all key. If a column that is not part of the key is determined by the corresponding subset of the key columns, then 2NF is violated.

3NF: And nothing but a key. If a column is determined by some set of non-key columns, then 3NF is violated.

+1
source

3NF satisfies only if it is in 2nd normal form and has no transitive dependency, and all non-key attributes must depend on the primary key.

Transitive dependence: R = (A, B, C). A-> B and B-> C THEN A-> C

0
source

All Articles