Short Notes on Normalization:
The process of organizing data model to efficiently store data and to minimize redundancy.
First Normal Form:
A table is said to be in 1st normal form if it has no repeating or duplicate fields. Each record is uniquely identified by a primary key.
For example: Observe the below table of data. This is not in first normal form because of
- Multiple items in color field
- Duplicate records
- No primary key.
Item Name | Color | Price | Tax |
T-shirt | Red, blue | 50 | 4.00 |
Polo | Blue, yellow | 25 | 2.00 |
T-shirt | Red, blue | 50 | 4.00 |
Sweater | Blue, Rose | 100 | 10.00 |
Second Normal Form:
All non-key fields depend on all components of the primary key. Here price & tax depend on item but not color.
Table 1:
Item | Color |
T-Shirt | red |
T-Shirt | blue |
Polo | red |
Sweater | blue |
Sweater | black |
Table 2:
Item | Price | Tax |
T-Shirt | 50 | 4 |
Polo | 25 | 2 |
Sweater | 100 | 10 |
Third Normal Form:
No non key field depends upon another. All non-key fields depend only on the primary key. Remove columns that are not fully dependent on the primary.
In 2nd Normal Form, tax depends on price, not item.
Multivalued dependency: Each and every attribute within a relation depends upon the other. Yet none of them is a unique primary key.
Example: Consider Vendor supplying many items to projects in an organization.
Vendor Code | Item Code | Project No. |
V1 | 11 | P1 |
V1 | 12 | P1 |
V1 | 11 | P3 |
V2 | 12 | P3 |
V2 | 13 | P1 |
V3 | 11 | P1 |
V3 | 11 | P2 |
V3 | 11 | P2 |
- One Vendor can supply many items.
- Project uses many items
- Vendor can supply to many projects
- Item can be supplied by many vendors.
Multivalued dependency exists here because all attributes depend upon another and yet none of them is primary key.
BCNF (Boyce-Codd Normal Form): A relation is said to be BCNF if every determinant is a candidate key.
A row is in BCNF if and only if every determinant is a candidate key. The second and third normal forms assume that each one attribute is not a part of the candidate keys depend upon the candidate keys however does not deal among the keys.
BCNF differs from the 3NF only if there are more than one candidate keys and also the keys are composite and co-inside, consider for example, the relationship enroll (sno, sname, cno, cname, date-enrolled)
Forth normal form
Fourth Normal Form:
4NF needs that there should be no non-trivial multivalued dependencies of attribute sets on something apart from super set of a candidate key.
A table is said to be in 4NF if and only if it is within the BCNF and multivalued dependencies are functional dependencies. The 4NF removes unwanted information structures- multivalued dependencies.
(Non-trivial functional dependencies are avoided).
Table 1:
Vendor Code | Item Code |
V1 | 11 |
V1 | 12 |
V2 | 12 |
V2 | 13 |
V3 | 13 |
Table 2:
Vendor Code | Project |
V1 | P1 |
V1 | P3 |
V2 | P1 |
V2 | P3 |
V3 | P2 |
Example 2: Consider Employees, Skills and Languages.
Here we have many to many relationship between
Employee – Skills.
Employee – Languages.
So in 4thnormal form we cannot respect their relationship in a single record such as
Employee can have multiple skills & can speak multiple languages..
Emp. | Skills | language |
Employee can have multiple skills
Emp. | Skills |
Employee can know multiple languages.
Emp. | Languages |
Short Notes on Normal Forms:
1NF (First Normal Form): each column type should be unique.
2NF (2nd Normal Form): Already in 1NF and all attributes within the entity should depend exclusively on the entity’s unique identifier.
3NF (3rd Normal Form): The entity should already be in 2NF and no column entry should be dependent on any other value other than the key for the table. If such an entity exists, move it outside into a new table.
Now If these 3NF achieved the database is considered normalized. but there are three more ‘extended’ NF for the moralist.
They are:
BCNF (Boyce -Codd Normal Form): The database should be in 3NF and all tables will have only one primary key.
4NF (4th Normal Form): No multi-valued dependencies on a primary key.
5NF (5th Normal Form): There should be no cyclic dependencies in a composite key.
Normalization Questions and Answers – Normal Forms
Q.1) A relation scheme is said to be in …………….. form if the values in the domain of each attribute of the relation are atomic.
a) Un-Normalized.
b) First Normal form.
c) Boyce Codd.
d) None of those.
Q.2) A second normal form does not allow ………………….. Dependency between a non-prime attribute and also the relation key.
a) Partial
b) Multi
c) Functional
d) Valued.
Q.3) A relation scheme is in ………………….. If it is in the 1NF and if all non-prime attributes are fully functionally dependent on the relation key.
a) 1NF
b) 2NF
c) BCNF
d) 4NF (Fourth Normal Form).
Q.4) 5NF is related to ……………..
a) Functional Dependency
b) Multivalued Dependency
c) Join Dependency
d) None.
Q.5) Project join normal form is also referred to as …………..
a) 2NF
b) 3NF
c) 4NF
d) 5NF.
Q.6) which of the following is not included in DML (Data Manipulation Language).
a) INSERT
b) UPDATE
c) DELETE
d) CREATE.
Q.7) The relational model includes the 2} general integrity rules those two are ……….. And ………………
a) Primary key, Foreign key
b) Composite Key, Primary key
c) Transaction key, foreign key
d) Foreign key, primary key.