Normalization

 

Normalization

Ø  It is process of minimize the redundancy. Redundancy means duplication of data.

Ø  It divides larger table into smaller tables and links them using relationship.

Ø  Example

Empid

Ename

Sal

Deptname

Location

101

Ramu

10000

IT

Hyd

102

Kamal

14500

Account

Mumbai

103

Satish

15000

Account

Mumbai

104

Praveen

16500

IT

Hyd

 

Empid

Ename

Sal

Deptno

101

Ramu

10000

10

102

Kamal

14500

20

103

Satish

15000

20

104

Praveen

16500

10

                                        

Deptno

Deptname

Location

10

IT

Hyd

20

Account

Mumbai

 

Ø  It removes anomaly, these are insert anomaly, update anomaly, delete anomaly.

Insert anomaly: - Repetition of data and if we insert employee who does not allotted dept. this could be insertion anomaly. Because data which is not having dept details, after knowing dept for those employees it is difficult to insert all rows so insertion anomaly is formed. To reduce it we can use normalization.  

Update anomaly: - in the above base table if Deptname is changed for employees. So, it is difficult to update all thousands of rows. So update anomaly is formed, to reduce it we can use normalization.    

Delete anomaly: - In the above table if employees are transfer or promoted or resign, if we delete that employee row all data deleted with dept details also. So here delete anomaly is formed.

Ø  Normal forms

1.      1NF

2.      2NF

3.      3NF

4.      BCNF

5.      4NF

6.      5NF

7.      6NF

Ø  1NF: - 1st normal form contains set of rules these are

·        Every record contains only single values means atomic values.

·        Column data type should not change, if we convert to new table.

·        Unique name of each column means no two columns have same name.

·        No need to follow order if new table is created.

Empid

Ename

Sal

Deptname

101

Ramu

10000

IT, Manager

102

Kamal

14500

Account

103

Satish

15000

Account

104

Praveen

16500

IT

We want to change this table as

Empid

Ename

Sal

Deptname

101

Ramu

10000

IT

101

Ramu

10000

Manager

102

Kamal

14500

Account

103

Satish

15000

Account

104

Praveen

16500

IT

 

Ø  2NF: - 2nd normal form is having set of rules these are

·        The table must be in 1NF.

·        No Partial dependency.

Student ID

Student Name

Phone number

1

Kamal

1236549870

2

Ramesh

1452369870

3

Sai

1365984125

Subject ID

Subject Name

100

Maths

101

Physics

                   


Marks ID

Student ID

Subject ID

Marks

Teacher

1000

1

100

50

Raju

1001

2

101

68

Ravi

1002

3

100

85

Raju

 

In the above table Marks ID, Student ID, Subject ID are Primary keys. These are three table student table, subject table, marks table. In the marks table Teacher column is dependent on only subject ID primary key column so it is called Partial dependency.

Dependency means all non-key columns must dependent on primary key or candidate key it is aloso called as functional dependency.

So, in the above tables are marks table should not follow 2NF rules so we changed that tables to as below.

Marks ID

Student ID

Subject ID

Marks

1000

1

100

50

1001

2

101

68

1002

3

100

85

 

Subject ID

Subject Name

Teacher

100

Maths

Raju

101

Physics

Ravi

 

 

 

 

 

Ø  3NF: - 3rd normal form is having set of rules these are

·        The must qualify in 2NF.

·        No Transitive dependency.

Transitive dependency means when one non-primary key column depends on another non-primary key columns. Basically, non primary key column also contains null values also so one column is dependent on another null column so it causes error.

 

Marks ID

Student ID

Subject ID

Marks

Examination

Total Marks

1000

1

100

15

Internal

20

1001

2

101

58

External

80

1002

3

100

68

External

80

Here we observe that total marks and examination are non- primary keys, and total marks dependent on Examination so here Transitive dependency is formed. Then we remove this table and create two tables as

 

Marks ID

Student ID

Subject ID

Marks

Examination ID

1000

1

100

15

21

1001

2

101

58

22

1002

3

100

68

22

 

Examination ID

Examination

Total Marks

21

Internal

20

22

External

80

 

Ø  BCNF: - Boyce Codd Normal form is having set of rules these are

·        The table must be in 3NF.

·        If one column is dependent on second column, the 1st column must be a super key.

It is advanced form of 3NF so it is called as 3.5NF.

Super key means either it is primary key or foreign key.

 

Student ID

Subject

Teacher

1

Maths

Arun

1

Physics

Babu

2

Maths

Chakri

 

 

 

 

Here student id and subject both have composite primary key. Because student enroll for any no of tuition subject and any one subject have more than one student so it is many to many relations that’s why we are given composite primary for both combinations. Teacher is non key attribute.

Here teacher is determining which subject is teaching. It means TeacheràSubject. But teacher is not a super key, so it need normalization.

 

Student ID

Teacher ID

1

21

1

22

2

23

Teacher ID

Teacher

Subject

21

Arun

Maths

22

Babu

Physics

23

Chakri

Maths

                            


   

Ø  4NF: - 4th normal form is containing set of rules these are

·        The table must be in BCNF.

·        The table should not have multi-valued dependency.

Multi-valued dependency means if a table contain three columns, 1st column of a table determines 2nd and 3rd column this relation is called multi-valued dependency. And also 2nd and 3rd columns independent of each other.

 

Student ID

Student name

Fav Subject

Fav Game

1

Kamal

Maths

Cricket

1

Kamal

Physics

Kabaddi

2

Raj

Social

Cricket

 

 

 

 

Here student have multiple favorite subjects as well as favorite games. So, one person is determining multiple values so it is called multi-valued dependency.   

So we can change this table as

Student ID

Fav Subject

1

Maths

1

Physics

2

Social

Student ID

Fav Game

1

Cricket

1

Kabaddi

2

Cricket

 

Student ID

Student name

1

Kamal

2

Raj

 

 

 

 

Here student ID and Fav subject are composite primary key and also student ID and Fav game also composite primary key. So no duplicate values are enter here.

Comments

Popular posts from this blog

PLSQL CURSOR