State and Explain 1 NF and 2 NF with example.
Of course. Here is a clear explanation of First Normal Form (1NF) and Second Normal Form (2NF), complete with their rules and step-by-step examples.
Normalization is the process of organizing a database to reduce data redundancy and improve data integrity. It involves a series of guidelines called Normal Forms. A database is said to be in a certain normal form if it satisfies the rules of that form.
A table is in First Normal Form (1NF) if it satisfies two conditions:
1. All columns contain only atomic (indivisible) values.
2. Each row is unique (which is typically handled by having a primary key).
In simple terms, "one value per cell." This means no repeating groups or multi-valued columns, like comma-separated lists or nested tables.
The main goal of 1NF is to eliminate repeating groups of data. If you have a cell that contains a list of items, it violates 1NF because that value is not atomic; it can be broken down further. This is problematic because it makes searching, sorting, and indexing the data extremely difficult and inefficient.
Let's consider a table that tracks students and the courses they are enrolled in.
Table Not in 1NF (Bad Design):
This table has a Courses
column that stores a list of course codes for each student.
| StudentID | StudentName | Courses |
| :--- | :--- | :--- |
| 101 | Alice | "CS101, MTH203" |
| 102 | Bob | "PHY101" |
| 103 | Charlie | "CS101, ENG101, ART304" |
Problem:
The Courses
column is not atomic. It contains multiple values in a single cell.
How do you find all students taking "CS101"? You would have to search inside a text string, which is slow and error-prone.
* How do you add a new course for Alice? You have to read the string, append to it, and write it back. This is complex.
How to Achieve 1NF:
To fix this, we create a separate row for each student-course combination. This eliminates the multi-valued Courses
column.
Table in 1NF (Good Design):
| StudentID | StudentName | CourseCode |
| :--- | :--- | :--- |
| 101 | Alice | CS101 |
| 101 | Alice | MTH203 |
| 102 | Bob | PHY101 |
| 103 | Charlie | CS101 |
| 103 | Charlie | ENG101 |
| 103 | Charlie | ART304 |
Now, every cell contains a single, atomic value. The table is in 1NF. However, this introduces a new problem: data redundancy (StudentName
is repeated), which 2NF will address.
A table is in Second Normal Form (2NF) if it satisfies two conditions:
1. It is already in 1NF.
2. It has no partial dependencies.
A partial dependency occurs when a non-key attribute is dependent on only a part of the composite primary key, instead of the entire key. This rule is only relevant for tables that have a composite primary key (a primary key made of two or more columns).
The goal of 2NF is to ensure that every non-key column in a table is there to describe the entire primary key. If a column only describes one part of a multi-column key, it's in the wrong table and should be moved.
Let's continue with our 1NF table. To make each row unique, the primary key must be a combination of StudentID
and CourseCode
.
Primary Key: (StudentID
, CourseCode
)
Table in 1NF, but Not in 2NF (Bad Design):
| StudentID (PK) | CourseCode (PK) | StudentName | Grade |
| :--- | :--- | :--- | :--- |
| 101 | CS101 | Alice | A |
| 101 | MTH203 | Alice | B |
| 102 | PHY101 | Bob | B |
Problem:
Let's analyze the dependencies:
Grade
: Depends on both StudentID
and CourseCode
. (You need to know the student AND the course to know the grade). This is a full dependency - which is good.
StudentName
: Depends only on StudentID
. It has nothing to do with CourseCode
. This is a partial dependency, and it violates 2NF.
This partial dependency causes anomalies:
Update Anomaly: If Alice changes her name, you must update it in multiple rows. If you miss one, the data becomes inconsistent.
Insertion Anomaly: You cannot add a new student (e.g., Student 104, "David") until they enroll in at least one course, because you need a CourseCode
to complete the primary key.
* Deletion Anomaly: If Bob drops his only course (PHY101), his entire record, including his name, will be deleted from this table.
How to Achieve 2NF:
To fix this, we decompose the table. We remove the partial dependency by creating a separate table for the data that is only partially dependent.
Tables in 2NF (Good Design):
We split the original table into two smaller tables:
1. Students
Table: This table holds the information that depends only on StudentID
.
| StudentID (PK)| StudentName |
| :--- | :--- |
| 101 | Alice |
| 102 | Bob |
2. Enrollment
Table: This table holds the information that depends on both StudentID
and CourseCode
.
| StudentID (PK, FK) | CourseCode (PK, FK) | Grade |
| :--- | :--- | :--- |
| 101 | CS101 | A |
| 101 | MTH203 | B |
| 102 | PHY101 | B |
Now, both tables are in 2NF. The Students
table has a single-column primary key, so it has no partial dependencies. In the Enrollment
table, the only non-key attribute, Grade
, is fully dependent on the entire composite key (StudentID
, CourseCode
). The anomalies are now resolved.