SQL Normalization
When databases are designed without proper structure, they often suffer from serious problems. Data gets repeated, updates become risky, and even small changes can introduce inconsistencies. Over time, such poorly designed tables become difficult to maintain and error-prone.
To solve these issues, database designers follow a systematic process called Normalization.
Normalization organizes data in such a way that redundancy is reduced, integrity is improved, and maintenance becomes easier.
What Is Normalization?
Normalization is the process of:
- Structuring data efficiently
- Eliminating unnecessary duplication
- Ensuring logical data dependencies
In simple words:
Normalization helps store data in the right place, in the right form.
Problems Without Normalization
Poorly designed tables lead to:
- Data redundancy
- Update anomalies
- Insert anomalies
- Delete anomalies
These problems are exactly what normalization aims to remove.
First Normal Form (1NF)
Rule of 1NF: Atomic Values
A table is in First Normal Form (1NF) if:
- Each field contains atomic (indivisible) values
- No repeating groups or multi-valued columns exist
- Each record can be uniquely identified
Example: Non 1NF Table
| student_id | name | subjects |
|---|---|---|
| 1 | Rahul | Math, Physics |
The subjects column contains multiple vlaues.
Converted to 1NF
| student_id | name | subject |
|---|---|---|
| 1 | Rahul | Math |
| 1 | Rahul | Physics |
Key Points of 1NF
- No comma-separated values
- One value per cell
- Foundation of normalization
Second Normal Form (2NF)
Rule of 2NF: Remove Partial Dependency
A table is in Second Normal Form (2NF) if:
- It is already in 1NF
- No non-key attribute depends on part of a composite primary key
Example: Partial Dependency Problem
| order_id | product_id | product_name |
|---|---|---|
| 101 | 100 | phone |
Here:
- Primary key = (order_id, product_id)
- product_name depends only on product_id
Converting to 2NF
Orders Table
| order_id | product_id |
|---|---|
| 101 | 100 |
Product Table
| product_id | product_name |
|---|---|
| 100 | phone |
Key Points of 2NF
- Applies only to composite keys
- Eliminates partial dependency
- Improves data clarity
Third Normal Form (3NF)
Rule of 3NF: Remove Transitive Dependency
A table is in Third Normal Form (3NF) if:
- It is in 2NF
- No non-key attribute depends on another non-key attribute
Example: Transitive Dependency
| emp_id | emp_name | dept_id | dept_name |
|---|---|---|---|
| 100 | Nikhil | 101 | Developer |
Here:
- dept_name depends on dept_id
- dept_id depends on emp_id
Converting to 3NF
Employees Table
| emp_id | emp_name | dept_id |
|---|---|---|
| 100 | Nikhil | 100 |
Departments Table
| dept_id | dept_name |
|---|---|
| 100 | Developer |
Key Points of 3NF
- Eliminates indirect dependencies
- Makes data modular
- Most commonly used normal form
Boyce-Codd Normal Form (BCNF)
Rule of BCNF: Stronger 3NF
BCNF is an advanced version of 3NF.
A table is in BCNF if:
- It is in 3NF
- Every determinant is a candidate key
Example: BCNF Violation
| course | teacher | room |
|---|---|---|
| SQL | Nikhil | 101 |
Assume:
- A teacher teaches only one course
- Course uses multiple rooms
Here:
- teacher - course
- teacher is not a candidate key
Converting to BCNF
Teacher_Course Table
| teacher | course |
|---|---|
| Nikhil | SQL |
Course_Room Table
| course | room |
|---|---|
| SQL | 101 |
Key Points of BCNF
- Handles complex dependencies
- Removes subtle anomalies
- Used in advanced designs
Normal Forms Comparison
| Normal Form | Focus | Eliminates |
|---|---|---|
| 1NF | Atomic values | Repeating groups |
| 2NF | Full dependency | Partial dependency |
| 3NF | Direct dependency | Transitive dependency |
| BCNF | Key dependency | Hidden anomalies |
Should Everything Be Normalized?
While normalization improves structure, over-normalization can:
- Increase joins
- Reduce performance
- Complicate queries
In real systems:
- 3NF is usually sufficient
- BCNF is applied selectively
- Sometimes denormalization is used for performance
Real-World Use Cases
Normalization is essential in:
- Banking databases
- Student management systems
- HR systems
- Inventory and order management
Two Minute Drill
- Normalization reduces redundancy
- 1NF removes multi valued columns
- 2NF removes partial dependency
- 3NF removes transitive dependency
- BCNF strengthens key rules
- 3NF is most commonly used