SQL Indexes
As databases grow, tables can contain:
- Thousands of rows
- Millions of records
- Data spread across multiple columns
When a query searches for specific data without an index, the database must:
- Scan every row
- Compare values one by one
- Take more time as data increases
This process is called a full table scan, and it becomes expensive on large tables. To solve this problem, databases use indexes.
What Is an Index in SQL?
An index is a database object that:
- Improves data retrieval speed
- Works like an index in a book
- Allows the database to find rows quickly without scanning the entire table
Important point:
An index improves SELECT performance, but may slow down INSERT, UPDATE and DELETE operations slightly.
How Indexes Work
Imagine a phone directory:
- Names are sorted alphabetically
- You directly jump to the required section
Similarly, an index:
- Stores column values in a structured format
- Maintains pointers to actual table rows
This structure allows fast searching.
Types of Indexes
There are many types of indexes, but the two most important ones are:
- Clustered Index
- Non-Clustered Index
Understanding the difference between these two is very important for interviews and real-world systems.
Clustered Index
Data Stored in Sorted Order
A clustered index:
- Determines the physical order of data in a table
- Sorts and stores rows based on the indexed column
- Makes the table itself ordered
Think of it like:
A dictionary where words are stored in sorted order
Key Characteristics of Clustered Index
- Data rows are stored in sorted order
- Table can have only one clustered index
- Usually created on:
1. Primary key
2. Unique identifier
Example: Clustered Index on Employee ID
Example: Clustered Index on Employee ID
CREATE CLUSTERED INDEX idx_emp_idON employees(emp_id);Now:
- Rows in employees table are physically stored by emp_id
- Range queries become very fast
When Clustered Index Is Useful
- Searching by primary key
- Range queries (BETWEEN, <, >)
- Sorting large result sets
Non-Clustered Index
Separate Structure from Table Data
A non-clustered index:
- Does not change the physical order of data
- Stores index data separately from table
- Contains pointers to actual rows
Think of it like:
A book index at the back that points to page numbers
Key Characteristics of Non-Clustered Index
- Data remains in original order
- Table can have multiple non-clustered indexes
- Slightly slower than clustered index
- Uses extra storage space
Example: Non Clustered Index on Employee Name
CREATE NONCLUSTERED INDEX idx_emp_nameON employees(emp_name);Now:
- Searching by emp_name becomes faster
- Table data order remains unchanged
Difference Between Clustered and Non Clustered
| Feature | Clustered Index | Non Clustered Index |
|---|---|---|
| Physical order | Change table order | Does not change |
| Number per table | Only one | Multiple |
| Storage | Data + index together | Separate structure |
| Speed | Faster | Slightly slower |
| Use case | Primary key, ranges | Search, filtering |
Real-World Scenario
Consider an orders table:
- order_id - clustered index (unique, sequential)
- customer_id - non-clustered index (frequent searches)
- order_date - non-clustered index (filtering by date)
This combination provides both:
- Fast access
- Optimized performance
Impact of Indexes on DML Operations
While indexes improve reads:
- Inserts need index updates
- Updates may modify index entries
- Deletes must remove index references
Too many indexes can slow down write operations.
When NOT to Use Indexes
Avoid indexes when:
- Table is very small
- Column has very few distinct values
- Data changes frequently
Indexes are best for read-heavy systems.
Two Minute Drill
- Index improves data retrieval speed
- Clustered index defines physical order
- Only one clusterd index per table
- Non clustered index is separated from data
- Multiple non clustered indexes allowed
- Indexes speed up SELECT but slow down writes
- Very important for performance tuning