Query Optimization
In the real world, writing a query that works is only the first step.
Writing a query that works fast is what matters in production systems.
As databases grow:
- Tables become large
- Queries become complex
- Response time becomes critical
A query that takes:
- 10 ms today
- can take seconds or minutes tomorrow
This is why query optimization is an essential skill for every SQL developer.
To optimize queries, we first need to understand how the database executes a query. That understanding comes from EXPLAIN and execution plans.
What Is Query Optimization?
Query optimization is the process of:
- Improving query performance
- Reducing execution time
- Minimizing resource usage (CPU, memory, I/O)
The goal is not to change the result, but to change how efficiently the result is produced.
How a Database Executes a Query
When you run a SQL query, the database does much more than just reading data.
Internally, it:
1. Parses the SQL
2. Checks syntax and permissions
3. Builds multiple possible execution strategies
4. Chooses the most efficient one
5. Executes the chosen plan
The chosen strategy is called the execution plan.
What Is an Execution Plan?
An execution plan is:
- A step-by-step blueprint of how a query will run
- Created by the database optimizer
- Used to decide:
1. Which indexes to use
2. How tables are accessed
3. In what order operations occur
Think of it as:
A GPS route chosen before starting a journey
Different routes can reach the same destination, but some are faster.
Introducing EXPLAIN
The EXPLAIN command shows the execution plan of a query without actually running it.
It helps you understand:
- How data is accessed
- Whether indexes are used
- Where performance problems may exist
Syntax
EXPLAIN SELECT * FROM employees WHERE salary > 50000;This does not return table data.
Instead, it returns information about how the query will execute.
Reading an EXPLAIN Output
Although EXPLAIN output varies by database (MySQL, PostgreSQL, Oracle), the core ideas are similar.
Key things it reveals:
- Table access method
- Index usage
- Estimated rows scanned
- Join strategy
You do not need to memorize all fields initially focus on patterns.
Difference Between Table Scan and Index Scan
Full Table Scan
- Database reads every row
- Slower for large tables
- Often a sign of missing indexes
Index Scan
- Database uses an index to locate rows
- Much faster
- Preferred for filtering conditions
EXPLAIN helps identify which one is happening.
Example: Query Without an Index
EXPLAINSELECT emp_nameFROM employeesWHERE department_id = 10;If no index exists on department_id:
- Database performs a full table scan
- Performance degrades as data grows
Example: Query With an Index
CREATE INDEX idx_deptON employees(department_id);EXPLAINSELECT emp_nameFROM employeesWHERE department_id = 10;Now:
- Database uses the index
- Fewer rows are scanned
- Query executes faster
Understanding Join Order Through Execution Plans
For queries with joins, EXPLAIN shows:
- Which table is read first
- How tables are joined
- Whether indexes are used in joins
Choosing the right join order can significantly impact performance.
Cost-Based Optimization
Modern databases use a cost-based optimizer.
This means:
- The database estimates cost for multiple plans
- Cost is based on:
1. Data size
2. Index selectivity
3. Statistics
- The lowest-cost plan is chosen
EXPLAIN shows you the plan chosen by the optimizer.
Common Optimization Signals to Watch For
When reading execution plans, look for:
- Unexpected full table scans
- Missing or unused indexes
- Very high estimated row counts
- Complex nested loops on large tables
These often indicate performance problems.
EXPLAIN vs EXPLAIN ANALYZE
Some databases support:
- EXPLAIN - estimated plan
- EXPLAIN ANALYZE - actual execution statistics
EXPLAIN ANALYZE:
- Runs the query
- Shows real execution time
- Helps validate optimizer decisions
This is useful for deep performance tuning.
Two Minute Drill
- Query optimization improves performance
- Execution plan shows how a query runs
- EXPLAIN displays the execution plan
- Full table scans are slow
- Index scans are faster
- Optimizer chooses the lowest cost plan
- EXPLAIN is essential for debugging slow queries