Learning Rate Schedulers
Learning rate schedulers adjust the learning rate during training. Starting with a higher LR and decreasing it over time often helps the model converge to a better minimum.
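As a minimal sketch of how a scheduler plugs into a training loop (the toy model, optimizer, and StepLR settings below are illustrative):

```python
import torch

# Toy model and optimizer; the scheduler wraps the optimizer and
# rescales its learning rate each time scheduler.step() is called.
model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.5)

for epoch in range(90):
    # ... forward pass, loss.backward(), optimizer.step() would go here ...
    scheduler.step()  # advance the schedule once per epoch

# LR was halved at epochs 30, 60, and 90: 0.1 -> 0.05 -> 0.025 -> 0.0125
print(scheduler.get_last_lr()[0])
```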
Step Decay
Reduce learning rate by a factor every fixed number of epochs. Example: reduce by 0.5 every 30 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.5)

Exponential Decay
LR = initial_lr * gamma^(epoch). Smoother than step decay.
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.95)

Reduce on Plateau
Reduce LR when validation loss stops improving. This scheduler is adaptive rather than following a fixed schedule; pass the monitored metric when stepping, e.g. scheduler.step(val_loss).
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min', factor=0.5, patience=5)

Cosine Annealing
Decays the LR along a cosine curve, from the initial value down to a minimum over T_max epochs. For a cyclical schedule with periodic restarts, use CosineAnnealingWarmRestarts.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=50)

One Cycle Policy
Increases the LR first, then decreases it over a single cycle; popularized by fast.ai and used in many state-of-the-art training recipes.
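A sketch of this policy using PyTorch's OneCycleLR (the max_lr, total_steps, and pct_start values below are illustrative; note that OneCycleLR is typically stepped once per batch, so total_steps = epochs * batches_per_epoch):

```python
import torch

model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Warm up for the first 30% of steps, then anneal back down.
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=0.1, total_steps=100, pct_start=0.3
)

lrs = []
for _ in range(100):
    lrs.append(scheduler.get_last_lr()[0])
    scheduler.step()

# The LR starts low (max_lr / div_factor), rises to max_lr around
# step 30, then anneals to a very small value by the end.
print(lrs[0], max(lrs), lrs[-1])
```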
Two Minute Drill
- Learning rate schedulers improve convergence.
- Step decay: reduce at fixed intervals.
- Reduce on plateau: reduce when validation loss stalls.
- Cosine annealing: cosine-shaped decay (cyclical when combined with warm restarts).
Need more clarification?
Drop us an email at career@quipoinfotech.com
