Model Persistence
After training a model, you typically want to save it to disk so you can load it later without retraining. This is called model persistence. The two most common tools in Python are the built-in `pickle` module and the `joblib` library.
Using `joblib` (Recommended for scikit‑learn)
import joblib
# Save model
joblib.dump(model, 'model.joblib')
# Load model
loaded_model = joblib.load('model.joblib')
# Predict with loaded model
predictions = loaded_model.predict(X_test)
Using `pickle` (Python built‑in)
import pickle
# Save model
with open('model.pkl', 'wb') as f:
    pickle.dump(model, f)
# Load model
with open('model.pkl', 'rb') as f:
    loaded_model = pickle.load(f)
Why Save Models?
- Avoid retraining (time‑consuming).
- Deploy models in production applications.
- Share models with team members.
- Keep a baseline for future comparison.
Best Practices
- Save preprocessors (scalers, encoders) together with the model.
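- Prefer `joblib` for models that hold large NumPy arrays; it stores them more efficiently than plain `pickle`.
- Avoid committing model files to version control, since they are usually large. Track them with DVC or keep them in cloud storage instead.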
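One way to keep preprocessors and the model together is to wrap them in a single scikit‑learn pipeline and persist that one object. The sketch below is only an illustration under assumptions: the Iris dataset, the `StandardScaler` plus `LogisticRegression` combination, and the file name `iris_pipeline.joblib` are all hypothetical choices, not part of the original example.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Illustrative data; any training set could stand in here.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The pipeline bundles the scaler and the classifier into one object,
# so a single dump/load call covers both.
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
pipeline.fit(X_train, y_train)

# Save to a temporary directory (hypothetical file name).
model_path = os.path.join(tempfile.mkdtemp(), "iris_pipeline.joblib")
joblib.dump(pipeline, model_path)

# Reload and check that the restored pipeline predicts like the original.
loaded_pipeline = joblib.load(model_path)
same_predictions = np.array_equal(
    pipeline.predict(X_test), loaded_pipeline.predict(X_test)
)
print("Predictions match after reload:", same_predictions)
```

Because the scaler travels inside the pipeline, the loaded object can be given raw features directly, and there is no separate scaler file that could drift out of sync with the model.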
Two Minute Drill
- Save models with `joblib.dump()` or `pickle.dump()`.
- Load with `joblib.load()` or `pickle.load()`.
- `joblib` is preferred for scikit‑learn models.
- Save preprocessors together with the model.
