Deployment of DL Models
After training a model, you need to deploy it for inference. This chapter covers exporting models to the portable ONNX format and serving them with TorchServe, TensorFlow Serving, or a custom FastAPI service.
Option 1: Export to ONNX
ONNX (Open Neural Network Exchange) allows models to be used across frameworks.
import torch

# `model` is the trained network; the dummy input must match the
# input shape the model expects (here, a single 28x28 grayscale image).
dummy_input = torch.randn(1, 1, 28, 28)
torch.onnx.export(model, dummy_input, "model.onnx", verbose=True)

The exported model can then be loaded with ONNX Runtime for inference from almost any language.

Option 2: TorchServe (PyTorch)
TorchServe provides a REST API for serving PyTorch models.
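Before packaging and starting the server, note that the inference endpoint follows a fixed URL scheme: `/predictions/<model-name>` on port 8080 by default. A stdlib-only sketch of the client side, assuming a TorchServe instance is already running locally with a model registered as `my_model`:

```python
import urllib.request

def prediction_url(model_name: str, host: str = "localhost", port: int = 8080) -> str:
    """Build the TorchServe inference URL for a registered model."""
    return f"http://{host}:{port}/predictions/{model_name}"

def classify(image_path: str, model_name: str = "my_model") -> str:
    """POST raw image bytes to TorchServe and return the response body."""
    with open(image_path, "rb") as f:
        req = urllib.request.Request(
            prediction_url(model_name), data=f.read(), method="POST"
        )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode()

if __name__ == "__main__":
    # Requires a running TorchServe instance; "kitten.jpg" is a placeholder.
    print(classify("kitten.jpg"))
```

The steps to package and launch that server are shown next.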
First, package the trained weights and handler into a model archive (.mar) file:

torch-model-archiver --model-name my_model --version 1.0 --model-file model.py --serialized-file model.pth --handler image_classifier
mkdir model_store
cp my_model.mar model_store/
torchserve --start --model-store model_store --models my_model=my_model.mar

Then send POST requests to `http://localhost:8080/predictions/my_model`.

Option 3: TensorFlow Serving
Save a TensorFlow model in SavedModel format and serve.
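Once the container below is running, TensorFlow Serving exposes the model over REST at `/v1/models/<name>:predict` on port 8501 and expects a JSON body with an "instances" key. A sketch of building and parsing such a request (the model name and input values are placeholders):

```python
import json

def predict_request(instances) -> str:
    """Serialize a batch of inputs into TF Serving's REST request body."""
    return json.dumps({"instances": instances})

def parse_predictions(response_body: str):
    """Extract the predictions list from a TF Serving REST response."""
    return json.loads(response_body)["predictions"]

url = "http://localhost:8501/v1/models/my_model:predict"
body = predict_request([[1.0, 2.0, 3.0]])
# POST `body` to `url` with any HTTP client; the response is a JSON
# object of the form {"predictions": [...]}.
print(body)  # {"instances": [[1.0, 2.0, 3.0]]}
```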
# TF Serving looks for numbered version subdirectories under each model directory
model.save('saved_model/my_model/1')

Run the TensorFlow Serving container, mounting the model directory:

docker run -p 8501:8501 --mount type=bind,source=/path/to/saved_model/my_model,target=/models/my_model -e MODEL_NAME=my_model -t tensorflow/serving

Option 4: FastAPI + PyTorch (Custom)
Build a lightweight REST API with FastAPI.
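The service defined below speaks plain JSON over HTTP, so any client works. A stdlib-only sketch of the client side (the "image" payload key is an assumption, and port 8000 is uvicorn's default):

```python
import json
import urllib.request

def build_payload(pixels) -> bytes:
    """Encode the request body; the "image" key is a hypothetical convention."""
    return json.dumps({"image": pixels}).encode()

def predict(pixels, url="http://localhost:8000/predict"):
    """POST the payload to the running FastAPI app and return the parsed reply."""
    req = urllib.request.Request(
        url,
        data=build_payload(pixels),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

if __name__ == "__main__":
    # Requires the server below to be running, e.g. via `uvicorn main:app`.
    print(predict([0.0] * 784))
```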
from fastapi import FastAPI
import torch

app = FastAPI()
model = torch.load('model.pth')  # assumes the full model was saved with torch.save(model, ...)
model.eval()                     # switch to inference mode

def preprocess(data: dict) -> torch.Tensor:
    # Assumes the client sends {"image": [...]}; reshape into the model's input.
    return torch.tensor(data["image"], dtype=torch.float32).reshape(1, 1, 28, 28)

@app.post("/predict")
def predict(data: dict):
    input_tensor = preprocess(data)
    with torch.no_grad():  # no gradient tracking needed at inference time
        output = model(input_tensor)
    return {"prediction": output.tolist()}

Run the app with uvicorn (e.g. `uvicorn main:app` if the file is named main.py) and send POST requests to `http://localhost:8000/predict`.

Two Minute Drill
- Export to ONNX for cross-framework inference.
- TorchServe serves PyTorch models behind a REST API.
- TensorFlow Serving serves TF SavedModels, typically via Docker.
- Build a custom API with FastAPI for simple deployments.
Need more clarification?
Drop us an email at career@quipoinfotech.com
