Deployment of DL Models
After training a model, you need to deploy it for inference. This chapter covers exporting models to the portable ONNX format and serving them with TorchServe, TensorFlow Serving, or a custom FastAPI service.
Option 1: Export to ONNX
ONNX (Open Neural Network Exchange) allows models to be used across frameworks.
import torch

# `model` is the trained network; the dummy input must match the
# input shape the model expects (here, a single 28x28 grayscale image).
dummy_input = torch.randn(1, 1, 28, 28)
torch.onnx.export(model, dummy_input, "model.onnx", verbose=True)

The exported model can then be loaded with ONNX Runtime for inference from almost any language.

Option 2: TorchServe (PyTorch)
TorchServe provides a REST API for serving PyTorch models.
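Before packaging and starting the server, note that the inference endpoint follows a fixed URL scheme: `/predictions/<model-name>` on port 8080 by default. A stdlib-only sketch of the client side, assuming a TorchServe instance is already running locally with a model registered as `my_model`:

```python
import urllib.request

def prediction_url(model_name: str, host: str = "localhost", port: int = 8080) -> str:
    """Build the TorchServe inference URL for a registered model."""
    return f"http://{host}:{port}/predictions/{model_name}"

def classify(image_path: str, model_name: str = "my_model") -> str:
    """POST raw image bytes to TorchServe and return the response body."""
    with open(image_path, "rb") as f:
        req = urllib.request.Request(
            prediction_url(model_name), data=f.read(), method="POST"
        )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode()

if __name__ == "__main__":
    # Requires a running TorchServe instance; "kitten.jpg" is a placeholder.
    print(classify("kitten.jpg"))
```

The steps to package and launch that server are shown next.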
First, package the trained weights and handler into a model archive (.mar) file:

torch-model-archiver --model-name my_model --version 1.0 --model-file model.py --serialized-file model.pth --handler image_classifier
mkdir model_store
cp my_model.mar model_store/
torchserve --start --model-store model_store --models my_model=my_model.mar

Then send POST requests to `http://localhost:8080/predictions/my_model`.

Option 3: TensorFlow Serving
Save a TensorFlow model in SavedModel format and serve.
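Once the container below is running, TensorFlow Serving exposes the model over REST at `/v1/models/<name>:predict` on port 8501 and expects a JSON body with an "instances" key. A sketch of building and parsing such a request (the model name and input values are placeholders):

```python
import json

def predict_request(instances) -> str:
    """Serialize a batch of inputs into TF Serving's REST request body."""
    return json.dumps({"instances": instances})

def parse_predictions(response_body: str):
    """Extract the predictions list from a TF Serving REST response."""
    return json.loads(response_body)["predictions"]

url = "http://localhost:8501/v1/models/my_model:predict"
body = predict_request([[1.0, 2.0, 3.0]])
# POST `body` to `url` with any HTTP client; the response is a JSON
# object of the form {"predictions": [...]}.
print(body)  # {"instances": [[1.0, 2.0, 3.0]]}
```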
# TF Serving looks for numbered version subdirectories under each model directory
model.save('saved_model/my_model/1')

Run the TensorFlow Serving container, mounting the model directory:

docker run -p 8501:8501 --mount type=bind,source=/path/to/saved_model/my_model,target=/models/my_model -e MODEL_NAME=my_model -t tensorflow/serving

Option 4: FastAPI + PyTorch (Custom)
Build a lightweight REST API with FastAPI.
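The service defined below speaks plain JSON over HTTP, so any client works. A stdlib-only sketch of the client side (the "image" payload key is an assumption, and port 8000 is uvicorn's default):

```python
import json
import urllib.request

def build_payload(pixels) -> bytes:
    """Encode the request body; the "image" key is a hypothetical convention."""
    return json.dumps({"image": pixels}).encode()

def predict(pixels, url="http://localhost:8000/predict"):
    """POST the payload to the running FastAPI app and return the parsed reply."""
    req = urllib.request.Request(
        url,
        data=build_payload(pixels),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

if __name__ == "__main__":
    # Requires the server below to be running, e.g. via `uvicorn main:app`.
    print(predict([0.0] * 784))
```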
from fastapi import FastAPI
import torch

app = FastAPI()
model = torch.load('model.pth')  # assumes the full model was saved with torch.save(model, ...)
model.eval()                     # switch to inference mode

def preprocess(data: dict) -> torch.Tensor:
    # Assumes the client sends {"image": [...]}; reshape into the model's input.
    return torch.tensor(data["image"], dtype=torch.float32).reshape(1, 1, 28, 28)

@app.post("/predict")
def predict(data: dict):
    input_tensor = preprocess(data)
    with torch.no_grad():  # no gradient tracking needed at inference time
        output = model(input_tensor)
    return {"prediction": output.tolist()}

Run the app with uvicorn (e.g. `uvicorn main:app` if the file is named main.py) and send POST requests to `http://localhost:8000/predict`.

Two Minute Drill
- Export to ONNX for cross-framework inference.
- TorchServe serves PyTorch models behind a REST API.
- TensorFlow Serving serves TF SavedModels, typically via Docker.
- Build a custom API with FastAPI for simple deployments.
Need more clarification?
Drop us an email at career@quipoinfotech.com
