Deployment of DL Models

After training a model, you need to deploy it for inference. This chapter covers exporting models to a portable format (ONNX) and serving them with TorchServe, TensorFlow Serving, or a custom FastAPI service.

Option 1: Export to ONNX

ONNX (Open Neural Network Exchange) allows models to be used across frameworks.
import torch

# `model` is the trained network from the previous chapters
model.eval()  # switch to inference mode before exporting
dummy_input = torch.randn(1, 1, 28, 28)  # example input matching the model's expected shape
torch.onnx.export(model, dummy_input, "model.onnx", verbose=True)
The exported model can then be loaded with ONNX Runtime for inference in any language it supports.

Option 2: TorchServe (PyTorch)

TorchServe provides a REST API for serving PyTorch models.
# Package the trained model into a .mar archive
torch-model-archiver --model-name my_model --version 1.0 --model-file model.py --serialized-file model.pth --handler image_classifier

# Place the archive in a model store and start the server
mkdir model_store
cp my_model.mar model_store/
torchserve --start --model-store model_store --models my_model=my_model.mar
Then send POST requests to `http://localhost:8080/predictions/my_model`.
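For example, with the image_classifier handler, an image file can be posted with curl (the file name here is illustrative):

```shell
curl -X POST http://localhost:8080/predictions/my_model -T sample.jpg
```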

Option 3: TensorFlow Serving

Save a TensorFlow model in the SavedModel format and serve it with TensorFlow Serving.
model.save('saved_model/my_model')  # writes a SavedModel directory
Then run the TensorFlow Serving Docker container, mounting the saved model:
docker run -p 8501:8501 --mount type=bind,source=/path/to/saved_model,target=/models/my_model -e MODEL_NAME=my_model -t tensorflow/serving
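The container exposes a REST endpoint on port 8501; a prediction request looks like the following (the instance values are illustrative and must match the model's input shape):

```shell
curl -d '{"instances": [[1.0, 2.0, 3.0, 4.0]]}' \
  -X POST http://localhost:8501/v1/models/my_model:predict
```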

Option 4: FastAPI + PyTorch (Custom)

Build a lightweight REST API with FastAPI.
from fastapi import FastAPI
import torch

app = FastAPI()
model = torch.load('model.pth')  # assumes the full model object was saved
model.eval()  # inference mode

@app.post("/predict")
def predict(data: dict):
    input_tensor = preprocess(data)  # preprocess() converts the JSON payload into a tensor
    with torch.no_grad():
        output = model(input_tensor)
    return {"prediction": output.tolist()}


Two Minute Drill
  • Export to ONNX for cross‑framework inference.
  • TorchServe serves PyTorch models behind a REST API.
  • TensorFlow Serving for TF models.
  • Build custom API with FastAPI for simple deployments.
