Project: Production RAG API

In this project, you will expose your RAG system as a production-ready API using FastAPI and Docker. The API has two REST endpoints: `/upload` accepts a PDF and indexes it, and `/ask` answers questions about the indexed document.

Project 3: Production RAG API with FastAPI + Docker.

Step 1: Create the FastAPI App

Create `app.py`:
from fastapi import FastAPI, File, HTTPException, UploadFile
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
import os
import tempfile

app = FastAPI()

# Demo only: a single in-memory vector store shared by all requests.
vectorstore = None

@app.post("/upload")
async def upload_pdf(file: UploadFile = File(...)):
    global vectorstore
    # PyPDFLoader reads from a file path, so save the upload to a temporary file first.
    with tempfile.NamedTemporaryFile(delete=False, suffix=".pdf") as tmp:
        tmp.write(await file.read())
        tmp_path = tmp.name
    try:
        loader = PyPDFLoader(tmp_path)
        docs = loader.load()
    finally:
        # Remove the temporary file even if parsing fails.
        os.remove(tmp_path)
    # Split pages into overlapping chunks, embed them, and index them in Chroma.
    splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
    chunks = splitter.split_documents(docs)
    embeddings = OpenAIEmbeddings()
    vectorstore = Chroma.from_documents(chunks, embeddings)
    return {"message": "PDF indexed successfully", "chunks": len(chunks)}

@app.post("/ask")
def ask(question: str):
    # In a real app, you'd persist the vectorstore per session.
    # For this demo, /upload sets one global vectorstore.
    if vectorstore is None:
        raise HTTPException(status_code=400, detail="No PDF indexed yet. Call /upload first.")
    llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
    qa = RetrievalQA.from_chain_type(llm, retriever=vectorstore.as_retriever())
    answer = qa.run(question)
    return {"answer": answer}
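
The `chunk_size=500, chunk_overlap=50` settings mean each chunk holds up to 500 characters and shares up to 50 characters with its neighbour, so a sentence cut at a chunk boundary still appears whole in one of them. The sketch below illustrates that idea with a plain fixed-size sliding window. It is a simplification, not what `RecursiveCharacterTextSplitter` does internally (the library prefers to split on separators such as paragraph and line breaks), and the function name `sliding_window_chunks` is hypothetical.

```python
from typing import List


def sliding_window_chunks(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> List[str]:
    """Split text into fixed-size windows where consecutive windows overlap.

    Simplified illustration only: real splitters also try to break on
    natural boundaries instead of at arbitrary character offsets.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

With the defaults, each window starts 450 characters after the previous one, so a larger overlap produces more chunks (and more embedding calls) for the same document.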

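Step 2: Create a Dockerfile

The Dockerfile below copies a `requirements.txt`, which you need to create next to `app.py`. At a minimum it likely needs `fastapi`, `uvicorn`, `langchain`, `openai`, `chromadb`, `pypdf`, and `python-multipart` (FastAPI requires the last one to parse file uploads). The `langchain.*` import paths used in `app.py` come from older LangChain releases, so pin a version that still provides them.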

FROM python:3.10-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]

Step 3: Build and Run

docker build -t rag-api .
docker run -p 8000:8000 -e OPENAI_API_KEY=your-key rag-api
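
The `-e OPENAI_API_KEY=your-key` flag sets an environment variable inside the container, which the OpenAI client reads at request time. If the flag is forgotten, the first `/upload` call fails with an authentication error. A minimal sketch of a fail-fast check, assuming you call it near the top of `app.py` (the helper name `require_env` is hypothetical):

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; "
            f"pass it with `docker run -e {name}=...`."
        )
    return value


# Hypothetical usage at application startup:
# OPENAI_API_KEY = require_env("OPENAI_API_KEY")
```

Failing at startup makes a misconfigured container exit immediately instead of appearing healthy until the first request.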

Step 4: Test the API

Use `curl` or Postman:
curl -X POST -F "file=@doc.pdf" http://localhost:8000/upload
curl -X POST "http://localhost:8000/ask?question=What+is+RAG%3F"
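
Because `/ask` is declared with `@app.post` and takes `question` as a query parameter, a client has to send a POST request with the question URL-encoded in the query string. The sketch below shows one way to do that from Python using only the standard library. The helper names `build_ask_request` and `ask_question` are hypothetical, and the base URL assumes the container from Step 3 is running locally.

```python
import json
import urllib.parse
import urllib.request


def build_ask_request(base_url: str, question: str) -> urllib.request.Request:
    """Build a POST request for /ask with the question encoded as a query parameter."""
    query = urllib.parse.urlencode({"question": question})
    return urllib.request.Request(f"{base_url}/ask?{query}", method="POST")


def ask_question(base_url: str, question: str) -> str:
    """Send the request and return the "answer" field of the JSON response."""
    with urllib.request.urlopen(build_ask_request(base_url, question)) as resp:
        return json.load(resp)["answer"]


# Hypothetical usage, with the API running on localhost:
# print(ask_question("http://localhost:8000", "What is RAG?"))
```

`urlencode` turns spaces into `+` and escapes characters such as `?`, which avoids ambiguous URLs when the question contains punctuation.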

What You Learned

  • Building a REST API with FastAPI.
  • Dockerising a RAG application.
  • Handling file uploads and temporary storage.
  • Exposing RAG as a service.
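
The temporary-storage pattern from the upload handler can be packaged as a context manager so that cleanup happens even when parsing raises. This is a minimal sketch using only the standard library; the name `temp_copy` is hypothetical and not part of FastAPI or LangChain.

```python
import os
import tempfile
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def temp_copy(data: bytes, suffix: str = ".pdf") -> Iterator[str]:
    """Write bytes to a temporary file, yield its path, and delete it afterwards."""
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
        tmp.write(data)
        path = tmp.name
    try:
        yield path
    finally:
        # Runs on normal exit and when the body of the with-block raises.
        os.remove(path)


# Hypothetical usage inside the upload handler:
# with temp_copy(await file.read()) as tmp_path:
#     docs = PyPDFLoader(tmp_path).load()
```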


Two Minute Drill
  • FastAPI for API endpoints.
  • Docker for containerisation.
  • Use `curl` or Postman to test.
  • Production RAG requires session management or a persistent vector store.
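
One way to approach session management is to keep a separate vector store per upload and hand the client an identifier to send back with each question. The sketch below is a minimal in-memory registry under that assumption. The class `SessionStores` and its methods are hypothetical, the stored objects can be any retriever-capable store, and everything is lost when the process restarts, so it does not replace a persistent vector store.

```python
import threading
import uuid
from typing import Any, Dict


class SessionStores:
    """Thread-safe, in-memory map from session id to a per-session vector store."""

    def __init__(self) -> None:
        self._stores: Dict[str, Any] = {}
        self._lock = threading.Lock()

    def create(self, store: Any) -> str:
        """Register a store and return a new random session id for it."""
        session_id = uuid.uuid4().hex
        with self._lock:
            self._stores[session_id] = store
        return session_id

    def get(self, session_id: str) -> Any:
        """Return the store for a session, or raise KeyError if it is unknown."""
        with self._lock:
            if session_id not in self._stores:
                raise KeyError(f"unknown session: {session_id}")
            return self._stores[session_id]

    def drop(self, session_id: str) -> None:
        """Forget a session; dropping an unknown id is a no-op."""
        with self._lock:
            self._stores.pop(session_id, None)
```

In this design `/upload` would return the id from `create`, and `/ask` would accept it as an extra parameter and look up the matching store with `get`.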
