Project: Production RAG API

In this project, you will expose your RAG system as a production‑ready API using FastAPI and Docker. The API will accept a PDF upload, index it, and answer questions via a REST endpoint.

Project 3: Production RAG API with FastAPI + Docker.

Step 1: Create the FastAPI App

Create `app.py`:

from fastapi import FastAPI, File, HTTPException, UploadFile
# These import paths match older LangChain releases; newer releases move
# them to the langchain_community and langchain_openai packages.
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
import os
import tempfile

app = FastAPI()

# Shared in-memory index, set by /upload and read by /ask.
# Demo only: it is lost on restart and shared by every client.
vectorstore = None

@app.post("/upload")
async def upload_pdf(file: UploadFile = File(...)):
    global vectorstore
    # PyPDFLoader needs a file path, so write the upload to a temporary file.
    with tempfile.NamedTemporaryFile(delete=False, suffix=".pdf") as tmp:
        tmp.write(await file.read())
        tmp_path = tmp.name
    try:
        docs = PyPDFLoader(tmp_path).load()
    finally:
        # Remove the temporary file even if parsing fails.
        os.remove(tmp_path)
    splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
    chunks = splitter.split_documents(docs)
    embeddings = OpenAIEmbeddings()
    vectorstore = Chroma.from_documents(chunks, embeddings)
    return {"message": "PDF indexed successfully", "chunks": len(chunks)}

@app.post("/ask")
def ask(question: str):
    # `question` is read from the query string: POST /ask?question=...
    # In a real app, you'd persist the vectorstore per session.
    if vectorstore is None:
        raise HTTPException(status_code=400, detail="Upload a PDF first.")
    llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
    qa = RetrievalQA.from_chain_type(llm, retriever=vectorstore.as_retriever())
    answer = qa.run(question)
    return {"answer": answer}
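The upload handler splits the PDF with `chunk_size=500` and `chunk_overlap=50`. To see what those two numbers mean, here is a simplified sketch of overlapping chunking. It is not how `RecursiveCharacterTextSplitter` works internally (that class also prefers to break on separators such as paragraphs and sentences); `chunk_text` is a hypothetical helper that only illustrates the sliding window.

```python
def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    """Split text into fixed-size windows; neighbours share `chunk_overlap` characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

With the defaults, each chunk starts 450 characters after the previous one, so the last 50 characters of one chunk reappear at the start of the next. The overlap helps a sentence that straddles a boundary stay retrievable from at least one chunk.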

Step 2: Create a Dockerfile

FROM python:3.10-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
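The Dockerfile copies a `requirements.txt` that is not shown above. A plausible starting point, based on the imports in `app.py`, might look like the list below. The package set is an assumption rather than the original project's file: `python-multipart` is what FastAPI needs to parse file uploads, `pypdf` backs `PyPDFLoader`, `chromadb` backs `Chroma`, and `tiktoken` is used by `OpenAIEmbeddings`. No versions are pinned here because the original does not state them; since `app.py` uses the older LangChain import paths, you may need to pin `langchain` (and a matching `openai`) to a release that still provides them.

```text
fastapi
uvicorn
python-multipart
langchain
openai
chromadb
pypdf
tiktoken
```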

Step 3: Build and Run

docker build -t rag-api .
docker run -p 8000:8000 -e OPENAI_API_KEY=your-key rag-api
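The `-e OPENAI_API_KEY=...` flag passes the key into the container as an environment variable. If it is forgotten, the failure otherwise only shows up on the first request that calls OpenAI. One option is to check for it when the app starts. The sketch below uses a hypothetical helper, `require_env`, that is not part of the original `app.py`.

```python
import os

def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is missing or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

Calling `require_env("OPENAI_API_KEY")` near the top of `app.py` would make the container exit immediately with a clear message instead of failing later inside a request.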

Step 4: Test the API

Use `curl` or Postman:

curl -X POST -F "file=@doc.pdf" http://localhost:8000/upload
curl -X POST "http://localhost:8000/ask?question=What+is+RAG%3F"
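The same `/ask` call can be made from Python. Because the route in `app.py` is registered with `@app.post` and reads `question` from the query string, the request has to be a POST with a URL-encoded question. This is a minimal sketch using only the standard library; `build_ask_url` and `ask` are hypothetical helper names, and the base URL assumes the container is running locally on port 8000.

```python
import json
import urllib.parse
import urllib.request

def build_ask_url(base_url: str, question: str) -> str:
    """Build the /ask URL with the question safely encoded as a query parameter."""
    query = urllib.parse.urlencode({"question": question})
    return f"{base_url.rstrip('/')}/ask?{query}"

def ask(base_url: str, question: str) -> str:
    """POST the question to the running API and return the answer text."""
    request = urllib.request.Request(
        build_ask_url(base_url, question),
        data=b"",       # empty body; the question travels in the query string
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["answer"]
```

For example, `ask("http://localhost:8000", "What is RAG?")` would return the `answer` field once a PDF has been uploaded.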

What You Learned

Building a REST API with FastAPI.
Dockerising a RAG application.
Handling file uploads and temporary storage.
Exposing RAG as a service.

Two Minute Drill

FastAPI for API endpoints.
Docker for containerisation.
Use `curl` or Postman to test.
Production RAG requires session management or a persistent vector store.
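To make the last point concrete, here is one possible shape for session management: a small registry that maps a session ID to that session's vector store, so each uploaded PDF gets its own index. `SessionStores` is a hypothetical class, not part of the original project, and it keeps everything in process memory, so it is only a sketch of the idea.

```python
import uuid
from typing import Any, Dict

class SessionStores:
    """In-memory registry mapping a session ID to that session's vector store."""

    def __init__(self) -> None:
        self._stores: Dict[str, Any] = {}

    def create(self, store: Any) -> str:
        """Register a store and return a new random session ID for it."""
        session_id = uuid.uuid4().hex
        self._stores[session_id] = store
        return session_id

    def get(self, session_id: str) -> Any:
        """Return the store for a session, or raise KeyError if the ID is unknown."""
        try:
            return self._stores[session_id]
        except KeyError:
            raise KeyError(f"Unknown session: {session_id}") from None
```

In this design, `/upload` would call `create(...)` and return the session ID to the client, and `/ask` would accept that ID and call `get(...)` before building the retriever. The registry is still lost on restart and is not shared between worker processes, which is where a persistent vector store comes in.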

Need more clarification?

Drop us an email at career@quipoinfotech.com
