Project: Production RAG API
In this project, you will expose your RAG system as a production‑ready API using FastAPI and Docker. The API will accept a PDF upload, index it, and answer questions via a REST endpoint.
Step 1: Create the FastAPI App
Create `app.py`:
from fastapi import FastAPI, File, HTTPException, UploadFile
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
import os
import tempfile

app = FastAPI()

# Module-level vector store shared by both endpoints; set by /upload.
vectorstore = None


@app.post("/upload")
async def upload_pdf(file: UploadFile = File(...)):
    global vectorstore
    # PyPDFLoader needs a file path, so write the upload to a temporary file.
    with tempfile.NamedTemporaryFile(delete=False, suffix=".pdf") as tmp:
        tmp.write(await file.read())
        tmp_path = tmp.name
    try:
        loader = PyPDFLoader(tmp_path)
        docs = loader.load()
    finally:
        # Delete the temporary file even if parsing fails.
        os.remove(tmp_path)
    splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
    chunks = splitter.split_documents(docs)
    embeddings = OpenAIEmbeddings()
    vectorstore = Chroma.from_documents(chunks, embeddings)
    return {"message": "PDF indexed successfully", "chunks": len(chunks)}


@app.post("/ask")
def ask(question: str):
    # In a real app, you'd persist the vectorstore per session.
    # For this demo, we use the global vectorstore set by /upload.
    if vectorstore is None:
        raise HTTPException(status_code=400, detail="Upload a PDF first.")
    llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
    qa = RetrievalQA.from_chain_type(llm, retriever=vectorstore.as_retriever())
    answer = qa.run(question)
    return {"answer": answer}
Step 2: Create a Dockerfile
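Create a `Dockerfile` next to `app.py`. It expects a `requirements.txt` in the same folder listing the packages the app imports; for the code above that typically means fastapi, uvicorn, python-multipart (which FastAPI needs for file uploads), langchain, openai, chromadb, pypdf and tiktoken.
FROM python:3.10-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
Step 3: Build and Run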
docker build -t rag-api .
docker run -p 8000:8000 -e OPENAI_API_KEY=your-key rag-api
Step 4: Test the API
Use `curl` or Postman:
curl -X POST -F "file=@doc.pdf" http://localhost:8000/upload
curl -X POST "http://localhost:8000/ask?question=What+is+RAG?"
What You Learned
- Building a REST API with FastAPI.
- Dockerising a RAG application.
- Handling file uploads and temporary storage.
- Exposing RAG as a service.
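The temporary-storage step can be isolated into a small helper. The following is a minimal sketch using only the standard library; the name `temp_pdf` is hypothetical and not part of the project code. It shows one way to make sure the temporary file is deleted even when PDF parsing raises an error.

```python
import os
import tempfile
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def temp_pdf(data: bytes, suffix: str = ".pdf") -> Iterator[str]:
    """Write `data` to a temporary file, yield its path, and always delete it.

    Hypothetical helper for illustration; not part of the tutorial's app.py.
    """
    # delete=False lets other code reopen the file by path after we close it.
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
        tmp.write(data)
        path = tmp.name
    try:
        yield path
    finally:
        # Runs whether or not the caller's block raised an exception.
        os.remove(path)


# Usage: the file exists inside the block and is removed afterwards.
with temp_pdf(b"%PDF-1.4 demo bytes") as pdf_path:
    print("inside block, file exists:", os.path.exists(pdf_path))
print("after block, file exists:", os.path.exists(pdf_path))
```

Inside an upload handler, the body of the `with` block would be where the PDF loader reads the file, so cleanup no longer depends on every code path remembering to call `os.remove`.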
Two Minute Drill
- FastAPI for API endpoints.
- Docker for containerisation.
- Use `curl` or Postman to test.
- Production RAG requires session management or a persistent vector store.
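As a rough illustration of the session-management point, the sketch below keeps one vector store per session in memory. The class name `SessionStores` and its methods are hypothetical, and the stores are treated as opaque objects, so this is an outline of the idea rather than a finished design.

```python
import uuid
from typing import Any, Dict, Optional


class SessionStores:
    """In-memory map from session ID to that session's vector store.

    Hypothetical helper for illustration; contents are lost on restart.
    """

    def __init__(self) -> None:
        self._stores: Dict[str, Any] = {}

    def create(self, store: Any) -> str:
        """Register a store and return a new random session ID."""
        session_id = uuid.uuid4().hex
        self._stores[session_id] = store
        return session_id

    def get(self, session_id: str) -> Optional[Any]:
        """Return the store for this session, or None if the ID is unknown."""
        return self._stores.get(session_id)

    def drop(self, session_id: str) -> bool:
        """Forget a session; return True if it existed."""
        existed = session_id in self._stores
        self._stores.pop(session_id, None)
        return existed


# Usage with a placeholder object standing in for a real vector store.
registry = SessionStores()
sid = registry.create("placeholder vector store")
print(registry.get(sid))
```

Under this approach, the upload endpoint would return the session ID and the ask endpoint would accept it as an extra parameter, so two users no longer overwrite each other's index. Because the registry lives in process memory, surviving a container restart would still need a store persisted to disk, for example Chroma created with a `persist_directory`.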
