Deploying Agentic RAG to Production – Part 2
Building the Search API and Unified Gateway to Power End-to-End Agentic Workflows
I’ve just published the second installment of my Deploying Agentic RAG to Production tutorial series.
This part focuses on building the Search API for Agentic RAG using FastAPI, designed to orchestrate an AI agent powered by LangGraph. The agent coordinates multiple tools to retrieve the best possible answers, forming the core of an agentic RAG pipeline. I also unified this Search API with my previous Ingestion API under a single FastAPI gateway - creating a clean, production-ready interface.
You’ll find all the code and setup instructions in the GitHub repo:
🔗 https://github.com/shaikhq/agentic-rag-db2
To accompany the second installment of the tutorial series, I’ve recorded two short videos:
Implementation Walkthrough – A guided tour of the repo and how the APIs are structured
Live Demo – A hands-on demonstration of calling the APIs using simple curl commands
Both videos are included below.
This builds directly on the first part of the series, where I focused on turning the document ingestion logic into a standalone API. If you missed that, you can find it in the GitHub repo as well.
First Video: Deploy Agentic RAG to Production, Part 2 - Build the Search API Endpoint - Code Walkthrough
Second Video: Agentic RAG Search and Gateway APIs Demo
In the following video, I run a demo of the search and gateway APIs using simple curl commands across three scenarios:
Ingesting new knowledge into the vector store using the ingestion API
Querying the Search API to retrieve the best possible answer through the LangGraph agent
Clearing the vector store using the cleanup API, giving the system a clean slate
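The three scenarios above can be sketched as curl calls like the following. The endpoint paths (`/ingest`, `/search`, `/cleanup`), port, and JSON fields are illustrative assumptions; consult the repo for the exact API.

```shell
# 1. Ingest new knowledge into the vector store (hypothetical /ingest route).
curl -X POST http://localhost:8000/ingest \
  -H "Content-Type: application/json" \
  -d '{"text": "Db2 supports vector search."}'

# 2. Query the Search API; the LangGraph agent picks the right tools
#    to retrieve the best possible answer (hypothetical /search route).
curl -X POST http://localhost:8000/search \
  -H "Content-Type: application/json" \
  -d '{"query": "Does Db2 support vector search?"}'

# 3. Clear the vector store for a clean slate (hypothetical /cleanup route).
curl -X DELETE http://localhost:8000/cleanup
```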
These APIs form the foundation for building an Agentic RAG application on top of the ingestion and retrieval pipeline.
If you missed the first installment of my Deploying Agentic RAG to Production tutorial series, here’s the link:
Deploying an AI Agent in Production: FastAPI Data Ingestion