Build a RAG API with FastAPI

Project Link: View Project

Author: Ruth Kiarie
Email: wangukiarie@gmail.com

Introducing Today's Project!

In this project, I'm going to build a RAG pipeline with FastAPI. This will help me understand how serving a RAG pipeline through an API differs from running it without one. I'm interested in this because it shows how RAG, FastAPI, ChromaDB and Ollama come together to form a RAG-assisted AI.

Key tools and concepts

The key tools I used include the ChromaDB vector database, a FastAPI-powered /ask endpoint and the uvicorn server. Key concepts I learnt include building a manual RAG pipeline, exposing RAG through an API with FastAPI, and multi-user RAG systems, along with what each of these is used for.

Challenges and wins

This project took me approximately 2 hours. It grounded my understanding of RAG pipelines, APIs, the ChromaDB vector database, Swagger UI and how multiple users are added to the system.

Performing RAG Manually

In this step, I'm going to perform RAG manually and set up a Python environment. RAG stands for Retrieval-Augmented Generation.

Understanding the three parts of RAG

I performed RAG manually by adding a personal knowledge base, then asking a question based on that data. The three parts are retrieval of relevant data from the knowledge base, augmentation of the prompt with that data, and generation of an answer by the AI.

Comparing the two AI models

The key difference I noticed is that nomic-embed-text converts text into numerical representations (embeddings) for search, while qwen2.5:0.5b generates responses to the questions asked.

Building a Personal Knowledge Base

In this step, I'm going to build a Python script that loads my text, splits it into chunks and stores the chunks as embeddings. Embeddings are numerical vectors that represent the meaning of a piece of text; they are stored in a vector database.
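
The chunking step can be illustrated with a minimal sketch. The write-up does not say how the script splits the text, so the fixed-size character windows with overlap used here, along with the function name `chunk_text` and its parameters, are assumptions for illustration only.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size].strip()
        if piece:
            chunks.append(piece)
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


# Tiny example so the overlap is easy to see.
example_chunks = chunk_text("abcdefghij", chunk_size=4, overlap=2)
```

The overlap means a sentence cut at a chunk boundary still appears whole in one of the neighbouring chunks. Each chunk would then be embedded and stored in the vector database.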

Creating the profile document

I included information about myself, so that the answers generated by the LLM are grounded in the knowledge base added to the RAG pipeline.

How semantic search finds relevant chunks

When I ask a question, it is converted into a vector in the same way as the stored chunks, and ChromaDB finds the stored vectors closest to it in that high-dimensional space.
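
To make "closest in that space" concrete, here is a small sketch of nearest-neighbour search over toy vectors. It is not ChromaDB's implementation. The three-dimensional vectors and chunk labels are made up, and cosine similarity is chosen only for illustration (ChromaDB supports several distance metrics, and real embeddings have hundreds of dimensions).

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; values near 1.0 mean very similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_chunks(query_vec: list[float],
                   stored: list[tuple[str, list[float]]],
                   k: int = 2) -> list[str]:
    """Return the text of the k stored chunks most similar to the query vector."""
    ranked = sorted(stored,
                    key=lambda item: cosine_similarity(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]


# Made-up chunks and vectors, purely for illustration.
stored_chunks = [
    ("chunk about location", [0.9, 0.1, 0.0]),
    ("chunk about hobbies", [0.0, 0.2, 0.9]),
    ("chunk about skills", [0.1, 0.9, 0.1]),
]
query_vector = [0.8, 0.2, 0.0]  # pretend embedding of a location question
top_two = nearest_chunks(query_vector, stored_chunks, k=2)
```

The query vector points in almost the same direction as the "location" vector, so that chunk ranks first even though no keywords are compared.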

Creating the RAG API with FastAPI

In this step, I'm going to build a RAG API with FastAPI. I'll test it using Swagger UI.

How the /ask endpoint works

When a question comes in, my endpoint performs the three steps of RAG. It first retrieves the most relevant chunks from ChromaDB, then augments the prompt by combining those chunks with the question, and finally has the model generate an answer grounded in the retrieved chunks.
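
The three steps can be sketched without any framework. This is not the code in main.py: the function names, the prompt wording and the response shape are assumptions, and the retriever and generator are passed in as plain callables so that ChromaDB and the model can be swapped for stand-ins.

```python
from typing import Callable


def build_prompt(question: str, chunks: list[str]) -> str:
    """Augment: combine the retrieved chunks and the question into one prompt."""
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def answer_question(question: str,
                    retrieve: Callable[[str], list[str]],
                    generate: Callable[[str], str]) -> dict:
    """Run retrieve, augment and generate, returning the answer and its context."""
    chunks = retrieve(question)               # 1. retrieve relevant chunks
    prompt = build_prompt(question, chunks)   # 2. augment the prompt
    answer = generate(prompt)                 # 3. generate the answer
    return {"question": question, "answer": answer, "context": chunks}


# Stand-ins for the vector database and the LLM (hypothetical, for illustration).
def fake_retrieve(question: str) -> list[str]:
    return ["My name is Ruth."]


def fake_generate(prompt: str) -> str:
    return "Ruth"


result = answer_question("What is my name?", fake_retrieve, fake_generate)
```

In an API, a route handler such as /ask would call something like `answer_question` with a real retriever and model. Returning the context alongside the answer makes it easy to check which chunks the answer was based on.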

Testing with Swagger UI

I tested my API by asking 'What is my name?'. The AI answered with 'Ruth'. The context used was the exact set of chunks that ChromaDB retrieved from my knowledge base.

Extending to a Multi-User AI Directory

In this project extension, I'm adding multi-user support because in the real world RAG systems almost always serve multiple users or data sources. Multi-tenancy means serving many users from one system while keeping each user's data separate and secure, which also keeps costs down as the RAG system scales.

Adding the POST /documents endpoint

In this project extension, I added a POST /documents endpoint that dynamically ingests profiles for different users. Metadata filtering then restricts query results to the requested user's documents.

Verifying multi-user filtering

In this project extension, I tested multi-user queries by passing user=Jordan in the GET /ask parameters. The filter works because the query returns results from Jordan's data only.
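
The idea behind ingesting per-user documents and filtering on metadata can be sketched with an in-memory store. This is a simplified stand-in, not the project's code: the class, the `user` metadata key, the second user "Alex" and the sample texts are all hypothetical. With ChromaDB, a similar effect is typically achieved by storing metadata with each document and passing a `where` filter when querying.

```python
from typing import Optional


class InMemoryStore:
    """Toy document store that tags every document with its owner (hypothetical)."""

    def __init__(self) -> None:
        self._records: list[dict] = []

    def add_document(self, user: str, text: str) -> None:
        """Ingest one document and record which user it belongs to."""
        self._records.append({"text": text, "metadata": {"user": user}})

    def query(self, user: Optional[str] = None) -> list[str]:
        """Return documents, restricted to one user's data when a user is given."""
        return [
            record["text"]
            for record in self._records
            if user is None or record["metadata"]["user"] == user
        ]


store = InMemoryStore()
store.add_document("Jordan", "Sample profile text for Jordan.")
store.add_document("Alex", "Sample profile text for Alex.")

jordan_only = store.query(user="Jordan")
```

Because the filter is applied before any results are returned, a question asked on behalf of Jordan never sees another user's chunks, which is the property the user=Jordan test checks.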

Wrapping Up

I did this project today to learn how to connect to ChromaDB and store text vectors in the database, and how to query documents through FastAPI and test with Swagger UI, so as to generate accurate answers from the stored chunks. Another skill I want to learn is how Docker is used to containerize a project like this.
