
RuthKiarie/RAGAPI_built_with_FastAPI


NextWork

Build a RAG API with FastAPI

Project Link: View Project

Author: Ruth Kiarie
Email: wangukiarie@gmail.com



Introducing Today's Project!

In this project, I'm going to build a RAG pipeline with FastAPI. This will help me understand how serving a RAG pipeline through an API differs from running it without one. I'm interested in this because it shows how RAG, FastAPI, ChromaDB and Ollama come together to form a RAG-assisted AI.

Key tools and concepts

The key tools I used include the ChromaDB vector database, a FastAPI-powered /ask endpoint and the uvicorn server. Key concepts I learnt include building a manual RAG pipeline, exposing RAG as an API with FastAPI, and multi-user RAG systems, along with how these systems are used for different purposes.

Challenges and wins

This project took me approximately 2 hours. It grounded my understanding of RAG pipelines, APIs, the ChromaDB vector database, Swagger UI and how multiple users are added to the system.


Performing RAG Manually

In this step, I'm going to perform RAG manually and set up a Python environment. RAG stands for Retrieval-Augmented Generation.


Understanding the three parts of RAG

I performed RAG manually by adding a personal knowledge base, then asking a question based on that data. The three parts are retrieving data from the knowledge base, augmenting the prompt given to the AI with that data, and generating an answer.

Comparing the two AI models

The key difference I noticed is that nomic-embed-text converts text into numerical representations for search, while qwen2.5:0.5b generates responses to the questions asked.


Building a Personal Knowledge Base

In this step, I'm going to build a Python script that loads my data, splits the text into chunks and stores the chunks as embeddings. Embeddings are vectors that represent the meaning of a piece of text, and they are stored in a database.
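The chunking part of that script could look roughly like the sketch below. It is only an illustration: the function name `chunk_text`, the character-based splitting and the `chunk_size` and `overlap` values are assumptions, since the write-up does not say how the chunks were actually produced.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into fixed-size character chunks that overlap slightly.

    Hypothetical helper: the sizes are illustrative defaults, not the
    values used in the project.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window moves each time
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval. Each chunk would then be embedded and stored in the vector database.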


Creating the profile document

I included information about myself so that the answers generated by the LLM are grounded in the knowledge base added to the RAG pipeline.

How semantic search finds relevant chunks

When I ask a question, it is converted into a vector, and ChromaDB finds the stored vectors closest to it in the high-dimensional embedding space.
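To make "closest vectors" concrete, here is a small pure-Python sketch of nearest-neighbour search. It assumes cosine similarity as the closeness measure, which is one common choice; the metric the ChromaDB collection actually used is not stated here. The two-dimensional vectors and the helper names are made up for illustration, and real embeddings have hundreds of dimensions.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def nearest_chunks(
    query_vector: Sequence[float],
    stored: List[Tuple[str, Sequence[float]]],
    top_k: int = 1,
) -> List[str]:
    """Return the texts of the top_k stored chunks most similar to the query."""
    ranked = sorted(
        stored,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _vector in ranked[:top_k]]


# Toy data: hypothetical chunks with hand-written 2-D "embeddings".
toy_store = [
    ("chunk about my name", [1.0, 0.0]),
    ("chunk about something else", [0.0, 1.0]),
]
closest = nearest_chunks([0.9, 0.1], toy_store, top_k=1)
```

A vector database does the same kind of ranking, but with an index that avoids comparing the query against every stored vector.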


Creating the RAG API with FastAPI

In this step, I'm going to build a RAG API with FastAPI. I'll test it using Swagger UI.


How the /ask endpoint works

When a question comes in, my endpoint performs the three steps of RAG. It first retrieves the most relevant chunks from ChromaDB, then augments the prompt by combining the chunks with the question, and finally generates an answer with the model, grounded in the retrieved chunks.
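The flow of the endpoint can be sketched as a plain Python function. This is not the project's actual handler: the names `build_prompt` and `ask`, the prompt wording and the response keys are assumptions. Retrieval and generation are passed in as callables so the sketch runs without a server, ChromaDB or Ollama. In a FastAPI app, a function like this would sit behind the /ask route.

```python
from typing import Callable, Dict, List


def build_prompt(question: str, chunks: List[str]) -> str:
    """Augment step: combine the retrieved chunks with the user's question."""
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


def ask(
    question: str,
    retrieve: Callable[[str, int], List[str]],  # stand-in for a vector DB query
    generate: Callable[[str], str],             # stand-in for an LLM call
    top_k: int = 3,
) -> Dict[str, object]:
    """Run retrieve -> augment -> generate and return answer plus context."""
    chunks = retrieve(question, top_k)       # 1. retrieval
    prompt = build_prompt(question, chunks)  # 2. augmentation
    answer = generate(prompt)                # 3. generation
    return {"question": question, "answer": answer, "context": chunks}
```

Returning the retrieved chunks next to the answer makes it easy to check in Swagger UI which context the model was given.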

Testing with Swagger UI

I tested my API by asking 'What is my name?'. The AI answered with 'Ruth'. The context used was the exact set of chunks that ChromaDB retrieved from my knowledge base.


Extending to a Multi-User AI Directory

In this project extension, I'm adding multi-user support because in the real world RAG systems almost always serve multiple users or data sources. Multi-tenancy means keeping different users' data separate and secure while the RAG system stays cost-effective as it scales to handle many users.


Adding the POST /documents endpoint

In this project extension, I added a POST endpoint that dynamically ingests profiles for different users. Metadata filtering restricts query results to the data that belongs to the requested user.
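The idea behind ingestion with metadata filtering can be sketched with an in-memory stand-in for the vector store. Everything here is hypothetical: the class `InMemoryStore`, the helper `post_document`, the `user` metadata key and the placeholder profile text are not taken from the project. With ChromaDB, the equivalent filter is typically passed as the `where` argument of a collection query.

```python
from typing import Dict, List, Optional


class InMemoryStore:
    """Tiny stand-in for a vector collection that keeps metadata per chunk."""

    def __init__(self) -> None:
        self._records: List[Dict[str, object]] = []

    def add(self, text: str, metadata: Dict[str, str]) -> None:
        self._records.append({"text": text, "metadata": dict(metadata)})

    def query(self, where: Optional[Dict[str, str]] = None) -> List[str]:
        """Return texts whose metadata matches every key/value pair in `where`."""
        where = where or {}
        return [
            record["text"]
            for record in self._records
            if all(record["metadata"].get(k) == v for k, v in where.items())
        ]


def post_document(store: InMemoryStore, user: str, text: str) -> Dict[str, str]:
    """What a POST /documents handler might do: tag the text with its owner."""
    store.add(text, {"user": user})
    return {"status": "stored", "user": user}


# Placeholder profiles, for illustration only.
demo_store = InMemoryStore()
post_document(demo_store, "Ruth", "Profile text for Ruth (placeholder)")
post_document(demo_store, "Jordan", "Profile text for Jordan (placeholder)")
jordan_only = demo_store.query(where={"user": "Jordan"})
```

Because every chunk carries its owner in the metadata, a query scoped to one user can never return another user's chunks, no matter how similar they are to the question.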


Verifying multi-user filtering

In this project extension, I tested multi-user queries by passing user=Jordan in the GET /ask parameters. The filter works because it returns results from Jordan's data only.


Wrapping Up

I did this project today to learn how to connect to ChromaDB and store text vectors in the database, and how to query documents through FastAPI and test the API in Swagger UI, so as to generate accurate answers from the stored chunks. Another skill I want to learn is how Docker is used to containerize a project like this.


