Skip to content

HimabinduPyata/ai-document-qa

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

AI Document Intelligence (RAG System)

A production-style Retrieval-Augmented Generation (RAG) system that allows users to upload documents (PDFs) and ask intelligent questions with context-aware AI responses.

Built using FAISS vector search, OpenAI embeddings, and GPT-4o-mini, this project demonstrates how modern AI applications retrieve and reason over private documents.


Live Demo

https://your-streamlit-app-link.streamlit.app


Key Features

  • Upload and analyze PDF documents
  • Semantic search using embeddings
  • FAISS vector database for fast retrieval
  • Context-aware AI question answering
  • Multi-chunk retrieval (top-k context)
  • Retrieval-Augmented Generation (RAG pipeline)
  • Clean Streamlit UI

Architecture

PDF Upload
   ↓
Text Extraction
   ↓
Chunking
   ↓
Embeddings (OpenAI)
   ↓
FAISS Vector Search
   ↓
Top-K Relevant Context
   ↓
LLM (GPT-4o-mini)
   ↓
Answer Generation

Tech Stack

  • Python
  • Streamlit
  • OpenAI API
  • FAISS (Vector Database)
  • NumPy
  • PyPDF

What This Project Demonstrates

  • Retrieval-Augmented Generation (RAG)
  • Vector similarity search
  • Embedding-based AI systems
  • LLM orchestration
  • Real-world AI application design

Example Use Cases

  • Resume Q&A assistant
  • Legal document analysis
  • Study material assistant
  • Company knowledge base chatbot

Future Improvements

  • Page-level citations
  • Multi-document chat memory
  • Streaming responses
  • Authentication system
  • Cloud deployment (SaaS version)

What I learned:

  • How RAG systems work in production
  • Vector databases (FAISS)
  • Embedding-based semantic search
  • Designing LLM-powered applications beyond simple prompting
  • Turning AI models into real product workflows

Why this project matters

This project demonstrates how modern AI applications move beyond simple prompting into retrieval-based reasoning systems, similar to production tools like ChatGPT’s file upload, Notion AI, and enterprise knowledge assistants.

About

Production-style RAG system with FAISS vector database, OpenAI embeddings, and GPT-powered contextual question answering over PDF documents.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages