🦜 LangChain Practice — GenAI + RAG Pipeline

A hands-on end-to-end GenAI project built with LangChain, Groq (LLaMA 3.3), FAISS VectorStore, and Streamlit — featuring a fully deployed Question Answering bot with LangSmith monitoring.

🚀 Live Demo

📌 Project Overview

This project demonstrates a complete Retrieval-Augmented Generation (RAG) pipeline using LangChain, where a PDF document is ingested, embedded, stored in a vector database, and queried using a powerful LLM — all wrapped in a clean Streamlit UI.

The bot acts as an AI Engineer assistant, answering questions strictly based on the provided PDF context.

🏗️ Architecture

PDF Document
     │
     ▼
[Data Loading] ──► PyPDFLoader / TextLoader
     │
     ▼
[Text Splitting] ──► CharacterTextSplitter / RecursiveCharacterTextSplitter
     │
     ▼
[Embedding] ──► HuggingFace Embeddings (sentence-transformers)
     │
     ▼
[VectorStore] ──► FAISS Index (faiss_idx/)
     │
     ▼
[Retriever] ──► Similarity Search
     │
     ▼
[LLM] ──► Groq API → LLaMA 3.3 (70B)
     │
     ▼
[Response] ──► Streamlit UI (app.py)
     │
     ▼
[Monitoring] ──► LangSmith Tracing

🧩 Project Structure

LangChain_practice/
│
├── LangChain/
│   ├── Data_Embedding/          # Embedding + VectorStore notebooks
│   ├── faiss_idx/               # Persisted FAISS vector index
│   │
│   ├── Data_extraction.ipynb    # RAG pipeline: load → split → retrieve
│   ├── llmModel.ipynb           # RAG + Groq LLaMA 3.3 integration
│   ├── genai_App.ipynb          # Full GenAI app prototype (notebook)
│   │
│   ├── app.py                   # 🚀 Streamlit app — deployed QA Bot
│   ├── document.pdf             # Source PDF for knowledge base
│   ├── Synopsis_Major_project.pdf
│   └── test.txt                 # Test ingestion file
│
├── requirements.txt
├── .gitignore
└── README.md

⚙️ Tech Stack

Layer	Tool / Library
Framework	LangChain
LLM	Groq API — LLaMA 3.3 (70B Versatile)
Embeddings	HuggingFace `sentence-transformers`
Vector DB	FAISS (persisted locally)
Frontend	Streamlit
Monitoring	LangSmith
Language	Python 3.10+

🔄 Pipeline Breakdown

1️⃣ Data Ingestion & Splitting

Loaded PDF using PyPDFLoader
Split into chunks using RecursiveCharacterTextSplitter
Explored chunk size, overlap, and splitting strategies
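
To make the chunk size and overlap parameters concrete, here is a minimal pure-Python sketch of fixed-size chunking with overlap. It is an illustration only, not LangChain's implementation: RecursiveCharacterTextSplitter also prefers to break on separators such as paragraph and line boundaries. The function name `sliding_chunks` is hypothetical.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size characters sharing chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    for chunk in sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=2):
        print(chunk)
```

A larger overlap keeps more shared context between neighbouring chunks, at the cost of producing more chunks to embed.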

2️⃣ Embedding & VectorStore

Generated embeddings using HuggingFaceEmbeddings
Stored and persisted vectors using FAISS (faiss_idx/)
Reloaded index for retrieval without re-embedding
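
The embed-once, reload-later flow could look roughly like the sketch below. This is not the notebook's actual code: the function names and the `all-MiniLM-L6-v2` model are assumptions (the README only names sentence-transformers). It relies on `FAISS.save_local` writing `index.faiss` and `index.pkl` under its default index name, and on recent langchain-community versions requiring `allow_dangerous_deserialization=True` to load the pickled metadata. Only enable that flag for index files you created yourself.

```python
from pathlib import Path

INDEX_DIR = Path("faiss_idx")  # folder name taken from the project structure


def index_exists(index_dir: Path = INDEX_DIR) -> bool:
    """True if a FAISS index saved under the default index name is present."""
    return (index_dir / "index.faiss").is_file() and (index_dir / "index.pkl").is_file()


def load_or_build_index(chunks, index_dir: Path = INDEX_DIR):
    """Reload a persisted index if one exists; otherwise embed `chunks` and persist."""
    # Imported lazily so index_exists() works without the heavy dependencies.
    from langchain_community.embeddings import HuggingFaceEmbeddings
    from langchain_community.vectorstores import FAISS

    # Assumed model name; use the same model for building and for loading.
    embeddings = HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-MiniLM-L6-v2"
    )

    if index_exists(index_dir):
        return FAISS.load_local(
            str(index_dir), embeddings, allow_dangerous_deserialization=True
        )

    store = FAISS.from_documents(chunks, embeddings)  # the expensive embedding step
    store.save_local(str(index_dir))
    return store
```

A retriever can then be obtained with `load_or_build_index(chunks).as_retriever(search_kwargs={"k": 4})`, where `k=4` is an arbitrary example value.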

3️⃣ RAG + LLM (llmModel.ipynb)

Built a RetrievalQA chain using LangChain
Connected FAISS retriever with Groq's LLaMA 3.3
Extracted context-aware answers from the PDF
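
The project uses LangChain's RetrievalQA chain for this step. The sketch below is not that chain's code; it spells out by hand what a "stuff"-style retrieval step does: fetch the relevant chunks, paste them into one prompt, and ask the model. The prompt wording and the names `build_prompt` and `answer` are hypothetical, and the model ID in the docstring is assumed from the "LLaMA 3.3 (70B Versatile)" entry in the tech stack.

```python
PROMPT_TEMPLATE = (
    "Answer the question using only the context below. "
    "If the context does not contain the answer, say you don't know.\n\n"
    "Context:\n{context}\n\nQuestion: {question}"
)


def build_prompt(question: str, chunks: list[str]) -> str:
    """'Stuff' every retrieved chunk into a single prompt string."""
    context = "\n\n".join(chunks)
    return PROMPT_TEMPLATE.format(context=context, question=question)


def answer(question: str, retriever, llm) -> str:
    """Retrieve relevant chunks, then ask the LLM with those chunks as context.

    `retriever` and `llm` are duck-typed: any object with a LangChain-style
    `.invoke()` works, for example a FAISS store's `.as_retriever()` and
    `ChatGroq(model="llama-3.3-70b-versatile")` (assumed model ID).
    """
    docs = retriever.invoke(question)  # list of documents with .page_content
    prompt = build_prompt(question, [doc.page_content for doc in docs])
    return llm.invoke(prompt).content  # chat models return a message object
```

Because the model only sees the retrieved chunks plus the question, answer quality depends directly on what the retriever returns.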

4️⃣ Streamlit App (app.py)

Built a conversational QA interface
System prompt: "Act as an AI Engineer to answer questions based on the given context"
Used ChatGroq with streaming support
Deployed live on Streamlit Cloud
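
A streaming QA page along these lines might be structured as in the sketch below. It is not the deployed app.py: the page title, the `build_messages` and `main` names, and the way context is joined are assumptions. It uses Streamlit's `st.chat_input`, `st.chat_message`, and `st.write_stream` (the last needs a reasonably recent Streamlit release) and the `.stream()` method that LangChain chat models such as ChatGroq expose.

```python
SYSTEM_PROMPT = "Act as an AI Engineer to answer questions based on the given context"


def build_messages(context: str, question: str) -> list[tuple[str, str]]:
    """Return (role, content) pairs, a message format LangChain chat models accept."""
    return [
        ("system", f"{SYSTEM_PROMPT}\n\nContext:\n{context}"),
        ("human", question),
    ]


def main(retriever, llm) -> None:
    """Render the QA page, given a retriever and a chat model such as ChatGroq."""
    import streamlit as st  # imported lazily so build_messages stays dependency-free

    st.title("PDF Question Answering Bot")  # hypothetical title
    question = st.chat_input("Ask a question about the PDF")
    if not question:
        return  # nothing submitted on this rerun

    with st.chat_message("user"):
        st.write(question)

    docs = retriever.invoke(question)
    context = "\n\n".join(doc.page_content for doc in docs)

    with st.chat_message("assistant"):
        # llm.stream() yields message chunks; write_stream renders text as it arrives.
        chunks = llm.stream(build_messages(context, question))
        st.write_stream(chunk.content for chunk in chunks)
```

Keeping the message construction in a plain function makes the prompt easy to unit-test without starting Streamlit.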

5️⃣ Monitoring with LangSmith

Integrated LangSmith tracing for all chain runs
Tracked inputs, outputs, token usage, and latency

🛠️ Setup & Installation

1. Clone the repo

git clone https://github.com/ravicoder01/LangChain_practice.git
cd LangChain_practice

2. Install dependencies

pip install -r requirements.txt

3. Set environment variables

Create a .env file in the root:

GROQ_API_KEY=your_groq_api_key
LANGCHAIN_API_KEY=your_langsmith_api_key
LANGCHAIN_TRACING_V2=true
LANGCHAIN_PROJECT=langchain-practice
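
A small startup check can catch a missing key before the first API call fails. The sketch below assumes python-dotenv (listed in the requirements) is used to read the .env file; the helper names `missing_vars` and `load_env` are hypothetical, and treating only the two API keys as required is an assumption.

```python
import os

REQUIRED_VARS = ("GROQ_API_KEY", "LANGCHAIN_API_KEY")  # assumed to be mandatory


def missing_vars(environ=os.environ, required=REQUIRED_VARS) -> list[str]:
    """Names of required variables that are unset or empty."""
    return [name for name in required if not environ.get(name)]


def load_env() -> None:
    """Load .env into the process environment and fail fast on missing keys."""
    from dotenv import load_dotenv  # provided by the python-dotenv package

    load_dotenv()  # looks for a .env file, searching parent directories if needed
    missing = missing_vars()
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
```

Calling `load_env()` at the top of app.py, before any LangChain objects are created, ensures the tracing variables are in place when the first chain runs.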

4. Run the Streamlit app

cd LangChain
streamlit run app.py

📦 Requirements

langchain
langchain-community
langchain-groq
faiss-cpu
sentence-transformers
pypdf
streamlit
python-dotenv
langsmith

💡 Key Learnings

How RAG reduces LLM hallucination by grounding responses in retrieved document content
FAISS index persistence — embed once, query many times
Difference between CharacterTextSplitter and RecursiveCharacterTextSplitter
Using Groq for low-latency LLaMA inference
LangSmith for production-grade observability
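
The splitter difference noted above can be illustrated with a simplified pure-Python sketch. It is not LangChain's code: the real splitters also handle overlap and other details, and both function names here are hypothetical. The point it shows is that splitting on a single separator can leave pieces far longer than the target size, while the recursive strategy falls back to finer separators (paragraph, line, word, then a hard cut) until every chunk fits.

```python
def split_on_separator(text: str, separator: str = "\n\n") -> list[str]:
    """Single-separator strategy: pieces can end up arbitrarily long."""
    return [piece for piece in text.split(separator) if piece]


def split_recursively(
    text: str, chunk_size: int, separators: tuple[str, ...] = ("\n\n", "\n", " ")
) -> list[str]:
    """Recursive strategy: try coarse separators first, finer ones only when needed."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators:
        # No separator left: hard-cut into fixed-size pieces.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = f"{current}{sep}{piece}" if current else piece
        if len(candidate) <= chunk_size:
            current = candidate  # piece still fits: keep merging
            continue
        if current:
            chunks.append(current)
        if len(piece) <= chunk_size:
            current = piece
        else:
            # Piece is too long on its own: split it with finer separators.
            chunks.extend(split_recursively(piece, chunk_size, finer))
            current = ""
    if current:
        chunks.append(current)
    return chunks
```

For a text made of one short paragraph and one long paragraph, `split_on_separator` returns the long paragraph as a single oversized piece, whereas `split_recursively` breaks it at word boundaries so that each chunk stays within `chunk_size`.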

👨‍💻 Author

Ravi Roy
B.Tech CSE (AI & ML) | IILM University, Greater Noida

"RAG is not just a technique — it's the bridge between static LLMs and dynamic real-world knowledge."
