
goobolabs/dastuur-agent


🇸🇴 Dastuur Agent - Somali Constitution AI Assistant


An intelligent AI-powered assistant that helps users understand and navigate the Somali Provisional Constitution. Built with Next.js, Google Gemini, and Memvid AI.



✨ Features

🤖 AI-Powered Constitutional Assistant

  • Natural language understanding in Somali
  • Accurate answers based on the Somali Provisional Constitution
  • Context-aware responses using RAG technology

💬 Advanced Chat Interface

  • Multi-chat support - Create and manage multiple conversation threads
  • Chat history - Persistent storage using localStorage
  • Sidebar navigation - Easy access to all your conversations
  • Real-time responses - Streaming AI responses with loading indicators
  • Markdown support - Rich text formatting in responses

🎨 Premium UI/UX

  • Modern, glassmorphic design
  • Smooth animations and transitions
  • Responsive layout for all devices
  • Dark mode ready
  • Accessible and user-friendly interface

🔍 Intelligent Search

  • Hybrid search combining vector embeddings and keyword matching
  • Special optimization for article number queries (e.g., "Qodobka 3aad")
  • Contextual relevance scoring
  • Top-K retrieval for accurate results

🚀 Getting Started

Prerequisites

  • Node.js 18.x or higher
  • npm or yarn
  • Google Gemini API key

Installation

  1. Clone the repository

    git clone https://github.com/omartood/Dastuur-agent-.git
    cd Dastuur-agent-
  2. Install dependencies

    npm install
  3. Set up environment variables

    Create a .env file in the root directory:

    GEMINI_API_KEY=your_gemini_api_key_here
  4. Prepare the knowledge base

    Place your Somali Constitution PDF in the pdf/ directory:

    pdf/Dastuurka_KMG_Soomaaliya.pdf
    
  5. Generate embeddings

    Run the ingestion script to process the PDF and create embeddings:

    npm run ingest

    This will:

    • Extract text from the PDF
    • Split content into chunks
    • Generate embeddings using Google Gemini
    • Store the vector database in data/store.json
  6. Start the development server

    npm run dev
  7. Open your browser

    Navigate to http://localhost:3000


📁 Project Structure

dastur-agents/
├── app/                          # Next.js app directory
│   ├── api/                      # API routes
│   │   └── ask/                  # Chat endpoint
│   ├── components/               # React components
│   │   └── Sidebar.tsx           # Chat history sidebar
│   ├── globals.css               # Global styles
│   ├── layout.tsx                # Root layout
│   └── page.tsx                  # Main chat interface
├── lib/                          # Core libraries
│   ├── chatStorage.ts            # Chat persistence utilities
│   ├── gemini.ts                 # Gemini API integration
│   └── memvid.ts                 # Vector search & RAG
├── scripts/                      # Utility scripts
│   ├── ingest.ts                 # PDF ingestion & embedding
│   ├── test-search.ts            # Search testing utility
│   └── ...
├── data/                         # Generated data
│   └── store.json                # Vector database
├── pdf/                          # Source documents
│   └── Dastuurka_KMG_Soomaaliya.pdf
├── .env                          # Environment variables
├── package.json                  # Dependencies
├── tsconfig.json                 # TypeScript config
└── README.md                     # This file

🛠️ Technology Stack

Frontend

AI & Backend

  • Google Gemini AI - Large language model
  • Memvid AI - Intelligent vector search & RAG pipeline
  • pdf-parse - PDF text extraction
  • Custom RAG Implementation - Context-aware retrieval
  • Vector Embeddings - Semantic search with cosine similarity

Storage

  • LocalStorage - Client-side chat persistence
  • JSON Store - Vector database storage

🧠 How It Works

1. Document Ingestion

// scripts/ingest.ts
1. Load PDF → Extract text
2. Split into semantic chunks
3. Generate embeddings via Gemini
4. Store in vector database
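
The chunking step above can be sketched as follows. This is a minimal illustration, not the code in scripts/ingest.ts: the function name `splitIntoChunks` is hypothetical, and splitting on blank lines is an assumed stand-in for whatever "semantic" boundary the real script uses. The default of 1000 characters mirrors the `chunkSize` value shown under Configuration.

```typescript
// Hypothetical helper, not taken from scripts/ingest.ts.
// Groups paragraphs into chunks of at most `chunkSize` characters.
function splitIntoChunks(text: string, chunkSize = 1000): string[] {
  // Assumption: a blank line marks a paragraph boundary in the extracted text.
  const paragraphs = text
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const chunks: string[] = [];
  let current = "";

  for (const paragraph of paragraphs) {
    // Hard-split any paragraph that is longer than chunkSize on its own.
    const pieces: string[] = [];
    for (let i = 0; i < paragraph.length; i += chunkSize) {
      pieces.push(paragraph.slice(i, i + chunkSize));
    }

    for (const piece of pieces) {
      const candidate = current ? `${current}\n\n${piece}` : piece;
      if (candidate.length <= chunkSize) {
        current = candidate; // still fits, keep accumulating
      } else {
        chunks.push(current); // flush the full chunk and start a new one
        current = piece;
      }
    }
  }

  if (current) chunks.push(current);
  return chunks;
}
```

Keeping whole paragraphs together where possible means an article's text is less likely to be cut mid-sentence, which tends to help retrieval quality.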

2. Query Processing

// app/api/ask/route.ts
1. User asks question
2. Generate query embedding
3. Search vector database (hybrid: vector + keyword)
4. Retrieve top-K relevant chunks
5. Send to Gemini with context
6. Return AI-generated answer
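
Steps 2 to 4 above (embed the query, search, keep the top K) can be sketched like this for the pure vector part. The `StoredChunk` shape and the function names are assumptions made for illustration; the actual layout of data/store.json is not documented here, and the query embedding is assumed to have been produced already.

```typescript
// Assumed shape of one entry in the vector store (illustrative only).
interface StoredChunk {
  text: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid division by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical helper: returns the topK chunks most similar to the query.
function retrieveTopK(
  queryEmbedding: number[],
  store: StoredChunk[],
  topK = 5
): StoredChunk[] {
  return store
    .map((chunk) => ({
      chunk,
      score: cosineSimilarity(queryEmbedding, chunk.embedding),
    }))
    .sort((x, y) => y.score - x.score) // highest score first
    .slice(0, topK)
    .map((entry) => entry.chunk);
}
```

A linear scan like this is usually adequate for a single document of this size; a dedicated vector index only becomes worthwhile with much larger stores.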

3. Hybrid Search Algorithm

// lib/memvid.ts
Score = (0.6 × Vector_Similarity) + (0.4 × Keyword_Match)
+ Special boost for article number queries
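
As a rough sketch of the formula above: the 0.6 and 0.4 weights come from this README, but how `Keyword_Match` is computed is not stated, so the version below assumes it is the fraction of distinct query terms found in the chunk. Both function names are hypothetical, and the article boost is left as a plain additive parameter.

```typescript
// Assumed definition: fraction of distinct query terms that occur in the chunk.
function keywordMatch(query: string, chunkText: string): number {
  const allTerms = query
    .toLowerCase()
    .split(/\s+/)
    .filter((t) => t.length > 0);
  // Keep each term once so repeated words do not inflate the score.
  const terms = allTerms.filter((t, i) => allTerms.indexOf(t) === i);
  if (terms.length === 0) return 0;

  const haystack = chunkText.toLowerCase();
  const hits = terms.filter((term) => haystack.indexOf(term) !== -1).length;
  return hits / terms.length;
}

// Weighted combination from the formula above, plus an optional additive boost.
function hybridScore(
  vectorSimilarity: number,
  keywordScore: number,
  articleBoost = 0
): number {
  return 0.6 * vectorSimilarity + 0.4 * keywordScore + articleBoost;
}
```

For example, a chunk with cosine similarity 0.5 that contains every query term scores 0.6 × 0.5 + 0.4 × 1 = 0.7 before any boost.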

📝 Available Scripts

Command          Description
npm run dev      Start development server
npm run build    Build for production
npm start        Start production server
npm run lint     Run ESLint
npm run ingest   Process PDF and generate embeddings

🔧 Configuration

Adjusting Search Parameters

Edit lib/memvid.ts to customize:

// Number of results to retrieve
const topK = 5;

// Vector vs Keyword weighting
score = cosine * 0.6 + kwScore * 0.4;

// Chunk size for document splitting
const chunkSize = 1000;

Customizing the UI

  • Colors: Edit tailwind.config.ts
  • Styles: Modify app/globals.css
  • Components: Update files in app/components/

🌟 Key Features Explained

Multi-Chat Support

Users can create multiple conversation threads, each with its own history. Chats are saved automatically, and users can switch between them at any time.

Intelligent Article Search

The system recognizes Somali article patterns (e.g., "Qodobka 3aad") and applies special scoring to ensure the correct article is retrieved, not just table of contents entries.
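
One way to detect such queries is a regular expression over the "Qodobka N" pattern. The sketch below is an assumption about how this could work, not the logic in lib/memvid.ts: the pattern, the function names, and the boost value of 0.5 are all illustrative. It only checks that a chunk mentions the requested article number; telling a table of contents entry apart from the article body would need an additional signal that is not shown here.

```typescript
// Assumed pattern: the word "Qodobka" followed by a number, e.g. "Qodobka 3aad".
const ARTICLE_PATTERN = /qodobka\s+(\d+)/i;

// Returns the article number mentioned in the query, or null if there is none.
function extractArticleNumber(query: string): number | null {
  const match = ARTICLE_PATTERN.exec(query);
  return match ? parseInt(match[1], 10) : null;
}

// Hypothetical boost: `boost` if the chunk mentions the requested article, else 0.
function articleBoost(query: string, chunkText: string, boost = 0.5): number {
  const wanted = extractArticleNumber(query);
  if (wanted === null) return 0;

  // Scan every "Qodobka N" mention and compare whole numbers,
  // so "Qodobka 30aad" does not count as a match for article 3.
  const mentions = new RegExp(ARTICLE_PATTERN.source, "gi");
  let m: RegExpExecArray | null;
  while ((m = mentions.exec(chunkText)) !== null) {
    if (parseInt(m[1], 10) === wanted) return boost;
  }
  return 0;
}
```

The returned value can be added to a chunk's hybrid score so that chunks naming the requested article rank above chunks that are merely similar in wording.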

Context-Aware Responses

Using RAG, the AI retrieves relevant sections from the constitution before generating answers, ensuring accuracy and grounding in the actual document.
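
To illustrate the grounding step, the sketch below assembles retrieved chunks and the user's question into a single prompt string. The function name `buildPrompt` and the instruction wording are assumptions; the project's actual prompt is not shown in this README, and the call to Gemini itself is deliberately left out.

```typescript
// Hypothetical helper: combines retrieved excerpts and the question into one prompt.
function buildPrompt(question: string, contextChunks: string[]): string {
  // Number each excerpt so the model (and a reader) can refer back to it.
  const context = contextChunks
    .map((chunk, i) => `[${i + 1}] ${chunk.trim()}`)
    .join("\n\n");

  return [
    "Answer the question using only the constitution excerpts below.",
    "If the excerpts do not contain the answer, say so.",
    "",
    "Excerpts:",
    context,
    "",
    `Question: ${question}`,
  ].join("\n");
}
```

Instructing the model to rely only on the supplied excerpts is what keeps answers tied to the constitution's text rather than to the model's general knowledge.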

Persistent Chat History

All conversations are saved in the browser's localStorage, allowing users to return to previous discussions anytime.
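
A minimal sketch of this kind of persistence is shown below. It is not the code in lib/chatStorage.ts: the `Chat` and `ChatMessage` shapes, the storage key, and the function names are assumptions. The storage object is passed in as a parameter so the same functions work with the browser's `localStorage` or with an in-memory stand-in during tests.

```typescript
// Assumed data shapes (illustrative only).
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

interface Chat {
  id: string;
  title: string;
  messages: ChatMessage[];
}

// The subset of the Web Storage API used here; window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const STORAGE_KEY = "dastuur-chats"; // hypothetical key name

// Reads all chats, falling back to an empty list if nothing valid is stored.
function loadChats(store: KeyValueStore): Chat[] {
  const raw = store.getItem(STORAGE_KEY);
  if (raw === null) return [];
  try {
    const parsed = JSON.parse(raw);
    return Array.isArray(parsed) ? (parsed as Chat[]) : [];
  } catch {
    return []; // corrupted JSON: start fresh rather than crash the UI
  }
}

// Serializes all chats under a single key.
function saveChats(store: KeyValueStore, chats: Chat[]): void {
  store.setItem(STORAGE_KEY, JSON.stringify(chats));
}
```

In a browser component these would be called as `loadChats(localStorage)` and `saveChats(localStorage, chats)`. Because localStorage is per-browser, history saved this way does not follow the user to another device.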


🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📄 License

This project is licensed under the ISC License.


👨‍💻 Author

Omar Tood


🙏 Acknowledgments

  • Google Gemini AI for providing the language model
  • The Somali government for making the constitution publicly available
  • The open-source community for amazing tools and libraries

📞 Support

If you have any questions or need help, please:

  • Open an issue on GitHub
  • Contact the maintainer

🔮 Future Enhancements

  • Multi-language support (English, Arabic)
  • Voice input/output
  • Export chat history
  • Share conversations
  • Advanced search filters
  • Mobile app version
  • Offline mode
  • User authentication
  • Cloud sync for chat history

Made with ❤️ for the Somali people

⭐ Star this repo if you find it helpful!
