Releases: devmehtaa/codeAssist

v1.0.0 - Serverless Codebase Assistant

27 May 09:07

A production-ready MVP for semantic understanding of GitHub repositories, built on Retrieval-Augmented Generation (RAG), vector search, and serverless AWS infrastructure.

Live Demo

https://d1femwt9slkevk.cloudfront.net/

or

https://shorturl.at/t59If


Overview

Codebase Assistant allows users to:

  • Paste a GitHub repository URL
  • Automatically clone and index the repository
  • Generate semantic embeddings for source code
  • Ask natural language questions about the codebase
  • Receive contextual answers with file citations

The system uses an event-driven, serverless architecture on AWS.


Features

Repository Indexing

  • GitHub repository cloning
  • Shallow clone optimization
  • Automatic re-indexing support
  • Repository-scoped chunk management
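
As an illustration of the shallow clone optimization, here is a minimal sketch that builds and runs a `git clone --depth 1` command. The function names `build_clone_command` and `shallow_clone` are hypothetical and are not taken from the project; the release does not show how cloning is actually implemented.

```python
import subprocess
from pathlib import Path


def build_clone_command(repo_url: str, dest: Path) -> list[str]:
    """Build a shallow clone command (hypothetical helper)."""
    # --depth 1 fetches only the most recent commit, which keeps the
    # download small when only the current source tree is needed.
    return ["git", "clone", "--depth", "1", repo_url, str(dest)]


def shallow_clone(repo_url: str, dest: Path) -> None:
    """Run the shallow clone; raises CalledProcessError if git fails."""
    subprocess.run(build_clone_command(repo_url, dest), check=True)
```

Separating command construction from execution keeps the command easy to inspect or log before it runs.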

Intelligent Code Parsing

Supports indexing for:

  • Python
  • JavaScript / TypeScript
  • TSX / JSX
  • Java
  • C / C++
  • Go
  • Rust
  • SQL
  • Markdown
  • YAML / JSON

Ignores:

  • node_modules
  • .git
  • build artifacts
  • lock files
  • binaries/media
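
The supported and ignored lists above could be applied with a path filter along these lines. This is a sketch only: the extension set, directory names, and lock-file names are assumptions chosen to match the categories listed, and `should_index` is a hypothetical name.

```python
from pathlib import PurePosixPath

# Assumed extension set covering the languages listed above.
INDEXED_EXTENSIONS = {
    ".py", ".js", ".ts", ".tsx", ".jsx", ".java", ".c", ".h", ".cpp",
    ".hpp", ".go", ".rs", ".sql", ".md", ".yaml", ".yml", ".json",
}
# Assumed names for ignored directories and lock files.
IGNORED_DIRS = {"node_modules", ".git", "build", "dist"}
IGNORED_FILES = {"package-lock.json", "yarn.lock", "poetry.lock", "Cargo.lock"}


def should_index(path: str) -> bool:
    """Return True if a repository-relative path should be indexed."""
    p = PurePosixPath(path)
    # Skip anything located inside an ignored directory.
    if any(part in IGNORED_DIRS for part in p.parts[:-1]):
        return False
    # Skip lock files even though some share an indexed extension.
    if p.name in IGNORED_FILES:
        return False
    # Binaries and media are excluded because their extensions are not listed.
    return p.suffix.lower() in INDEXED_EXTENSIONS
```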

Semantic Chunking

  • Function/class-aware chunking
  • Metadata tracking
  • File path + line number references
  • Context-preserving segmentation
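
For Python sources, function/class-aware chunking with line references can be sketched using the standard `ast` module. The `Chunk` dataclass and `chunk_python_source` are hypothetical names; the release does not describe the project's parser, and other languages would need a different one.

```python
import ast
from dataclasses import dataclass


@dataclass
class Chunk:
    file_path: str
    name: str
    start_line: int  # 1-based, inclusive
    end_line: int    # 1-based, inclusive
    text: str


def chunk_python_source(file_path: str, source: str) -> list[Chunk]:
    """Split a Python file into one chunk per top-level function or class."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            start, end = node.lineno, node.end_lineno
            # Keep the file path and line range so answers can cite them.
            text = "\n".join(lines[start - 1:end])
            chunks.append(Chunk(file_path, node.name, start, end, text))
    return chunks
```

Each chunk keeps a whole definition together, and the stored path and line numbers are what make file citations possible later.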

Vector Search

  • pgvector-powered similarity search
  • Cosine similarity retrieval
  • Semantic repository querying
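
To show what cosine similarity retrieval computes, here is a small in-memory sketch. In the deployed system this ranking would happen inside PostgreSQL, where pgvector provides the `<=>` cosine distance operator; the project's actual query is not shown in this release, and `cosine_similarity` and `top_k` are hypothetical names.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], chunks: list[tuple[str, list[float]]], k: int = 3):
    """Return the k (score, chunk_id) pairs most similar to the query."""
    scored = [(cosine_similarity(query, vec), chunk_id) for chunk_id, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```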

AI-Powered Responses

  • OpenAI embedding + generation support
  • Offline deterministic embedding fallback
  • Source-aware contextual answers
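
One way an offline deterministic embedding fallback might work is feature hashing, sketched below. This is an assumption, not the project's method: the release does not say how the fallback is computed, and `fallback_embedding` and the 64-dimension default are illustrative only.

```python
import hashlib
import math


def fallback_embedding(text: str, dim: int = 64) -> list[float]:
    """Hash whitespace-separated tokens into a fixed-size, unit-length vector."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        # The first four bytes choose the bucket; the fifth chooses the sign.
        index = int.from_bytes(digest[:4], "big") % dim
        sign = 1.0 if digest[4] % 2 == 0 else -1.0
        vec[index] += sign
    norm = math.sqrt(sum(v * v for v in vec))
    # Normalize so cosine similarity behaves consistently; leave zeros as-is.
    return [v / norm for v in vec] if norm else vec
```

The same text always yields the same vector and no network call is needed, which suits local development and tests. Such vectors capture token overlap only, so retrieval quality would be well below that of a learned embedding model.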

Frontend

  • Next.js frontend
  • Repository indexing workflow
  • Interactive chat interface
  • Source citation rendering

AWS Architecture

Frontend:

  • S3
  • CloudFront CDN

Backend:

  • API Gateway
  • AWS Lambda
  • SQS queue workers

Database:

  • PostgreSQL RDS
  • pgvector extension

Infrastructure:

  • AWS SAM
  • CloudFormation
  • Docker-based Lambda packaging

Architecture Flow

User → CloudFront → S3 Frontend → API Gateway → Lambda Backend → SQS → Indexer Lambda → PostgreSQL + pgvector
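
The hand-off from the backend Lambda to the indexer Lambda through SQS can be sketched as follows. The message shape (`repo_url`) and both function names are assumptions. A real backend would send the message with boto3's `send_message`, which is left out here so the example runs without AWS credentials.

```python
import json


def build_index_message(repo_url: str) -> str:
    """Serialize an indexing request as an SQS message body (assumed shape)."""
    return json.dumps({"repo_url": repo_url})


def indexer_handler(event: dict, context=None) -> dict:
    """Lambda entry point for SQS events; each record holds one message."""
    indexed = []
    for record in event.get("Records", []):
        # SQS delivers the message body as a string under "body".
        payload = json.loads(record["body"])
        # Cloning, chunking, and embedding would happen here.
        indexed.append(payload["repo_url"])
    return {"indexed": indexed}
```

Queuing the request lets the API respond right away while the slower indexing work runs asynchronously in a separate Lambda.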


Technical Highlights

  • Serverless-first architecture
  • Event-driven indexing pipeline
  • Infrastructure-as-Code deployment
  • Async repository processing
  • Vector database integration
  • RAG-based retrieval system
  • Production-style AWS deployment

Planned Improvements

  • GitHub OAuth
  • Private repository support
  • Agent workflows
  • Code graph analysis
  • Architecture diagram generation
  • Streaming responses
  • Multi-repository workspaces
  • Incremental re-indexing
  • Authentication and rate limiting

Tech Stack

Frontend:

  • Next.js
  • TypeScript

Backend:

  • FastAPI
  • Python

Infrastructure:

  • AWS Lambda
  • API Gateway
  • SQS
  • CloudFront
  • S3
  • RDS PostgreSQL
  • pgvector
  • AWS SAM

AI / Search:

  • OpenAI API
  • Vector embeddings
  • Semantic retrieval

Deployment

Fully deployed on AWS using Infrastructure-as-Code via AWS SAM and CloudFormation.