NeuroGraph LIVE transforms how students learn by converting fragmented textbook knowledge into interactive 3D mind maps with a real-time AI tutor that can see, hear, and teach, all deployable in VR with just a Google Cardboard headset.
Manim Animation Engine: AI generates and narrates custom educational videos in real time
Intelligent Loading: hand gesture tips while the AI builds your graph
Many students suffer from "fragmented learning": collecting facts without understanding how concepts interconnect. Traditional chatbots provide linear answers, but real understanding needs spatial context.
A spatial knowledge navigator that creates a visual digital twin of knowledge, where:
| Connect the Dots | Learn by Speaking | Immerse Yourself |
|---|---|---|
| See literal links between "Neural Networks" → "Gradient Descent" → "Backpropagation" | Talk naturally and show textbook pages to your AI tutor via camera | Enter VR mode and navigate your knowledge graph with hand gestures |
Concepts are semantically clustered using Vertex AI embeddings and projected into 2D/3D space using UMAP dimensionality reduction. Click any node to explore its connections.
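To make the clustering idea concrete, here is a minimal sketch of how embedding similarity could decide which concepts get linked. It is not the project's code: the three-dimensional vectors are made-up stand-ins for real text-embedding-004 output, the 0.8 threshold and both function names are assumptions, and the UMAP projection step is left out entirely.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_edges(embeddings, threshold=0.8):
    """Return (concept_a, concept_b, score) for pairs at or above threshold."""
    edges = []
    for (name_a, vec_a), (name_b, vec_b) in combinations(embeddings.items(), 2):
        score = cosine_similarity(vec_a, vec_b)
        if score >= threshold:
            edges.append((name_a, name_b, round(score, 3)))
    return edges


# Toy vectors for illustration only; real embeddings have hundreds of dimensions.
toy_embeddings = {
    "Neural Networks": [0.9, 0.1, 0.2],
    "Backpropagation": [0.8, 0.2, 0.3],
    "Photosynthesis": [0.1, 0.9, 0.1],
}

edges = similarity_edges(toy_embeddings)
```

With these toy values only the "Neural Networks" and "Backpropagation" pair clears the threshold, so the unrelated concept stays unlinked. A dimensionality reduction step such as UMAP would then place the linked concepts near each other on screen.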
Desktop View: graph with semantic clustering, text input, and file upload
Node Exploration: click to highlight connections with color-coded edges
The core of NeuroGraph is a bidirectional streaming connection to Gemini's Multimodal Live API. The AI tutor can simultaneously:
- Hear you: real-time speech recognition via native audio streaming
- See you: camera frames sent as visual "heartbeats" every 2 seconds
- Read your textbook: point your camera at a page and the AI extracts and maps concepts
- Build your graph: proactively appends new nodes and edges to your existing knowledge map
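The 2-second visual heartbeat described above could be driven by a small asyncio loop. The sketch below is an assumption about how such a loop might look, not the project's implementation: `capture_frame` and `send_frame` are hypothetical callables standing in for the camera and the Live API connection, and `max_beats` exists only so the example terminates.

```python
import asyncio


async def visual_heartbeat(capture_frame, send_frame, interval=2.0, max_beats=None):
    """Capture a frame and forward it every `interval` seconds.

    capture_frame: hypothetical callable returning encoded image bytes or None.
    send_frame: hypothetical async callable that forwards bytes to the model.
    max_beats: stop after this many sent frames (None runs until cancelled).
    """
    sent = 0
    while max_beats is None or sent < max_beats:
        frame = capture_frame()
        if frame is not None:  # skip this beat if the camera had nothing ready
            await send_frame(frame)
            sent += 1
        await asyncio.sleep(interval)
    return sent


async def _demo():
    received = []

    async def fake_send(frame):
        received.append(frame)

    # A short interval keeps the demo fast; the text above uses 2 seconds.
    count = await visual_heartbeat(
        lambda: b"jpeg-bytes", fake_send, interval=0.01, max_beats=3
    )
    return count, received


count, received = asyncio.run(_demo())
```

Running the loop as its own task keeps frame delivery independent of the audio stream, so a slow camera read does not stall speech.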
The AI tutor analyzes camera input, generates a Manim animation for "Neural Networks," and narrates it live.
VR Features:
- Google Cardboard compatible stereoscopic rendering
- Hand gesture navigation (swipe through nodes without controllers)
- Camera-based hand detection using the device's front camera
- VR-optimized overlays for quizzes, videos, and mind maps
Ask "Show me how this works" and the pipeline runs as follows:
- Gemini 2.5 Pro writes a custom Manim (Python) animation script
- The backend renders the animation to MP4 in real time
- The video is streamed back and narrated live by the AI tutor
- The MP4 is stored in Google Cloud Storage for future playback
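Because the animation script is model-generated, a backend would plausibly want to check it before rendering. The source does not say how this is done, so the sketch below is only one possible approach: it parses the script with the standard `ast` module, finds the scene class, and builds a Manim CLI command. The function names, the file name, and the sample script are all hypothetical, and the command is built but never executed.

```python
import ast


def find_scene_classes(source):
    """Return names of classes that directly subclass a base ending in 'Scene'.

    Raises SyntaxError if the generated script does not parse.
    """
    tree = ast.parse(source)
    names = []
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            for base in node.bases:
                # Handles both `Scene` (ast.Name) and `manim.Scene` (ast.Attribute).
                base_name = getattr(base, "id", None) or getattr(base, "attr", None)
                if base_name and base_name.endswith("Scene"):
                    names.append(node.name)
                    break
    return names


def build_render_command(script_path, scene_name, quality="l"):
    """Build a Manim CLI invocation: manim -q<quality> <file> <SceneName>."""
    return ["manim", f"-q{quality}", script_path, scene_name]


# Stand-in for a script returned by the model (illustrative only).
generated_script = '''
from manim import Scene, Circle, Create

class WeightUpdate(Scene):
    def construct(self):
        self.play(Create(Circle()))
'''

scenes = find_scene_classes(generated_script)
command = build_render_command("weight_update.py", scenes[0])
```

The resulting list could be handed to `subprocess.run` once the script has been written to disk. Rejecting scripts that fail to parse or contain no scene class avoids spending render time on output that cannot succeed.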
NeuroGraph's AI tutor, powered by Gemini, supports multilingual conversations:
- Speak in any language: the tutor understands and responds naturally
- Generate concept graphs in your preferred language
- Break language barriers in education, making it accessible worldwide
Architecture Breakdown
| Layer | Technology | Purpose |
|---|---|---|
| Edge Layer | Google Cloud Load Balancer | Traffic distribution & SSL termination |
| Compute Layer | Cloud Run (Auto-scaling 0 to 3) | Serverless containers for frontend & backend |
| DevOps Layer | Docker + Artifact Registry + gcloud CLI | CI/CD pipeline & container management |
| AI Intelligence | Gemini 2.5 Flash (Live API) | Real-time bidirectional audio/vision streaming |
| AI Reasoning | Gemini 2.5 Pro | Complex Manim script generation |
| Vector Search | Vertex AI text-embedding-004 | Semantic embedding for concept clustering |
| Data Layer | Google Cloud Storage + ChromaDB | MP4 storage & vector database |
| Frontend | React + Vite + D3.js | Interactive 3D knowledge graph UI |
| Backend | Python + FastAPI + WebSocket | Multimodal Hub & Manim engine |
This project is built end-to-end on the Google Cloud AI ecosystem:
AI MODELS
- Gemini 2.5 Flash (Multimodal Live API): real-time bidirectional audio + vision streaming
- Gemini 2.5 Pro: high-reasoning Manim animation generation
- text-embedding-004 (Vertex AI): semantic vector embeddings for concept clustering

INFRASTRUCTURE
- Cloud Run: serverless containers (frontend + backend with auto-scaling)
- Artifact Registry: Docker image management
- Cloud Storage: persistent MP4 video storage
- Secret Manager: secure API key management
- Cloud Build: automated container builds

APPLICATION
- Frontend: React + Vite + D3.js + WebRTC
- Backend: Python + FastAPI + WebSocket + Manim
- Database: ChromaDB (Vector Store)
Both frontend and backend are live on Google Cloud Run with full observability. Here is proof of our production deployment:
Both frontend and backend services are healthy and running in us-central1
WebSocket connections to Gemini Live successfully established
Request count, latency, and end-to-end performance metrics
Nginx serving the React SPA with proper routing
Sub-100ms latency serving static assets via Cloud Run CDN
Vertex AI API key configured for Gemini 2.5 Flash & Pro models
| # | Insight | Impact |
|---|---|---|
| 1 | Sub-second visual-to-audio latency with Gemini 2.5 Flash Live API | Makes the AI tutor feel human, which is critical for engagement |
| 2 | Visual heartbeats (2-sec camera snapshots) build context proactively | AI suggests learning paths without the user asking "What is this?" |
| 3 | Spatial mapping via UMAP of embeddings reduces cognitive load | Learners visualize conceptual "distance", solving the "where was I?" problem |
| 4 | Manim + Gemini Pro combo enables on-the-fly educational animations | No pre-rendered content needed: every explanation is unique |
| 5 | Hand gesture detection via camera makes VR accessible | No expensive controllers, just a phone and Google Cardboard |
Experience the full depth of NeuroGraph LIVE with these multimodal prompts:
"I'm looking at this page about Gradient Descent. Can you read this and add it to our map?"
What happens: The AI Tutor analyzes your camera feed, extracts key concepts, and calls create_mind_map. It explains the new nodes and their connections to your existing graph.
"Can you explain this mind map in full depth? Show me how 'Backpropagation' connects to 'Neural Networks' and why it matters."
What happens: The AI Tutor performs a semantic deep-dive, describing relationships and links qualitatively, helping you synthesize the entire topic.
"This concept is abstract. Can you generate an animation explaining how the weights are updated?"
What happens: The AI triggers the Manim Engine via generate_video. You receive a custom educational animation with live narration from your tutor.
"¿Puedes explicarme las redes neuronales en español?" (Spanish: "Can you explain neural networks to me in Spanish?") / "নিউরাল নেটওয়ার্ক সম্পর্কে বাংলায় বলো" (Bengali: "Tell me about neural networks in Bengali")
What happens: The AI responds fluently in the requested language, generating concept maps with localized labels.
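The prompts above mention two tool names, create_mind_map and generate_video. One common way to wire such tools is a dispatch table that routes each model-issued call to a handler. The sketch below illustrates that pattern only: the handler signatures, the argument names, the graph structure, and the return values are assumptions, not the project's actual API.

```python
def create_mind_map(graph, concepts, links):
    """Hypothetical handler: merge new concepts and links into the graph."""
    graph["nodes"].update(concepts)
    for source, target in links:
        # Keep only links whose endpoints exist; the set ignores duplicates.
        if source in graph["nodes"] and target in graph["nodes"]:
            graph["edges"].add((source, target))
    return {"added_nodes": sorted(concepts), "edge_count": len(graph["edges"])}


def generate_video(graph, topic):
    """Hypothetical handler: acknowledge an animation request for a topic."""
    return {"status": "queued", "topic": topic}


TOOL_HANDLERS = {
    "create_mind_map": create_mind_map,
    "generate_video": generate_video,
}


def dispatch_tool_call(graph, name, args):
    """Route a tool call to its handler, rejecting unknown tool names."""
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return {"error": f"unknown tool: {name}"}
    return handler(graph, **args)


graph = {"nodes": {"Neural Networks"}, "edges": set()}
result = dispatch_tool_call(
    graph,
    "create_mind_map",
    {
        "concepts": ["Gradient Descent"],
        "links": [("Neural Networks", "Gradient Descent")],
    },
)
```

Merging into the existing node set, rather than replacing it, matches the behavior described earlier where new concepts are appended to the current map.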
- Python 3.12+, Node.js 20+, FFmpeg (for Manim)
- A Google Cloud project with Gemini API access
git clone https://github.com/your-repo/neurograph-live.git
cd neurograph-live
# Backend
cd app
pip install -r requirements.txt
# Create .env with your GEMINI_API_KEY
echo "GEMINI_API_KEY=your_key_here" > .env
uvicorn main:app --reload
# Frontend (from the repository root, in a second terminal)
cd frontend
npm install
npm run dev
We provide a fully automated deployment script that handles everything:
chmod +x deploy-gcp.sh
./deploy-gcp.sh
The script automatically:
- Enables required Google APIs (Cloud Run, Cloud Build, Artifact Registry)
- Creates an Artifact Registry Docker repository
- Builds and deploys the Backend to Cloud Run
- Injects the Backend URL into VITE_API_URL
- Builds and deploys the Frontend to Cloud Run
- Returns your live Frontend URL
neurograph-live/
├── app/                      # Python Backend (FastAPI)
│   ├── main.py               # WebSocket server & API routes
│   ├── gemini_live_agent.py  # Gemini 2.5 Flash Live integration
│   ├── manim_generator.py    # Dynamic Manim animation engine
│   ├── requirements.txt      # Python dependencies
│   ├── Dockerfile            # Backend container
│   └── .gcloudignore
├── frontend/                 # React Frontend (Vite)
│   ├── src/
│   │   ├── App.jsx           # Main application
│   │   ├── components/       # UI components
│   │   └── index.css         # Styles
│   ├── Dockerfile            # Frontend container
│   └── package.json
├── Img/                      # Screenshots & assets
├── GCP/                      # GCP deployment proof
├── deploy-gcp.sh             # One-click deployment script
├── docker-compose.yml        # Local multi-service setup
└── LICENSE                   # MIT License
This project is licensed under the MIT License. See the LICENSE file for details.
NeuroGraph LIVE
Moving education from static chat to immersive spatial exploration.
Built with ❤️ using Gemini 2.5 & Google Cloud
for the Gemini Live Agent Challenge












