An advanced decentralized multi-agent orchestration framework built using Google's Agent Development Kit (ADK) and the Agent-to-Agent (A2A) protocol. This system coordinates a team of specialized AI microservice agents to research, judge, and compile comprehensive course modules based on user requests.
It uses a hybrid model setup: Gemini handles planning and judging, where long-context reasoning matters most, while a self-hosted Gemma model (served by Ollama on Google Cloud Run with a GPU) handles content generation.
This project is built as a set of containerized microservices running on Google Cloud Run, communicating securely using service-to-service authentication.
graph TD
User([User Web UI]) <--> |HTTP / WebSockets| Orch[Orchestrator Service]
subgraph Multi-Agent Collaboration Loop
Orch <--> |A2A Protocol| Research[Researcher Agent <br> Gemini 3 Flash + Google Search]
Orch <--> |A2A Protocol| Judge[Judge Agent <br> Gemini 3 Flash]
Research -.-> |Returns Findings| Orch
Orch -.-> |Evaluates Findings| Judge
Judge -.-> |Pass / Fail Feedback| Orch
end
Orch ----> |Generate Content| CB[Content Builder Agent <br> Gemma 3 via LiteLLM]
CB <--> |Local Network| Ollama[Ollama GPU Backend <br> Nvidia L4 GPU / Cloud Run]
The system is composed of 6 microservices:
- Frontend App (course-creator): User-facing UI to submit topics and view live agent execution.
- Orchestrator Service (orchestrator): Main engine that manages the sequential/loop flow using ADK's LoopAgent.
- Researcher Agent (researcher): Searches Google for information on the topic and compiles details.
- Judge Agent (judge): Strictly evaluates the researcher's findings against quality criteria, providing structured feedback.
- Content Builder Agent (content_builder): Transforms the approved research into markdown course modules using a self-hosted Gemma 3 model.
- Ollama GPU Backend (ollama-gemma-gpu): Serves Gemma 3 locally inside Cloud Run with dedicated NVIDIA L4 GPU acceleration.
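The Judge is described as returning structured feedback, so the Orchestrator needs some way to read a verdict reliably. The following is a minimal sketch of one way to do that, not the project's actual schema: the `JudgeVerdict` class, the `parse_verdict` helper, and the `status` / `feedback` field names are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class JudgeVerdict:
    """Hypothetical shape for the Judge's structured feedback."""
    status: str    # "pass" or "fail"
    feedback: str  # what the Researcher should address on the next round


def parse_verdict(raw: str) -> JudgeVerdict:
    """Parse a JSON reply from the Judge; anything malformed counts as a fail."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return JudgeVerdict("fail", "Judge reply was not valid JSON.")
    if not isinstance(data, dict):
        return JudgeVerdict("fail", "Judge reply was not a JSON object.")
    # Normalize case and whitespace so "PASS " and "pass" are treated alike.
    status = str(data.get("status", "")).strip().lower()
    if status not in ("pass", "fail"):
        return JudgeVerdict("fail", f"Unrecognized status: {status!r}")
    return JudgeVerdict(status, str(data.get("feedback", "")))
```

Treating malformed replies as a fail keeps the loop conservative: unreadable output triggers another research round rather than passing unreviewed findings to the Content Builder.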
Here is the exact step-by-step lifecycle of a course generation task:
[ User Input ] ──> ( Course Creator UI )
│
▼
( Orchestrator Service )
│
┌───────────────────┴───────────────────┐
▼ ▼
[ Start Loop ] [ Max Iterations Met or Pass ]
│ │
▼ ▼
( Researcher Agent ) ( Content Builder Agent )
├── Gathers Google Search data ├── Reads approved research
└── Sends findings to Orchestrator └── Generates course using Gemma 3
│ │
▼ ▼
( Judge Agent ) ( Finished Course Output )
├── Inspects findings completeness └── Delivered back to User
└── Outputs: 'pass' or 'fail' + feedback
│
└─► If 'fail' ──► Loop starts again
- User Initiation: The user enters a topic in the UI (e.g., "Introduction to Rust Programming").
- Orchestrator Kickoff: The Orchestrator spins up the collaboration loop.
- Information Gathering: The Researcher uses the Google Search tool to find relevant articles and outlines.
- Quality Assurance: The Judge evaluates the research. If details are missing or shallow, the Judge flags a fail and returns constructive feedback.
- Refinement (Loop): The Researcher receives the feedback and initiates a new targeted Google search to fill the gaps. This iterates up to 3 times.
- Writing & Compilation: Once the Judge outputs pass (or the iteration limit is reached, as shown in the diagram above), the research is handed to the Content Builder, which queries the self-hosted Gemma 3 model to write the final detailed course sections.
- Delivery: The completed course is rendered to the user in clean Markdown.
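The lifecycle above can be sketched as plain Python control flow. This is an illustration of the research/judge/build sequence only, not the ADK LoopAgent API: `run_course_pipeline` and its three callables are hypothetical stand-ins for the A2A calls to the Researcher, Judge, and Content Builder.

```python
from typing import Callable, Tuple

MAX_ITERATIONS = 3  # the loop is described as iterating up to 3 times


def run_course_pipeline(
    topic: str,
    research: Callable[[str, str], str],      # (topic, feedback) -> findings
    judge: Callable[[str], Tuple[str, str]],  # findings -> (status, feedback)
    build: Callable[[str], str],              # findings -> course markdown
    max_iterations: int = MAX_ITERATIONS,
) -> str:
    """Research, judge, and retry until 'pass' or the iteration limit."""
    findings = ""
    feedback = ""  # empty on the first round; filled in by the Judge after a fail
    for _ in range(max_iterations):
        findings = research(topic, feedback)
        status, feedback = judge(findings)
        if status == "pass":
            break
    # Per the diagram, the Content Builder runs on pass OR when iterations run out.
    return build(findings)
```

Passing the Judge's feedback back into `research` is what makes each retry targeted rather than a repeat of the same search.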
multi_agent_system/
├── README.md # Project Overview & Workflow (This file)
└── multi-agent-system/ # Active project directory
├── deploy.sh # Cloud Run deploy script for all 6 services
├── run_local.sh # Local development launcher script
├── pyproject.toml # Root dependency configuration (uv)
├── ollama-backend/ # Self-hosted Gemma GPU service Dockerfile
├── app/ # Web Frontend application
├── shared/ # Shared utilities (a2a_utils, adk_app, auth)
└── agents/
├── orchestrator/ # Main coordinator agent
├── researcher/ # Information gatherer agent (Gemini 3 Flash)
├── judge/ # Evaluator/editor agent (Gemini 3 Flash)
└── content_builder/ # Writer agent (self-hosted Gemma 3)
- Python >= 3.10
- uv (Fast Python package manager)
- Google Cloud SDK (For Vertex AI authentication)
- Ollama (For running Gemma locally)
- Clone the repository and navigate to the project directory:

      cd multi-agent-system

- Authenticate Google Application Default Credentials (ADC):

      gcloud auth application-default login

- Install all dependencies:

      uv sync

- Start Ollama and download the Gemma 3 model:

      ollama serve
      # In a new terminal window:
      ollama pull gemma3:270m

- Launch the local multi-agent system:

      ./run_local.sh

- Open your browser and visit http://localhost:8000.
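If the Content Builder fails locally, a quick first check is whether Ollama is running and the model has been pulled. The sketch below queries Ollama's `/api/tags` endpoint, which lists locally available models. It assumes Ollama is listening on its default address, `http://localhost:11434`; the helper names `fetch_tags` and `model_is_available` are illustrative and not part of this repository.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: Ollama's default local address
MODEL = "gemma3:270m"                  # the tag pulled during setup


def model_is_available(tags_payload: dict, model: str = MODEL) -> bool:
    """Return True if `model` appears in an /api/tags response body."""
    names = [m.get("name", "") for m in tags_payload.get("models", [])]
    return model in names


def fetch_tags(base_url: str = OLLAMA_URL, timeout: float = 5.0) -> dict:
    """Ask the local Ollama server which models it has pulled."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)
```

Calling `model_is_available(fetch_tags())` returns `True` once the pull has finished. If the server is not running, `fetch_tags` raises an `OSError` subclass (`urllib.error.URLError`), which usually means `ollama serve` has not been started.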
The system utilizes automated Cloud Build pipelines to deploy all services.
- Set your active Google Cloud project:

      gcloud config set project YOUR_PROJECT_ID

- Enable the required GCP APIs:

      gcloud services enable run.googleapis.com cloudbuild.googleapis.com aiplatform.googleapis.com

- Run the deployment script:

      ./deploy.sh

Note: The script automatically handles service-to-service IAM permissions and environment configuration, outputting the URL of your live Course Creator web app once finished.
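For context on what service-to-service authentication involves at request time: a Cloud Run service typically obtains a Google-signed identity token from the metadata server, with the receiving service's URL as the audience, and sends it as a bearer token. The sketch below only builds the request objects using the standard library. Whether the project's shared auth utilities work this way is an assumption, and both helper names are hypothetical.

```python
import urllib.parse
import urllib.request

# Metadata server endpoint that issues identity tokens for the service's
# default service account (only reachable from inside Google Cloud).
METADATA_IDENTITY_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/identity"
)


def build_identity_token_request(audience: str) -> urllib.request.Request:
    """Build the metadata-server request for a token scoped to `audience`."""
    query = urllib.parse.urlencode({"audience": audience})
    return urllib.request.Request(
        f"{METADATA_IDENTITY_URL}?{query}",
        headers={"Metadata-Flavor": "Google"},  # required by the metadata server
    )


def auth_headers(id_token: str) -> dict:
    """Headers to attach when calling another authenticated Cloud Run service."""
    return {"Authorization": f"Bearer {id_token}"}
```

Inside Cloud Run, passing the request to `urllib.request.urlopen` and reading the response body yields the token string. The audience should be the URL of the service being called, for example the Researcher's Cloud Run URL.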