spec: Multi-Agent Architecture + Small Business Agent Team#679

Open
kovtcharov wants to merge 4 commits into main from
spec/multi-agent-architecture-clean

@kovtcharov
Collaborator

Summary

Two architecture specifications defining GAIA's evolution from a monolithic ChatAgent to a multi-agent system.

1. Multi-Agent Architecture (docs/spec/multi-agent-architecture.md)

  • GaiaAgent (0.6B on NPU) — personality-driven orchestrator, fun to talk to, narrates progress
  • CodeAgent (4B+) — agent factory, builds domain-specific agents on demand using full GAIA codebase
  • Platform agents (Doc, File, Shell, Web) — standard library, always available
  • Agent MCP Server — all-to-all communication bus with task dependencies, agent spawning, agents-as-tools
  • Shared memory — per-agent namespaces, read-any/write-own, event-based notifications, typed schemas
  • Agent UI — multi-agent management platform with per-task windows, human-in-the-loop feedback
  • Production hardening — kill criteria, semantic checkpointing, deadlock detection, confidence-based HITL, memory slicing, per-agent context budgets

Based on research across CrewAI, LangGraph, AutoGen, OpenAI Agents SDK, Google ADK, Anthropic Agent Teams, AgentSpawn, and NVIDIA ToolOrchestra.
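
To make the read-any/write-own rule for shared memory concrete, here is a minimal sketch. It assumes a single SQLite table keyed by namespace and key; the `SharedMemory` class, the table layout, and the agent and key names in the usage note are illustrative assumptions, not taken from the spec.

```python
import sqlite3
from typing import Optional


class SharedMemory:
    """Hypothetical read-any/write-own key-value store, one namespace per agent."""

    def __init__(self, conn: sqlite3.Connection, agent: str) -> None:
        self.conn = conn
        self.agent = agent  # the only namespace this handle is allowed to write to
        conn.execute(
            "CREATE TABLE IF NOT EXISTS memory ("
            "namespace TEXT NOT NULL, key TEXT NOT NULL, value TEXT NOT NULL, "
            "PRIMARY KEY (namespace, key))"
        )

    def write(self, key: str, value: str) -> None:
        # Write-own: the namespace is fixed to the calling agent, never a parameter.
        self.conn.execute(
            "INSERT OR REPLACE INTO memory (namespace, key, value) VALUES (?, ?, ?)",
            (self.agent, key, value),
        )
        self.conn.commit()

    def read(self, namespace: str, key: str) -> Optional[str]:
        # Read-any: any agent may read any namespace.
        row = self.conn.execute(
            "SELECT value FROM memory WHERE namespace = ? AND key = ?",
            (namespace, key),
        ).fetchone()
        return row[0] if row else None
```

With two handles on the same connection, a `ComplianceAgent` handle can call `read("FinanceAgent", "fiscal_year_end")` but has no way to overwrite that entry, because `write` takes no namespace argument.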

2. Small Business Agent Team (docs/spec/small-business-agent-team.md)

First application built on the multi-agent architecture — proof that the platform works for complex, real-world, multi-agent workflows.

  • GaiaAgent interviews user about their business
  • CodeAgent builds tailored team (FormationAgent, ComplianceAgent, FinanceAgent, etc.)
  • Dynamic team assembly — not templates
  • All agents fully autonomous with scheduled execution
  • Task dependencies, inter-agent communication, shared workspace

Related issues: #674, #675, #676, #677, #616, #666, #667, #668, #612, #542, #543

Test plan

  • Specs only — no code changes
  • Architecture review: consistency with existing GAIA patterns
  • Feasibility review: 0.6B GaiaAgent, CodeAgent building agents

Multi-Agent Architecture (docs/spec/multi-agent-architecture.md):
- GaiaAgent (0.6B NPU) as personality-driven orchestrator
- CodeAgent as agent factory — builds domain-specific agents on demand
- Platform agents (Doc, File, Shell, Web) as standard library
- Agent MCP Server for all-to-all communication with task dependencies
- Shared memory with per-agent namespaces (read-any, write-own)
- Agent UI as multi-agent management platform with full observability
- Production hardening: kill criteria, semantic checkpointing, typed schemas,
  confidence-based HITL, memory slicing, deadlock detection, event notifications

Small Business Agent Team (docs/spec/small-business-agent-team.md):
- First application built on the multi-agent architecture
- GaiaAgent interviews user, CodeAgent builds tailored team
- Dynamic team assembly (not templates)
- Task dependencies, inter-agent communication, shared workspace

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions bot added the documentation Documentation changes label Mar 30, 2026
kovtcharov and others added 2 commits March 30, 2026 14:55
Simplifications:
- Agent-to-agent comms: shared SQLite state, not MCP protocol.
  MCP stays for external tools/clients. Internal = just DB reads/writes.
- Two agent tools (create_task + ask_agent), not four overlapping variants
- Drop confidence scoring (0.6B can't self-assess reliably). Use
  deterministic tool-level authorization instead.
- Drop Pydantic schemas for inter-agent data. Agents return plain text,
  critical facts stored as key-value pairs in memory.
- Drop event-based memory subscriptions. Agents read on task start.
- Domain guardrails are prompt instructions, not a DomainGuardrails dataclass.
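
A rough sketch of what the two-tool surface over shared SQLite state could look like. Only the tool names `create_task` and `ask_agent` come from this commit; the table schema, the signatures, the `runnable_tasks` helper, and the in-process agent registry are assumptions made for illustration.

```python
import sqlite3
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical task table; the spec may use a different layout.
SCHEMA = """
CREATE TABLE IF NOT EXISTS tasks (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    owner TEXT NOT NULL,
    description TEXT NOT NULL,
    depends_on INTEGER REFERENCES tasks(id),
    status TEXT NOT NULL DEFAULT 'pending',
    result TEXT
);
"""


def create_task(conn: sqlite3.Connection, owner: str, description: str,
                depends_on: Optional[int] = None) -> int:
    """Queue work for another agent by inserting a row; returns the task id."""
    cur = conn.execute(
        "INSERT INTO tasks (owner, description, depends_on) VALUES (?, ?, ?)",
        (owner, description, depends_on),
    )
    conn.commit()
    return cur.lastrowid


def ask_agent(agents: Dict[str, Callable[[str], str]], name: str, question: str) -> str:
    """Synchronous question to a named agent; the answer is plain text."""
    return agents[name](question)


def runnable_tasks(conn: sqlite3.Connection, owner: str) -> List[Tuple[int, str]]:
    """Pending tasks for `owner` whose dependency, if any, is already done."""
    return conn.execute(
        "SELECT t.id, t.description FROM tasks t "
        "LEFT JOIN tasks d ON d.id = t.depends_on "
        "WHERE t.owner = ? AND t.status = 'pending' "
        "AND (t.depends_on IS NULL OR d.status = 'done') "
        "ORDER BY t.id",
        (owner,),
    ).fetchall()
```

In this sketch an agent polls `runnable_tasks` when it starts work, which matches the "agents read on task start" simplification: there is no subscription mechanism, only reads and writes against the database.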

Kept:
- Kill criteria (8 max iterations, stuck detection, deadlock detection)
- Semantic checkpointing (critical for scheduled agents)
- Memory slicing on spawn (context efficiency)
- Per-agent context budgets
- Preference learning from user corrections
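
The kill criteria could be sketched roughly as follows. The 8-iteration cap is from this commit; treating "stuck" as the same output repeated three times in a row, and treating deadlock as a cycle in a task wait-for graph, are assumptions chosen for illustration, as are all function names.

```python
from typing import Callable, Dict, List, Optional

MAX_ITERATIONS = 8  # from the commit message; the other thresholds are assumptions


def run_with_kill_criteria(step: Callable[[int], str],
                           is_done: Callable[[str], bool],
                           stuck_after: int = 3) -> str:
    """Run `step` until it finishes, repeats itself, or hits the iteration cap."""
    last: Optional[str] = None
    repeats = 0
    for i in range(MAX_ITERATIONS):
        out = step(i)
        if is_done(out):
            return "done"
        # Count consecutive identical outputs as a sign the agent is looping.
        repeats = repeats + 1 if out == last else 1
        if repeats >= stuck_after:
            return "stuck"
        last = out
    return "max_iterations"


def find_deadlock(waits_on: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle in the wait-for graph (depth-first search), or None."""
    WHITE, GREY, BLACK = 0, 1, 2
    color: Dict[str, int] = {}
    stack: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GREY
        stack.append(node)
        for nxt in waits_on.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GREY:  # back edge: nxt is already on the current path
                return stack[stack.index(nxt):] + [nxt]
            if state == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        stack.pop()
        color[node] = BLACK
        return None

    for node in list(waits_on):
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None
```

For example, `find_deadlock({"a": ["b"], "b": ["c"], "c": ["a"]})` reports the path `a, b, c, a`, which a scheduler could surface to the user instead of waiting indefinitely.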

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…aling

Model upgrade:
- GaiaAgent: 0.6B → Qwen3.5-4B (realistic for orchestration + personality)
- All platform agents share same 4B base with LoRA adapters (~8GB total)
- 16K context per agent (vs 4K for 0.6B)

New sections:
- Adaptability (§12): how agents handle changing circumstances
  (context changes via memory, spawn new agents, rebuild for pivots)
- Reliability (§13): honest constraints of small LLMs with mitigations
  (output validation, retry with fallback, constrained format, graceful degradation)
- Known Limitations (§14): single-user, no real-time streams, cold start, auth gaps
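
One possible shape for the output validation, retry, fallback, and graceful degradation chain named under Reliability. This sketch assumes the constrained format is a JSON object with required keys and that models are plain callables; neither detail is confirmed by the spec, and all names here are hypothetical.

```python
import json
from typing import Callable, Optional, Sequence


def validate(raw: str, required_keys: Sequence[str]) -> Optional[dict]:
    """Accept only a JSON object containing every required key."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or any(k not in data for k in required_keys):
        return None
    return data


def call_with_fallback(models: Sequence[Callable[[str], str]],
                       prompt: str,
                       required_keys: Sequence[str],
                       retries_per_model: int = 2) -> dict:
    """Try each model in order, retrying on invalid output, then degrade."""
    for model in models:
        for _ in range(retries_per_model):
            result = validate(model(prompt), required_keys)
            if result is not None:
                return result
    # Graceful degradation: report the failure instead of passing bad output on.
    return {"status": "degraded", "reason": "no model produced valid output"}
```

Ordering `models` from cheapest to most capable keeps the common case fast while still giving the task a second chance before it degrades.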

Memory scaling:
- No summarization — information loss is unacceptable
- Memory IS the context — database is unlimited, context window is a search view
- FTS5 retrieval loads only task-relevant memories into context
- Nothing deleted, compressed, or lost
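
A minimal sketch of FTS5-backed retrieval, where the full memory stays in the database and only task-relevant rows are loaded into context. It assumes the local SQLite build includes the FTS5 extension; the table name, columns, and helper functions are illustrative assumptions.

```python
import re
import sqlite3
from typing import List


def open_memory(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    # `agent` is stored but not indexed, so matches come from content only.
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS memories "
        "USING fts5(agent UNINDEXED, content)"
    )
    return conn


def remember(conn: sqlite3.Connection, agent: str, content: str) -> None:
    """Append a memory; nothing is ever updated or deleted in this sketch."""
    conn.execute("INSERT INTO memories (agent, content) VALUES (?, ?)", (agent, content))
    conn.commit()


def recall(conn: sqlite3.Connection, task: str, limit: int = 5) -> List[str]:
    """Return the best-ranked memories that share at least one word with the task."""
    words = re.findall(r"\w+", task)
    if not words:
        return []
    # Quote each word so it is treated as a literal term, not as FTS5 syntax.
    query = " OR ".join('"{}"'.format(w) for w in words)
    rows = conn.execute(
        "SELECT content FROM memories WHERE memories MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()
    return [r[0] for r in rows]
```

The `limit` parameter is what ties retrieval to a per-agent context budget: the database can grow without bound while each task sees only its top few matches.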

Simplifications kept:
- 2 agent tools (create_task + ask_agent)
- Shared SQLite for inter-agent comms (not MCP)
- Guardrails as prompt instructions (not dataclass)
- Deterministic authorization (not confidence scoring)
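
Deterministic tool-level authorization can be as simple as a static lookup, sketched below. The agent names follow the platform agents listed in the summary, but the tool names, the approval set, and the three-way result are hypothetical.

```python
from typing import Dict, FrozenSet

# Hypothetical per-agent allowlists; the spec does not enumerate tools.
ALLOWED_TOOLS: Dict[str, FrozenSet[str]] = {
    "DocAgent": frozenset({"read_file", "search_docs"}),
    "ShellAgent": frozenset({"run_command"}),
}

# Hypothetical set of tools that always require a human decision.
NEEDS_APPROVAL: FrozenSet[str] = frozenset({"run_command"})


def authorize(agent: str, tool: str) -> str:
    """Return 'allow', 'ask_user', or 'deny' from static tables only.

    No model output is consulted, so the same request always gets the same answer.
    """
    if tool not in ALLOWED_TOOLS.get(agent, frozenset()):
        return "deny"
    return "ask_user" if tool in NEEDS_APPROVAL else "allow"
```

Because the decision depends only on the agent and tool names, it stays predictable regardless of how well a small model can judge its own confidence.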

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Removed:
- "Why X?" sections that argue for self-evident decisions
- Repeated CodeAgent examples (3 examples saying the same thing → 1 table)
- Pydantic schemas, pub/sub subscriptions, confidence scoring, DomainGuardrails dataclass
- Over-justified MCP rationale, template arguments, resilience motivation

Added back (lost in initial cut):
- Live progress narration example (core UX)
- Collective intelligence examples (cross-agent memory)
- Agent UI ASCII wireframe (helps engineers visualize)
- Retry/fallback chain (critical reliability detail)
- Observability section with traceability detail (tool calls, inter-agent
  messages, dependencies, reasoning, memory writes all visible to user)
- Conversational approval flow example

Every remaining line either defines architecture, shows a concrete example,
or documents a constraint. No justification bloat.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment thread docs/spec/small-business-agent-team.md
Collaborator

@antmikinka antmikinka left a comment

The 'multi-agent-architecture.md' and 'small-business-agent-team.md' documents sound good. I like the concepts.

@kovtcharov kovtcharov enabled auto-merge April 20, 2026 05:35

Labels

documentation Documentation changes


4 participants