Evil-Null/copilot-elite-system

Elite Agent System

Production-grade multi-agent orchestration framework for GitHub Copilot CLI.


Overview

A specialized multi-agent system that enforces Principal Software Engineer standards through the GitHub Copilot CLI. Tasks are decomposed and routed to dedicated agents — each with constrained tools, a focused model, and a single responsibility — coordinated by a master orchestrator through defined pipelines.

Key principles:

  • No agent modifies code AND approves it
  • Every production change passes through ≥2 agents
  • Security findings at CRITICAL severity halt all pipelines
  • The operator is always the final authority

Architecture

graph TD
    O["🎯 ORCHESTRATOR<br/><sub>claude-opus-4.6 · coordinator</sub>"]

    O --> A["🏗️ ARCHITECT<br/><sub>opus · R/O</sub>"]
    O --> D["💻 DEVELOPER<br/><sub>codex · R/W/X</sub>"]
    O --> S["🔒 SECURITY<br/><sub>opus · R/X</sub>"]
    O --> Q["🧪 QA<br/><sub>gpt-5.4 · R/W</sub>"]
    O --> R["📊 REVIEWER<br/><sub>opus · R/O</sub>"]
    O --> RE["🔬 RESEARCH<br/><sub>opus · R/Web</sub>"]

    style O fill:#1a1a2e,stroke:#e94560,color:#fff,stroke-width:2px
    style A fill:#16213e,stroke:#0f3460,color:#fff
    style D fill:#16213e,stroke:#0f3460,color:#fff
    style S fill:#16213e,stroke:#0f3460,color:#fff
    style Q fill:#16213e,stroke:#0f3460,color:#fff
    style R fill:#16213e,stroke:#0f3460,color:#fff
    style RE fill:#16213e,stroke:#0f3460,color:#fff

Pipelines:

| Pipeline | Flow | Use Case |
|---|---|---|
| A (Full) | architect → developer → security → qa → reviewer | New features |
| B (Quick) | developer → reviewer | Minor fixes |
| C (Security) | security → developer → security → reviewer | Security-critical |
| D (Research) | research → architect → developer → qa → reviewer | Investigation-first |
| E (Hotfix) | developer → security → reviewer | Emergency fixes |
| F (Refactor) | architect → developer → qa → reviewer | Structural changes |
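
As a rough illustration of the pipeline table, the mapping from pipeline letter to agent sequence can be written as a small shell lookup. The function name `pipeline_agents` is hypothetical and not part of this repository; the actual routing is done by the orchestrator agent.

```shell
# Hypothetical helper: map a pipeline letter to its agent sequence.
# The sequences are copied from the Pipelines table; the function itself
# is an illustration, not something shipped by the project.
pipeline_agents() {
  case "$1" in
    A|a) echo "architect developer security qa reviewer" ;;
    B|b) echo "developer reviewer" ;;
    C|c) echo "security developer security reviewer" ;;
    D|d) echo "research architect developer qa reviewer" ;;
    E|e) echo "developer security reviewer" ;;
    F|f) echo "architect developer qa reviewer" ;;
    *)   echo "unknown pipeline: $1" >&2; return 1 ;;
  esac
}

# Example: list the stages of the hotfix pipeline, one per line.
for agent in $(pipeline_agents E); do
  echo "stage: $agent"
done
```

Running the example prints the three hotfix stages (developer, security, reviewer) in order.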

Agent Registry

| Agent | Model | Role | Write Access |
|---|---|---|---|
| orchestrator | claude-opus-4.6 | Pipeline coordination & delegation | No |
| architect | claude-opus-4.6 | System design & blast radius analysis | No |
| developer | claude-opus-4.6 / gpt-5.3-codex | Implementation & V1-V7 verification (dual-model) | Yes |
| security | claude-opus-4.6 | Vulnerability scanning & STRIDE | No |
| qa | gpt-5.4 | Test creation & edge case coverage | Yes (tests) |
| reviewer | claude-opus-4.6 | Final quality gate & ship decision | No |
| research | claude-opus-4.6 | Deep technical investigation | No |
| general-pro | claude-opus-4.6 | Complex multi-step tasks (replaces built-in) | Yes |
| task-pro | gpt-5.3-codex | Build/test execution with brief output | Yes |
| explore-pro | gpt-5.4 | Codebase exploration & parallel research | No |
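
The registry's Write Access column and the "no agent modifies code AND approves it" principle can be checked mechanically. The sketch below is an assumption about how such a check might look; `agent_can_write` and `check_pipeline` are hypothetical names, and the project itself states these rules in agent instructions rather than in code.

```shell
# Hypothetical helper mirroring the Write Access column of the registry.
# qa is listed because it may write tests.
agent_can_write() {
  case "$1" in
    developer|qa|general-pro|task-pro) return 0 ;;
    *) return 1 ;;
  esac
}

# Check two of the key principles for a pipeline given as arguments:
#   - at least two distinct agents take part
#   - the final (approving) agent has no write access
check_pipeline() {
  seen=" "; count=0; last=""
  for agent in "$@"; do
    case "$seen" in
      *" $agent "*) ;;                                   # already counted
      *) seen="$seen$agent "; count=$((count + 1)) ;;
    esac
    last="$agent"
  done
  if [ "$count" -lt 2 ]; then
    echo "fewer than two distinct agents" >&2; return 1
  fi
  if agent_can_write "$last"; then
    echo "approver '$last' has write access" >&2; return 1
  fi
  echo "ok: $count agents, approver=$last"
}

check_pipeline developer reviewer   # prints: ok: 2 agents, approver=reviewer
```

This covers only the structural part of the rule. Whether an agent actually stays within its declared access still depends on the CLI and on the agent's instructions.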

Skills

User-Owned Skills (14)

| Skill | Type | Purpose |
|---|---|---|
| code-review | Elite Original | 5-Eye review procedure with automated checks |
| security-audit | Elite Original | Scanning + manual checklist + STRIDE threat modeling |
| testing | Elite Original | Testing pyramid, edge case matrix, coverage targets |
| deploy | Elite Original | Pre-flight, deploy, post-deploy, rollback procedures |
| memory | Elite Original | Session/project/decision memory + ADR across sessions |
| cross-validation | Elite Original | Second-pass validation for security-sensitive flows |
| api-design | PHOENIX TIER 1 | REST API design patterns (resource naming, pagination, errors) |
| e2e-testing | PHOENIX TIER 1 | Playwright E2E testing (Page Objects, CI/CD, artifacts) |
| search-first | PHOENIX TIER 1 | Anti-reinvention: search for existing tools before coding |
| codebase-onboarding | PHOENIX TIER 1 | Repo onboarding & architecture exploration guide |
| database-migrations | PHOENIX TIER 1 | Zero-downtime DB migrations (Prisma, Drizzle, Django, etc.) |
| blueprint | PHOENIX TIER 1 | Multi-session plan generator with adversarial review gate |
| security-scan | PHOENIX TIER 1 | Agent config security scan (AgentShield) |
| context-budget | PHOENIX TIER 1 | Token overhead audit across skills, agents, and rules |

Approved ECC Skills — TIER 2 (4)

| Skill | Purpose |
|---|---|
| github-ops | gh CLI workflows |
| iterative-retrieval | Subagent context refinement |
| benchmark | Performance baselining |
| strategic-compact | Compaction timing strategy |

Total: 18 skills available (14 user-owned + 4 TIER 2 from ECC). Governance: config/skill-governance.md

Quick Start

Installation

# Clone
git clone git@github.com:Evil-Null/copilot-elite-system.git
cd copilot-elite-system

# Install to ~/.copilot/
bash install.sh

# Verify
bash ~/.copilot/scripts/memory.sh status
bash ~/.copilot/scripts/guardrails.sh report

Manual Installation

# Create directories
mkdir -p ~/.copilot/{agents,scripts,templates}
mkdir -p ~/.copilot/skills/{code-review,security-audit,testing,deploy,memory,cross-validation,api-design,e2e-testing,search-first,codebase-onboarding,database-migrations,blueprint,security-scan,context-budget}
mkdir -p ~/.copilot/memory/{projects,decisions,sessions}

# Copy files
cp agents/* ~/.copilot/agents/
cp -r skills/* ~/.copilot/skills/
cp scripts/* ~/.copilot/scripts/
cp templates/* ~/.copilot/templates/
cp memory/README.md ~/.copilot/memory/
cp copilot-instructions.md ~/.copilot/

# Set permissions
chmod +x ~/.copilot/scripts/*.sh

First Run

# Initialize memory for your project
bash ~/.copilot/scripts/memory.sh init my-project

# Run through the orchestrator (recommended)
copilot --agent orchestrator -p "Add rate limiting middleware. Use Pipeline A."

# Safe autopilot
bash ~/.copilot/scripts/autopilot.sh --agent developer "Implement feature X"

⚠️ Direct invocation (copilot --agent <name>) bypasses pipeline governance. Use orchestrator for production work.
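
One way to make the orchestrator route the default is a small wrapper that always targets the orchestrator and names the pipeline explicitly. This is a sketch under stated assumptions: `elite_cmd` is a hypothetical name, and the only CLI flags used are the `--agent` and `-p` flags already shown in First Run. It prints the command for inspection instead of running it.

```shell
# Hypothetical wrapper: build an orchestrator invocation for a given pipeline.
# It only prints the command so it can be reviewed before being run.
# Task text containing double quotes would need extra escaping.
elite_cmd() {
  if [ "$#" -lt 2 ]; then
    echo "usage: elite_cmd <A-F> <task description>" >&2
    return 2
  fi
  pipeline="$1"; shift
  case "$pipeline" in
    A|B|C|D|E|F) ;;
    *) echo "unknown pipeline: $pipeline" >&2; return 2 ;;
  esac
  printf 'copilot --agent orchestrator -p "%s Use Pipeline %s."\n' "$*" "$pipeline"
}

elite_cmd B "Fix the typo in the usage text."
# prints: copilot --agent orchestrator -p "Fix the typo in the usage text. Use Pipeline B."
```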

Automation & Safety

Autopilot Guardrails

| Protection | Default | Override |
|---|---|---|
| Max steps | 15 | --max-steps N |
| Max files changed | 30 | --max-files N |
| Git push | Blocked | --allow-push |
| File deletion | Blocked | --allow-delete |
| git reset --hard | Always blocked | No override |
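
To make the defaults and overrides concrete, here is a minimal sketch of how such guardrail options could be parsed and applied. It is not the code in autopilot.sh, which may work differently; `parse_guardrails` and `git_allowed` are hypothetical names, and only the flags and defaults come from the table.

```shell
# Hypothetical option parser using the defaults from the guardrails table.
parse_guardrails() {
  max_steps=15; max_files=30; allow_push=no; allow_delete=no
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --max-steps|--max-files)
        [ "$#" -ge 2 ] || { echo "$1 needs a value" >&2; return 2; }
        if [ "$1" = --max-steps ]; then max_steps="$2"; else max_files="$2"; fi
        shift 2 ;;
      --allow-push)   allow_push=yes; shift ;;
      --allow-delete) allow_delete=yes; shift ;;
      *) break ;;   # stop at the first non-guardrail argument
    esac
  done
  echo "steps=$max_steps files=$max_files push=$allow_push delete=$allow_delete"
}

# Decide whether a git subcommand (arguments after "git") is permitted.
git_allowed() {
  case "$*" in
    *"reset --hard"*) return 1 ;;                 # always blocked, no override
    push*)            [ "$allow_push" = yes ] ;;  # blocked unless --allow-push
    *)                return 0 ;;
  esac
}

parse_guardrails --max-steps 5 --allow-push
# prints: steps=5 files=30 push=yes delete=no
```

The hard block on `git reset --hard` is checked before anything else so that no combination of flags can enable it.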

CI Templates

Copy to .github/workflows/ in any repository:

| Template | Trigger | Purpose |
|---|---|---|
| copilot-review.yml | PR | Automated code review |
| copilot-security.yml | PR + weekly | Security audit |
| copilot-quality.yml | PR + push | Quality gate |
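
The copy step might look like the following sketch. It assumes the workflow templates end up in ~/.copilot/templates (the directory that the manual installation copies templates/* into); that location and the function name `install_ci_templates` are assumptions, so adjust the source path if the templates live elsewhere.

```shell
# Hypothetical helper: copy the three workflow templates into a repository.
#   $1 = template directory (assumed default: ~/.copilot/templates)
#   $2 = repository root (default: current directory)
install_ci_templates() {
  src="${1:-$HOME/.copilot/templates}"
  repo="${2:-.}"
  mkdir -p "$repo/.github/workflows" || return 1
  for name in copilot-review.yml copilot-security.yml copilot-quality.yml; do
    if [ -f "$src/$name" ]; then
      cp "$src/$name" "$repo/.github/workflows/$name" && echo "installed $name"
    else
      echo "missing template: $src/$name" >&2
    fi
  done
}

# Usage, from the root of the target repository:
#   install_ci_templates
```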

Cost (Pro+ Plan)

| Model | Multiplier | Used By |
|---|---|---|
| claude-opus-4.6 | | orchestrator, architect, security, reviewer, research, developer (plan/verify/debug) |
| gpt-5.3-codex | | developer (code generation) |
| gpt-5.4 | | qa |

Top-tier models only — Sonnet, Haiku, and previous-gen models are not permitted.

  • Full 5-Eye pipeline (A): ~14-20 premium requests
  • Quick review (B): ~7-9 premium requests
  • Pipelines C, D, E, F: see agents/REGISTRY.md for estimates
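
For rough budgeting, the per-run ranges above can be multiplied by the number of planned runs. The helper below is an illustrative sketch with a hypothetical name (`estimate_requests`). It covers only Pipelines A and B, because those are the only ranges quoted here, and the results are as approximate as the figures they are built from.

```shell
# Rough budget helper using the per-run ranges quoted above
# (Pipeline A: ~14-20 requests, Pipeline B: ~7-9 requests).
#   $1 = pipeline letter, $2 = number of runs (default 1)
estimate_requests() {
  case "$1" in
    A) lo=14; hi=20 ;;
    B) lo=7;  hi=9  ;;
    *) echo "no estimate for pipeline $1 in this sketch" >&2; return 1 ;;
  esac
  runs="${2:-1}"
  echo "$((lo * runs))-$((hi * runs)) premium requests"
}

estimate_requests A 3   # prints: 42-60 premium requests
```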

Requirements

  • GitHub Copilot Pro+ subscription
  • Copilot CLI v1.0.23+
  • Bash 4+
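
A quick pre-install check for the shell requirement could look like this sketch. `BASH_VERSINFO` is set by Bash itself; the function name `bash_major_ok` is hypothetical. The Copilot CLI version is not parsed here, only its presence on PATH.

```shell
# Succeeds when the given Bash major version meets the "Bash 4+" requirement.
bash_major_ok() {
  [ "${1:-0}" -ge 4 ]
}

# In Bash, $BASH_VERSINFO expands to the major version (element 0 of the array).
# In other shells it is unset, so the check falls back to 0 and fails.
if bash_major_ok "${BASH_VERSINFO:-0}"; then
  echo "bash version OK"
else
  echo "Bash 4+ required (found: ${BASH_VERSION:-not bash})" >&2
fi

# Report a missing CLI without aborting the script.
command -v copilot >/dev/null 2>&1 || echo "copilot CLI not found on PATH" >&2
```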

ECC Integration (OPERATION CHIMERA)

Everything Claude Code (ECC) v1.10.0 is integrated as an additive capability layer.

graph TD
    subgraph L1["🔴 LAYER 1 — ELITE CORE (AUTHORITATIVE)"]
        L1a["10 agents · 14 skills · 6 pipelines · 6 scripts<br/>copilot-instructions.md = SUPREME"]
    end
    subgraph L2["🟡 LAYER 2 — GOVERNANCE"]
        L2a["hooks/hooks.json — 12 curated hooks<br/>config/ecc-governance.md — 3-tier policy<br/>rules/*.md — 10 lang + 1 common-security"]
    end
    subgraph L3["🟢 LAYER 3 — ECC PLUGIN (ADDITIVE)"]
        L3a["47 agents (ecc: prefix) · 181 skills<br/>READ-ONLY — never modified"]
    end

    L1 --> L2 --> L3

    style L1 fill:#2d0000,stroke:#ff4444,color:#fff,stroke-width:2px
    style L2 fill:#2d2d00,stroke:#ffcc00,color:#fff,stroke-width:2px
    style L3 fill:#002d00,stroke:#44ff44,color:#fff,stroke-width:2px
    style L1a fill:#1a1a2e,stroke:#e94560,color:#fff
    style L2a fill:#1a1a2e,stroke:#e2b714,color:#fff
    style L3a fill:#1a1a2e,stroke:#44ff44,color:#fff

| Capability | Count | Detail |
|---|---|---|
| Curated hooks | 12 | 9 ECC-adapted + 3 custom (model guard, pipeline audit, guardrails) |
| Approved agents | 28 | Language reviewers, build resolvers, specialists |
| Restricted agents | 11 | Owner-explicit only |
| Banned agents | 8 | Irrelevant or redundant |
| Language rules | 10 + common | TypeScript, Python, Go, Rust, Java, C#, C/C++, Kotlin, PHP, Shell + common-security baseline |

Model enforcement: Only claude-opus-4.6, gpt-5.3-codex, gpt-5.4 — enforced by model-policy-guard.js (exit 1 on banned model).
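
The allowlist idea can be expressed in a few lines of shell. The project's own guard is model-policy-guard.js, whose internals are not shown here, so the following is only a sketch of the same policy with a hypothetical function name (`check_model`).

```shell
# Shell sketch of the model allowlist; returns non-zero for any other model.
check_model() {
  case "$1" in
    claude-opus-4.6|gpt-5.3-codex|gpt-5.4) return 0 ;;
    *) echo "banned model: $1" >&2; return 1 ;;
  esac
}

# Usage in a wrapper script (MODEL is a hypothetical variable):
#   check_model "$MODEL" || exit 1
```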

Governance: config/ecc-governance.md

Skill Integration (OPERATION PHOENIX)

PHOENIX cherry-picked 12 high-value skills from ECC v1.10.0 (8 adapted as TIER 1, 4 approved as-is as TIER 2) through a challenge-hardened process (3 independent review agents, 37 findings resolved).

| Tier | Count | Description |
|---|---|---|
| Elite Original | 6 | Core review, security, testing, deploy, memory, cross-validation |
| PHOENIX TIER 1 | 8 | Adapted from ECC; paths, models, and governance remapped |
| TIER 2 (ECC) | 4 | Approved as-is from ECC plugin |
| Total | 18 | ~22,845 tokens (11.4% of 200K context) |
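
The 11.4% figure is the token total divided by the context size. A one-line helper (hypothetical name `context_share`) makes the arithmetic reproducible when the skill set changes:

```shell
# Share of a context window used by skill definitions, as a percentage.
#   $1 = tokens used, $2 = context size in tokens
context_share() {
  awk -v used="$1" -v total="$2" 'BEGIN { printf "%.1f%%\n", used * 100 / total }'
}

context_share 22845 200000   # prints: 11.4%
```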

Governance: config/skill-governance.md

Language Rules (OPERATION TITAN)

TITAN expanded language-specific security and style rules from 4 to 10 languages with a shared zero-trust security baseline. Challenge-hardened by 3 independent agents (34 findings resolved).

2-Layer Architecture

rules/
├── common-security.md    ← Shared zero-trust baseline (20 rules, loaded for ALL languages)
├── typescript.md          ← TS/JS: XSS, prototype pollution, child_process
├── python.md              ← Python: pickle, eval/exec, subprocess
├── go.md                  ← Go: text/template XSS, InsecureSkipVerify
├── rust.md                ← Rust: unsafe audit, cargo-audit/cargo-deny
├── java.md                ← Java: ObjectInputStream, SpEL, Log4Shell
├── csharp.md              ← C#: TypeNameHandling, over-posting, Razor XSS
├── cpp.md                 ← C/C++: buffer overflow, use-after-free, ASAN
├── kotlin.md              ← Kotlin: Jackson polymorphic, coroutine safety
├── php.md                 ← PHP: unserialize, type juggling, file inclusion
└── shell.md               ← Shell: eval injection, quoting, ShellCheck

| Metric | Value |
|---|---|
| Total files | 11 (1 common + 10 language) |
| Total lines | 921 |
| Max per file | 96 lines (all ≤150 cap) |
| Security model | Common baseline + language-specific delta (no duplication) |
| Format | Blockquote (> Applies to:); no YAML frontmatter |

Each language file has 4 mandatory sections: Style, Patterns, Security (delta only), Testing. The Security section contains only language-specific vulnerabilities — universal rules live in common-security.md.
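
The structural rules for a language file (four mandatory sections, 150-line cap) lend themselves to a simple lint. The sketch below assumes each section appears as a Markdown heading such as "## Style"; the actual heading format is not stated here, so treat the pattern as a guess. `lint_rule_file` is a hypothetical name.

```shell
# Hypothetical lint for one rules/*.md language file.
# Assumption: sections are Markdown headings ("# Style", "## Security ...").
lint_rule_file() {
  file="$1"; status=0
  lines=$(wc -l < "$file" | tr -d ' ')
  if [ "$lines" -gt 150 ]; then
    echo "$file: $lines lines exceeds the 150-line cap" >&2
    status=1
  fi
  for section in Style Patterns Security Testing; do
    grep -Eq "^#+ +$section" "$file" || {
      echo "$file: missing section $section" >&2
      status=1
    }
  done
  return "$status"
}

# Usage: lint every language file, skipping the shared baseline.
#   for f in rules/*.md; do
#     [ "$f" = rules/common-security.md ] || lint_rule_file "$f"
#   done
```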

Troubleshooting

| Problem | Solution |
|---|---|
| copilot command not found | Install Copilot CLI: npm install -g @github/copilot |
| Agent not recognized | Run bash install.sh from the copilot-elite-system directory |
| Lock timeout | Likely a stale lock; run ~/.copilot/scripts/memory.sh status to detect it |
| Permission denied on scripts | Run chmod +x ~/.copilot/scripts/*.sh |
| Autopilot timeout | Increase with export AUTOPILOT_TIMEOUT=600 (seconds) |

Known Limitations

  1. No pipeline enforcement — Agents can be invoked directly (copilot --agent <name>), bypassing orchestrator governance. The system relies on agents self-checking, which is advisory.
  2. Cross-validation is instruction-based — No runtime code detects or enforces cross-validation triggers. Compliance depends on LLM behavior.
  3. Deny-tool patterns are convention. They are effective only to the extent that Copilot CLI honors them; there is no sandbox enforcement.
  4. Memory is unencrypted plaintext — Do not store secrets in memory files.
  5. No automated pipeline testing — Correctness relies on manual verification and agent self-checks.
  6. Cost estimates are approximate — See agents/REGISTRY.md for details. Actual costs vary by task complexity.
  7. A/B model comparison is manual — No automated framework for comparing model performance.

License

MIT
