Skip to content

azena-ai/self-improving-loop

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

self-improving-loop

A genome-driven pattern for autonomous agents that rewrite their own instructions.

Battle-tested in production. Built by azena — an AI boutique from Stuttgart.


Most "autonomous agents" are one prompt in a while loop. They drift, repeat the same mistakes, and keep no memory of what they learned. This is a different pattern.

The agent runs on a genome — a versioned strategy file it both reads and rewrites. Every cycle it ships one real unit of work, verifies it, and folds what it learned back into its own instructions. Over days, the loop gets better at its own job without you touching it.

It is not a framework. It is four markdown files and a discipline. Drop them into Claude Code (or any agent that can schedule its own next turn) and go.

The loop in one picture

flowchart TD
  A([Tick fires]) --> B[Read genome + levers + lessons]
  B --> C[Pick the single highest-value lever]
  C --> D[Build / act — small, shippable]
  D --> E{Gate: verify it really works}
  E -- fail --> C
  E -- pass --> F[Commit]
  F --> G[Self-improve:<br/>rewrite genome, append a lesson, bump version]
  G --> H[Schedule the next tick]
  H --> A
Loading

No human in the inner loop. A human sets the mission and reviews the diffs.

The four parts

File Role
genome.md The evolving strategy + state. Mission, current focus, what's proven, what's next. The loop mutates this as it learns. Versioned (v001 → v002 …).
loop-prompt.md The orchestrator. The prompt each tick reads and executes — and improves. It contains the tick cycle, the gate rules, and the accumulated lessons.
levers.md The prioritized backlog. A ranked list of moves, with a status log. The loop always takes the top open one.
lessons Hard-won rules, written back into the prompt the moment they're learned. This is the "self-improving" part — see docs/lessons.md.

Why it actually works

  • The genome is memory. State lives in a file the next tick reads — not in a context window that evaporates. The loop survives compaction, restarts, and days of wall-clock time.
  • Self-mutation beats a static prompt. When the mission shifts, the loop rewrites its own genome (v001 → v002) instead of fighting stale instructions.
  • Gates make autonomy safe. Nothing ships unverified. The discipline below is what separates "autonomous" from "reckless."
  • One lever per tick. Small, shippable units keep every change reviewable and every failure cheap to roll back.

The non-negotiable: gates

An autonomous loop is only as trustworthy as its verification. Before any commit, the loop must prove the work, not assume it:

  • Build and typecheck must be green — and don't let a pipe (| head) swallow the exit code.
  • Verify the artifact, not the log. "Deploy succeeded" ≠ "the page renders." Check the actual output.
  • A failing gate sends the loop back to pick another lever — never forward to "commit anyway."

See docs/gates.md.

Lessons it has already learned

These are real, generalized from production runs. The point of the pattern is that this list grows by itself:

  • Verify before you ship. A build step can fail silently and leave an empty shell. Always assert the output is non-empty and correct before deploying. (why)
  • Don't turn one finding into a destructive sweep. A single odd-looking match is not a mandate for a sitewide find-and-replace. Check intent first. (why)
  • When reality contradicts the task's premise, report — don't blindly execute. If the job says "small fix" and you find a load-bearing rewrite, surface it. (why)
  • Tools that need a server don't start one. Bring the server up, wait for it, then run the check. A flood of connection-refused means "nothing's listening," not "everything's broken." (why)

Get started

  1. Copy the four *.template.md files into your project (drop the .template).
  2. Fill genome.md with your mission and first focus.
  3. Seed levers.md with a ranked backlog.
  4. Hand loop-prompt.md to your agent and tell it to run one tick, then schedule the next.
  5. Review the diffs each morning. Watch the genome evolve.

Works anywhere an agent can schedule its own next turn (e.g. Claude Code's wake-up scheduling) and commit to git.

Skills

Reusable Claude Code skills distilled from this pattern live in skills/:


Built by azena · bespoke, EU-sovereign AI for the German Mittelstand · Stuttgart
Kayra Labs GmbH i. G. · Dev Academy · MIT licensed

About

A genome-driven pattern for autonomous agents that rewrite their own instructions. Battle-tested. Templates + docs + Claude skills.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors