Mini-Coding-Agent

This folder contains a small standalone coding agent:

  • code: mini_coding_agent.py
  • CLI: mini-coding-agent

It is a minimal local agent loop with:

  • workspace snapshot collection
  • stable prompt plus turn state
  • structured tools
  • approval handling for risky tools
  • transcript and memory persistence
  • bounded delegation

The model backend is currently based on Ollama.


Stay tuned for a more detailed tutorial, which will be linked here.

 

Requirements

You need:

  • Python 3.10+
  • Ollama installed
  • an Ollama model pulled locally

Optional:

  • uv for environment management and the mini-coding-agent CLI entry point

This project has no Python runtime dependency beyond the standard library, so you can run it directly with python mini_coding_agent.py if you do not want to use uv.

 

Install Ollama

Install Ollama on your machine so the ollama command is available in your shell.

Official installation link: ollama.com/download

Then verify:

ollama --help

Start the server:

ollama serve

In another terminal, pull a model. Example:

ollama pull qwen3.5:4b

Qwen 3.5 model library:

The default in this project is qwen3.5:4b. If you have sufficient memory, it is worth trying a larger model such as qwen3.5:9b or another larger Qwen 3.5 variant. The agent just sends prompts to Ollama's /api/generate endpoint.
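To make the backend concrete, here is a minimal standard-library sketch of calling Ollama's /api/generate endpoint. The endpoint, payload fields, and defaults (model name, host, temperature, top-p) mirror this README; the helper names themselves are illustrative and not taken from mini_coding_agent.py.

```python
import json
import urllib.request

OLLAMA_HOST = "http://127.0.0.1:11434"

def build_payload(prompt, model="qwen3.5:4b", temperature=0.2, top_p=0.9):
    """Assemble the JSON body for a non-streaming /api/generate call."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"temperature": temperature, "top_p": top_p},
    }

def generate(prompt, model="qwen3.5:4b", timeout=300):
    """POST the prompt to Ollama and return the completed response text."""
    data = json.dumps(build_payload(prompt, model)).encode("utf-8")
    req = urllib.request.Request(
        f"{OLLAMA_HOST}/api/generate",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Calling generate() requires a running ollama serve with the model pulled; build_payload() works standalone.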

 

Project Setup

Clone the repo or your fork and change into it:

git clone https://github.com/rasbt/mini-coding-agent.git
cd mini-coding-agent

If you forked it first, use your fork URL instead:

git clone https://github.com/<your-github-user>/mini-coding-agent.git
cd mini-coding-agent

 

Basic Usage

Start the agent:

cd mini-coding-agent
uv run mini-coding-agent

Without uv, run the script directly:

cd mini-coding-agent
python mini_coding_agent.py

By default it uses:

  • model: qwen3.5:4b
  • approval: ask

For a concrete usage example, see EXAMPLE.md.

 

Approval Modes

Risky tools such as shell commands and file writes are gated by approval.

  • --approval ask prompts before risky actions (default and recommended)
  • --approval auto allows risky actions automatically (convenient but riskier)
  • --approval never denies risky actions

Example:

uv run mini-coding-agent --approval auto
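The three modes can be sketched as a single gate that every risky tool call passes through. The mode names match the --approval flag above; the function itself is illustrative, not the agent's actual code.

```python
def approve(tool_name, mode="ask"):
    """Return True if a risky tool call may proceed under the given mode."""
    if mode == "auto":
        return True   # convenient but riskier: no prompt
    if mode == "never":
        return False  # risky actions are always denied
    # mode == "ask" (default): prompt the user before each risky action
    answer = input(f"Allow risky tool '{tool_name}'? [y/N] ")
    return answer.strip().lower() == "y"
```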

 

Resume Sessions

The agent saves sessions under the target workspace root in:

.mini-coding-agent/sessions/

Resume the latest session:

uv run mini-coding-agent --resume latest

Resume a specific session:

uv run mini-coding-agent --resume 20260401-144025-2dd0aa
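One way to resolve --resume is shown below: because session ids start with a timestamp, lexicographic order is chronological order, so "latest" is simply the last entry. The directory layout matches the path above; the helper and the .json suffix are assumptions for illustration.

```python
from pathlib import Path

def resolve_session(workspace, session_id="latest"):
    """Map a session id (or 'latest') to a saved session file, if any."""
    sessions = Path(workspace) / ".mini-coding-agent" / "sessions"
    if session_id != "latest":
        return sessions / f"{session_id}.json"
    # Timestamped names like 20260401-144025-2dd0aa sort chronologically.
    candidates = sorted(sessions.glob("*.json"))
    return candidates[-1] if candidates else None
```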

 

Interactive Commands

Inside the REPL, slash commands are handled directly by the agent instead of being sent to the model as a normal task.

  • /help shows the list of available interactive commands
  • /memory prints the distilled session memory, including the current task, tracked files, and notes
  • /session prints the path to the current saved session JSON file
  • /reset clears the current session history and distilled memory but keeps you in the REPL
  • /exit exits the interactive session
  • /quit exits the interactive session (alias for /exit)
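The routing described above can be sketched as a small dispatcher that intercepts slash commands before anything is sent to the model. The command names match the list; the dispatch code and return values are illustrative.

```python
def handle_input(line, session):
    """Route REPL input: slash commands are handled locally, the rest
    becomes a normal task for the agent loop."""
    if not line.startswith("/"):
        return "send_to_model"
    command = line.split()[0]
    if command in ("/exit", "/quit"):
        return "exit"
    if command == "/reset":
        session.clear()          # drop history and distilled memory
        return "continue"        # but stay in the REPL
    if command in ("/help", "/memory", "/session"):
        return "print_info"
    return "unknown_command"
```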

 

Main CLI Flags

uv run mini-coding-agent --help

Without uv:

python mini_coding_agent.py --help

CLI flags are supplied on the command line when you start the agent. Use them to choose the workspace, model connection, resume behavior, approval mode, and generation limits.

Important flags:

  • --cwd sets the workspace directory the agent should inspect and modify; default: .
  • --model selects the Ollama model name, such as qwen3.5:4b; default: qwen3.5:4b
  • --host points the agent at the Ollama server URL (usually not needed); default: http://127.0.0.1:11434
  • --ollama-timeout controls how long the client waits for an Ollama response (usually not needed); default: 300 seconds
  • --resume resumes a saved session by id or uses latest; default: start a new session
  • --approval controls how risky tools are handled: ask, auto, or never; default: ask
  • --max-steps limits how many model and tool turns are allowed for one user request; default: 6
  • --max-new-tokens caps the model output length for each step; default: 512
  • --temperature controls sampling randomness; default: 0.2
  • --top-p controls nucleus sampling for generation; default: 0.9

 

Example

See EXAMPLE.md

 

Notes & Tips

  • The agent expects the model to emit either <tool>...</tool> or <final>...</final>.
  • Different Ollama models will follow those instructions with different reliability.
  • If the model does not follow the format well, use a stronger instruction-following model.
  • The agent is intentionally small and optimized for readability, not robustness.
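Extracting those blocks from a model reply can be sketched with a small regex parser. The tag names come from the note above; the parser itself is illustrative, and checking for a tool call first is just one possible priority.

```python
import re

def parse_reply(text):
    """Return ("tool", body) or ("final", body), or (None, None) if the
    model emitted neither expected tag."""
    for tag in ("tool", "final"):
        m = re.search(rf"<{tag}>(.*?)</{tag}>", text, re.DOTALL)
        if m:
            return tag, m.group(1).strip()
    return None, None
```

A reply with neither tag is how a weaker model "fails the format"; the agent can then retry or fall back to a stronger instruction-following model.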