micro-agent-go

A small Go agent that talks to a local llama.cpp server (OpenAI-compatible HTTP API), runs a tool-using loop, and can accept input from a terminal CLI, Telegram, or a periodic cron “heartbeat”. This is a hobby project; do not rely on it for production or security-sensitive workloads.


Features (overview)

  • LLM backend — llamacpp provider: completions and streaming, including optional reasoning/thinking deltas when the server exposes them.
  • Supervisor + sub-agents — Main agent can delegate via spawn_agent to a pool of specialised agents loaded from ~/.micro-agent/agents, with LLM-based routing.
  • Tools — Filesystem (list_dir, read_file, write_file, append_file, edit_file), optional shell_exec, long-term memory tools (memory_save, memory_search, memory_delete) when a store is available, HTTP web_fetch, and Browserless-backed browser_search / browser_content when BROWSERLESS_URL is set. telegram_send is registered when TELEGRAM_BOT_TOKEN is set.
  • Long-term memory — Vector store backed by Milvus (embeddings from the same llama-server). If embeddings are disabled (memory.embed / MEMORY_EMBED) or Milvus is unreachable, the agent still runs with session memory only (in-process session store).
  • Channels — Interactive readline CLI (with streamed thinking/reply styling and /attach <path> to queue document files for the next message), Telegram long polling (user document messages are downloaded and turned into text attachments), HTTP channel with embedded browser chat UI (JSON or multipart uploads, SSE streaming), and optional cron channel that injects periodic ticks from a heartbeat file (uses SQLite for tick state).
  • Session handling — Per-channel session keys, optional conversation tree on the CLI for branching sessions.
  • Context control — Configurable message limits and compaction (truncate or summarize against a token threshold).
  • Logging — -v / -vv / -vvv verbosity; optional log file; daemon mode can log to stderr and file together.
  • Safety toggles — --safe (or --no-fs, --no-web, --no-spawn) to strip destructive filesystem tools, browser/fetch tools, and sub-agent spawning.

More detail lives under docs/ (config, channels, memory, tools, multi-agent, core).


Requirements

  • Go 1.25+ (see go.mod).
  • llama-server (or compatible OpenAI-style server) reachable at the URL in config (default http://localhost:8080).
  • Long-term memory (optional) — Running Milvus and memory.embed: true with a valid milvus_addr (see examples/config.json). Omit or disable embeddings for a simpler, session-only setup.
  • Browserless (optional) — For JS-rendered search/content; compose includes a browserless service.

Manual installation

git clone https://github.com/offdev/micro-agent.git
cd micro-agent
go build -o ua ./cmd/ua

Install the binary wherever you prefer (e.g. mv ua ~/bin/). Ensure llama-server is running and points at your model.

Configuration

  • Default config path: $UA_CONFIG or ~/.micro-agent/config.json. Missing file is OK; environment variables override file values.
  • Copy and edit examples/config.json as a starting point.
  • Optional prompts: SYSTEM.md next to your workdir parent (~/.micro-agent/SYSTEM.md by default), AGENTS.md under the workdir, and per-agent definitions under ~/.micro-agent/agents/.

Important environment variables (see also comments in cmd/ua/main.go):

| Variable | Role |
| --- | --- |
| LLAMA_URL | Base URL of llama-server |
| WORKDIR | Process working directory for tools |
| MEMORY_EMBED | false to skip embeddings / long-term memory store |
| MILVUS_ADDR | Milvus gRPC address (e.g. localhost:19530) |
| TELEGRAM_BOT_TOKEN | Enables Telegram channel + tool |
| HTTP_CHANNEL_ENABLED | Enables HTTP channel + embedded chat UI |
| HTTP_CHANNEL_LISTEN | HTTP listen address (default 127.0.0.1:8765) |
| HTTP_CHANNEL_TOKEN | Optional shared secret (X-UA-Token) for /api/chat |
| BROWSERLESS_URL | Browserless HTTP URL for browser tools |
| CRON_ENABLED | Enables heartbeat channel (often with --daemon) |
Running locally

From the repository root after go build:

./ua
  • With a TTY, you get the CLI by default. Use --daemon for headless mode (requires another channel such as Telegram or cron, or the process will exit with “no channels configured”).
  • Use --interactive to force the CLI when stdin is not a TTY.
  • For browser chat, set HTTP_CHANNEL_ENABLED=true and open http://127.0.0.1:8765 (or your configured listen address).

HTTP chat UI behavior

  • Streaming output uses SSE events (thinking, delta, error, done) in arrival order.
  • Long lines and long tokens wrap cleanly; horizontal scrolling is disabled.
  • Input controls: Enter sends the message, Shift+Enter inserts a newline.
  • Response whitespace is preserved (including leading spaces in streamed deltas).
  • Attachments — “Attach documents” sends multipart POST /api/chat with session_id, message, and repeated file parts (plain text, PDF, images for vision, and other types per docs/channel.md). With no files selected, the UI uses JSON as before.
  • If the HTTP channel is protected with HTTP_CHANNEL_TOKEN, set localStorage.ua_http_token in the browser devtools (or equivalent) so requests include X-UA-Token.

CLI attachments

  • Run /attach /path/to/file (quote paths with spaces) one or more times, then send your message on the next line. Pending paths are applied to that line only. Errors for a given file go to stderr.

Safer exploration (read-only style: no writes/edits/shell, no browser/fetch, no sub-agents):

./ua --safe

Verbosity: -v, -vv, -vvv.


Docker (compose)

The stack is defined in docker-compose.yml at the repository root. It builds the agent from deploy/Dockerfile, brings up Milvus (etcd, minio, standalone), optional Browserless, optional Attu, and runs the ua service with network_mode: host so the agent can reach llama-server and Milvus on localhost alongside the host.

Before first run on the host:

mkdir -p ~/.micro-agent
cp examples/config.json ~/.micro-agent/config.json
# Edit milvus_addr, model, URLs as needed.
touch ~/.micro-agent/SYSTEM.md   # or add real content; compose mounts it read-only

Build and start (from repo root):

docker compose up --build
  • llama-server is not in this compose file — run it on the host (or elsewhere) and set LLAMA_URL in the ua service environment if it is not http://localhost:8080.
  • The ua service mounts ${HOME}/.micro-agent into the container so workdir, DB paths, and state stay on the host; config and SYSTEM.md are mounted read-only as in the compose file.

Running more safely in Docker — Prefer a restricted tool set and avoid mounting sensitive host paths beyond a dedicated agent directory. You can override the container command, for example:

docker compose run --rm ua ./ua --interactive --safe

(Adjust flags for --daemon + Telegram/cron if you do not use an interactive TTY.)

Because the optional stack uses host networking and privileged-adjacent services, treat this as local experimentation only, not an isolation boundary for untrusted code.


Agent loop

Execution is centred on internal/core.Agent.Run. The application wraps a supervisor (internal/multiagent) around that same loop and feeds it messages from channels (internal/app).

1. Setup

  • If Instructions (system prompt) is non-empty and the conversation does not already start with a system message, the system prompt is prepended.
  • A name → tool map is built once per Run for lookups.
  • The callback chain runs BeforeAgentLoop (e.g. logging hooks when verbosity is enabled).

2. Iteration (repeat until the model stops requesting tools)

Each loop iteration:

  1. BeforeLLMCall — Callbacks may transform the message list sent to the model.
  2. Tool definitions — Current tools are serialized to the provider once for this turn.
  3. LLM call — Either:
    • Streaming — If a streamConsumer is set (CLI/Telegram path), the provider’s Stream is used; each Delta can carry incremental text, a Thinking flag for reasoning-only chunks, and a terminal Done with Final holding the full assistant turn (content + tool calls). Only the final assistant content is persisted; thinking streams are for display only.
    • Non-streaming — Complete returns a single Response.
  4. Empty-response handling — If the model returns no text and no tool calls, the loop injects a short synthetic user nudge and retries, up to a small fixed number of times; otherwise Run returns an error (empty turns are not appended).
  5. AfterLLMCall — Callbacks observe the response; errors propagate.
  6. Persist assistant message — The assistant message (content + tool_calls, if any) is appended to the conversation.
  7. Exit if no tools — If there are no tool calls, the outer loop ends.
  8. Tool execution — Otherwise, each tool call is processed concurrently (goroutines + WaitGroup):
    • BeforeToolExecution may adjust or reject the call.
    • Unknown tools produce an error result string; known tools run Execute.
    • AfterToolExecution runs for logging/metrics; errors from this hook are ignored by design.
  9. Tool results — For each call, a tool role message is appended (content or error: …), linked by tool_call_id and name.
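The iteration above can be condensed into a runnable sketch. Callbacks, empty-response retries, and streaming are omitted, and the stand-in types and function names here are illustrative rather than the repository's actual internal/core API:

```go
package main

import (
	"fmt"
	"sync"
)

// Simplified stand-ins for the real message and tool types.
type toolCall struct{ ID, Name, Args string }
type message struct {
	Role, Content, ToolCallID, Name string
	ToolCalls                       []toolCall
}
type tool func(args string) (string, error)

// provider abstracts the LLM call; the real agent talks to llama.cpp here.
type provider func(msgs []message) message

// run mirrors the loop: call the model, persist the assistant turn, execute
// tool calls concurrently, append linked tool results, and repeat until the
// model stops requesting tools.
func run(llm provider, tools map[string]tool, msgs []message) []message {
	for {
		assistant := llm(msgs)
		msgs = append(msgs, assistant) // persist assistant message
		if len(assistant.ToolCalls) == 0 {
			return msgs // exit if no tools
		}
		results := make([]message, len(assistant.ToolCalls))
		var wg sync.WaitGroup
		for i, tc := range assistant.ToolCalls {
			wg.Add(1)
			go func(i int, tc toolCall) { // each tool call runs in its own goroutine
				defer wg.Done()
				content := "error: unknown tool"
				if t, ok := tools[tc.Name]; ok {
					if out, err := t(tc.Args); err != nil {
						content = "error: " + err.Error()
					} else {
						content = out
					}
				}
				// Tool result linked back by tool_call_id and name.
				results[i] = message{Role: "tool", Content: content, ToolCallID: tc.ID, Name: tc.Name}
			}(i, tc)
		}
		wg.Wait()
		msgs = append(msgs, results...)
	}
}

func main() {
	step := 0
	llm := provider(func(msgs []message) message {
		step++
		if step == 1 { // first turn: request a tool
			return message{Role: "assistant", ToolCalls: []toolCall{{ID: "1", Name: "echo", Args: "hi"}}}
		}
		return message{Role: "assistant", Content: "done"} // then finish
	})
	tools := map[string]tool{"echo": func(args string) (string, error) { return args, nil }}
	final := run(llm, tools, []message{{Role: "user", Content: "say hi"}})
	fmt.Println(len(final), final[len(final)-1].Content)
}
```

With the stub provider, the conversation ends up as user → assistant (tool call) → tool result → final assistant turn.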

3. Teardown

  • After the loop finishes, AfterAgentLoop runs on the callback chain.

How the app uses this

  • Sessions — SessionStore holds per-session Conversation state; each inbound message appends a user message, then Supervisor.Run (same as Agent.Run) runs the loop.
  • Compaction — After a successful run, if strategy is summarize and estimated tokens exceed the threshold, the app may replace the conversation with a summarised version before storing it back.
  • Sub-agents — The supervisor’s agent has spawn_agent; sub-agents in the pool do not get spawn_agent, avoiding unbounded recursion.
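The compaction decision (summarize only when the token estimate crosses the threshold) can be sketched as follows. The token estimator and function names here are crude stand-ins, not the app's actual logic:

```go
package main

import "fmt"

type msg struct{ Role, Content string }

// estimateTokens is a rough stand-in (about 4 characters per token); the
// real estimator lives inside the app.
func estimateTokens(msgs []msg) int {
	n := 0
	for _, m := range msgs {
		n += len(m.Content) / 4
	}
	return n
}

// compact replaces the conversation with a single summary message when the
// strategy is "summarize" and the estimate exceeds the threshold; otherwise
// the conversation is stored back unchanged. In the real app the summarize
// callback would itself be an LLM call.
func compact(strategy string, threshold int, msgs []msg, summarize func([]msg) string) []msg {
	if strategy != "summarize" || estimateTokens(msgs) <= threshold {
		return msgs
	}
	return []msg{{Role: "system", Content: "Summary of earlier conversation: " + summarize(msgs)}}
}

func main() {
	long := make([]byte, 4000)
	msgs := []msg{{"user", "..."}, {"assistant", string(long)}}
	out := compact("summarize", 500, msgs, func([]msg) string { return "long chat" })
	fmt.Println(len(out)) // conversation collapsed to one summary message
}
```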

License

See LICENSE.
