An autonomous, privacy-first local personal agent powered by Ollama and Gemma. Talk to it from your terminal every model call stays on your machine.
- Local only. Every model call hits
http://localhost:11434(Ollama). No outbound to OpenAI, Anthropic, or Google. - Single user. No auth, no multi-tenancy. Your machine is the trust boundary.
- Transparent state. Sessions, memory, and config are human-readable files (
.json,.md,.env).cat,git diff, andvimwork. - Small, replaceable core. Model client, tool registry, session store, and each channel are one file behind a tiny interface. Swap Ollama for llama.cpp later in a 50-line patch.
- Ollama-backed agent loop with bounded tool-use iteration (default 8 steps/turn).
- Four day-1 tools:
read_file,write_file,exec(shell),fetch_url(optional, network-gated). - JSON session store — one file per session under
~/.hermit/sessions/, atomically written. MEMORY.md— user-editable Markdown file loaded into every system prompt for durable preferences.SOUL.md(optional) — tone/personality overlay.- Workspace containment — file tools refuse paths that escape the configured workspace directory.
- Confirm gate — destructive tools (
write_file,exec) prompt fory/n/alwaysbefore running. - Ollama tool calling with text-fenced fallback for models that don't support native function-calling.
hermit doctor— pings Ollama, validates each enabled channel, prints workspace and state paths.hermit sessions list|show|rm|new— inspect and manage the session log directly.hermit allow|deny <channel> <peer>— allowlist management.- launchd (macOS) and systemd user-unit templates under
deploy/for always-on hosting.
┌──────────────────────────────────────────────────────────────────────┐
│ hermit daemon (serve) │
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ CLI/REPL │ │ Telegram │ │ WhatsApp │ │ GChat │ │
│ │ channel │ │ channel │ │ channel │ │ channel │ │
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ │
│ │ │ long-poll │ ws to │ webhook │
│ │ │ /getUpdates │ baileys │ via tunnel │
│ │ │ │ bridge │ │
│ └────────┬──────┴───────┬───────┴───────┬───────┘ │
│ ▼ ▼ ▼ │
│ ┌───────────────────────────────────────┐ │
│ │ Inbound queue (asyncio.Queue) │ │
│ │ items: InboundMessage(channel,peer, │ │
│ │ text, attaches)│ │
│ └────────────────┬──────────────────────┘ │
│ ▼ │
│ ┌────────────────────────────────────┐ │
│ │ Router + Allowlist │ │
│ │ - drops unknown peers (pairing) │ │
│ │ - resolves (channel,peer) → sess │ │
│ └────────────────┬───────────────────┘ │
│ ▼ │
│ ┌────────────────────────────────────────────────────┐ │
│ │ Agent loop │ │
│ │ ┌──────────────┐ ┌─────────────┐ ┌───────────┐ │ │
│ │ │ system prompt│ │ OllamaClient│─▶│localhost: │ │ │
│ │ │ + MEMORY.md │ │ /api/chat │ │ 11434 │ │ │
│ │ └──────────────┘ └─────────────┘ └───────────┘ │ │
│ │ ┌──────────────┐ ┌────────────────────────────┐ │ │
│ │ │ ToolRegistry │ │ SessionStore (JSON) │ │ │
│ │ │ read/write/ │ │ ~/.hermit/sessions/*.json│ │ │
│ │ │ exec/fetch │ │ │ │ │
│ │ └──────────────┘ └────────────────────────────┘ │ │
│ └────────────────┬───────────────────────────────────┘ │
│ ▼ │
│ ┌────────────────────────────────────┐ │
│ │ Outbound dispatcher │ │
│ │ routes reply back to channel.send │ │
│ └────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────────┘
│ │ │
▼ ▼ ▼
Telegram Baileys bridge Cloudflare Tunnel
/api.bot (Node sidecar, (public URL →
localhost:8788) localhost:8787)
│
▼
Google Chat events
One direction of dependency. Nothing imports the CLI; nothing in the model/tool/session layers imports each other. The daemon binds 127.0.0.1 only — never 0.0.0.0. The only inbound HTTP path is the GChat webhook, which arrives via tunnel.
hermit/
├── README.md
├── DESIGN.md # v1 build spec (local agent core)
├── DESIGN-v2.md # v2 build spec (channels delta)
├── LICENSE
├── pyproject.toml # uv / pip-installable
├── .env.example # commented env var template
├── hermit/ # package
│ ├── __init__.py
│ ├── __main__.py # `python -m hermit` entrypoint
│ ├── cli.py # click commands: chat, run, sessions, serve, allow, pair, doctor
│ ├── config.py # env loading, defaults
│ ├── agent.py # the agent loop
│ ├── ollama_client.py # HTTP client for /api/chat
│ ├── prompts.py # build_system_prompt(...)
│ ├── memory.py # MEMORY.md read/update helpers
│ ├── session.py # JSON-backed message log
│ ├── daemon.py # `hermit serve` orchestrator
│ ├── router.py # inbound-queue consumer + session map
│ ├── allowlist.py # allowlist + pairing flow
│ ├── confirm.py # pending_confirm state, parse y/n/always
│ ├── http_server.py # aiohttp app for webhooks + admin API
│ ├── tools/
│ │ ├── __init__.py # ToolRegistry, Tool dataclass
│ │ ├── filesystem.py # read_file, write_file
│ │ ├── shell.py # exec (confirm-gated)
│ │ └── web.py # fetch_url (optional, network-gated)
│ └── channels/
│ ├── __init__.py # Channel Protocol, InboundMessage
│ ├── cli.py # wraps existing REPL behind Channel
│ ├── telegram.py # python-telegram-bot, long polling
│ ├── whatsapp.py # HTTP/WS client to local bridge
│ └── gchat.py # webhook receiver + REST sender
├── tests/
│ ├── test_agent_loop.py
│ ├── test_tools.py
│ ├── test_ollama_client.py # mocked HTTP
│ ├── test_router.py
│ └── test_allowlist.py
├── deploy/
│ ├── launchd/ # macOS plist template
│ ├── systemd/ # Linux user-unit template
│ └── whatsapp-bridge/ # docker-compose for wuzapi, sample env
├── docs/
│ ├── telegram-setup.md # BotFather, token, group caveats
│ ├── whatsapp-bridge.md # bridge sidecar setup
│ └── gchat-setup.md # GCP + tunnel walk-through
└── workspace/ # default workspace dir (gitignored)
└── MEMORY.md # user-editable persistent memory
A complete walk-through from a fresh machine to a working hermit chat. Assumes Homebrew is installed.
brew install ollama
brew services start ollama # runs Ollama in the background; survives reboots
# Verify the daemon is up
curl -s http://localhost:11434/api/tags
# Pull a tool-calling-capable model (pull any open source model)
ollama pull gemma3:4b # ~3 GB, fastest
# ollama pull gemma3n:e4b # ~5 GB, "effective 4B"
# ollama pull qwen2.5:7b-instruct # ~5 GB, very reliable tool calls
ollama list # confirm the tag is local
gemma4:e4bis what shows up inenv.exampleas a default — if you don't have it, edit.envto point at whichever tagollama listshows.
git clone <your-fork-or-this-repo-url>
cd hermitThere's no requirements.txt — dependencies live in pyproject.toml. pip install -e . reads them from there.
python3.12 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]" # editable install; [dev] adds pytest, respxAfter this, the hermit binary is on your PATH (only while the venv is active).
cp env.example .env # Edit .env if you pulled a different model in step 1Defaults for everything else are fine for a first run.
hermit doctorExpected output: model name, Ollama ping ok, and a final model reply: 'ok'. If anything fails, fix that before continuing.
# One-shot
hermit run "summarize what hermit does in two sentences"
# Interactive REPL
hermit chat
# Inspect saved sessions
hermit sessions list
# Edit durable memory the agent reads on every turn
hermit memory editThat's the whole setup. Sessions live in ~/.hermit/sessions/*.json. Workspace files (the agent's read/write scratch space) live in ./workspace/ by default — override with --workspace /path or HERMIT_WORKSPACE=/path.
Copy .env.example to .env (cwd) or ~/.hermit/.env. Process env wins; cwd .env wins over ~/.hermit/.env.
# --- Required ---
AGENT_MODEL=gemma4:e4b
OLLAMA_HOST=http://localhost:11434
# --- Core ---
HERMIT_WORKSPACE=./workspace
HERMIT_STATE_DIR=~/.hermit
HERMIT_ALLOW_NETWORK=0 # 1 to enable fetch_url
HERMIT_MAX_STEPS=8 # tool-call iterations per turn cap
HERMIT_TIMEOUT_SEC=120 # per-request Ollama timeout
# --- Daemon ---
HERMIT_DAEMON_BIND=127.0.0.1:8787 # admin + webhooks. NEVER 0.0.0.0
HERMIT_LOG_LEVEL=INFO
HERMIT_PAIRING_TTL_SEC=600
# --- Safety ---
HERMIT_ALLOW_CHANNELS_TOOL_EXEC=0 # chat channels can't run `exec` unless 1hermit run "summarize TODO.md and suggest the next three things to ship"
hermit chat # new session, interactive REPL
hermit chat --session <id> # resume a specific session
hermit sessions list
hermit sessions show <id>
hermit sessions rm <id>hermit doctor # pings Ollama, validates each enabled channelIn your workspace directory:
MEMORY.md(user-editable) — durable preferences, past decisions, behavioral guidelines. Loaded into every system prompt. The agent may suggest edits; you execute them. Cap ~4KB before considering archival toMEMORY.archive.md.SOUL.md(optional) — tone/personality overlay. Cap ~1KB.
Both files are plain Markdown. No embeddings, no RAG, no summarization. Hermit reads them on every turn.