GitHub - fluid-chey/visiontree: VisionTree is a design-focused evolution of VoiceTree that helps designers capture visual intent, provide rich screenshot and screen-recording context, and guide AI agents toward faithful execution. It bridges human design judgment and machine output by structuring visual context — turning vision into actionable understanding.

Voicetree

Obsidian meets Claude Code

Voicetree is an interactive graph view where nodes are either markdown notes or terminal-based agents (Claude Code, Codex, OpenCode, Gemini, etc.).

Agents can spawn their own subagents onto the graph. Agents will have the nearby nodes injected into their context. Agents are also able to edit and create their own nodes.

This project aims to build, from first principles, the most efficient possible human-AI interaction system.

Why?

Challenge	Voicetree Solution
Manual agent coordination	Agents can break down tasks into subgraphs and recursively spawn child terminals
Juggling 4-10 agent terminals is overwhelming	Spatially organise agents, tasks and progress on the graph
Agents don't know what you know	You share the same memory graph with agents
Agents suffer context-rot and lack memory	Defaults to short, focussed sessions with automatic handover

Install

Download links: macOS (Apple Silicon) | macOS (Intel) | Windows | Linux

macOS

brew tap voicetreelab/voicetree && brew install voicetree

Linux

curl -fsSL https://raw.githubusercontent.com/voicetreelab/voicetree/main/install.sh | sh

Windows: https://github.com/voicetreelab/voicetree/releases/latest/download/voicetree.exe

How It Works

Your agents (Claude Code, Codex, OpenCode, Gemini, etc.) live inside the graph, next to their tasks, plans, and progress updates.

Context retrieval: Agents see all nodes within a configurable radius and can semantic search against local embeddings.
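As a rough illustration of radius-based retrieval, the sketch below collects every node within a fixed number of hops of a starting node using breadth-first search. It assumes "radius" means hop count over the link graph (it could equally be a spatial distance on the canvas), and the function name, graph, and node names are hypothetical, not Voicetree's actual API.

```python
from collections import deque


def nodes_within_radius(adjacency, start, radius):
    """Return {node: hop_distance} for every node at most `radius` hops from `start`."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == radius:
            continue  # already at the edge of the radius, do not expand further
        for neighbour in adjacency.get(node, ()):
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


# Hypothetical graph; links are listed in both directions so they act as undirected.
graph = {
    "task": ["plan", "notes"],
    "plan": ["task", "architecture"],
    "notes": ["task"],
    "architecture": ["plan", "data-model"],
    "data-model": ["architecture"],
}

context = nodes_within_radius(graph, "task", radius=2)
```

With a radius of 2, the agent spawned on "task" would see "plan", "notes", and "architecture", while "data-model" (three hops away) stays out of its context.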

Spatial layout: Location-based memory is the most efficient way to remember things.

Externalized working memory: Each node represents a concept at any level of abstraction. The graph structure mirrors your mental model - relationships between ideas are represented exactly as you think about them, offloading cognitive load to the canvas.

In Detail

Nodes are markdown files, and connections are wikilinks to the .md file paths. You open rich markdown editors directly within the graph by hovering over a node (or by using speech-to-graph mode).
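To make the file-based model concrete, here is a minimal sketch of turning markdown notes into a graph by extracting wikilinks. It assumes the common `[[target]]`, `[[target|alias]]`, and `[[target#heading]]` forms; the regular expression, function names, and sample notes are illustrative assumptions rather than Voicetree's own parser.

```python
import re

# Capture the link target, ignoring an optional "#heading" and "|alias" suffix.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def extract_links(markdown_text):
    """Return the wikilink targets found in one note, in order of appearance."""
    return [target.strip() for target in WIKILINK.findall(markdown_text)]


def build_graph(files):
    """Map each note path to the list of note paths it links to."""
    return {path: extract_links(text) for path, text in files.items()}


# Hypothetical notes, keyed by file path.
notes = {
    "task.md": "Implement login. See [[plan.md]] and [[notes/auth.md|auth notes]].",
    "plan.md": "Steps live in [[task.md#Steps]].",
    "notes/auth.md": "No links here.",
}

graph = build_graph(notes)
```

Because the graph is derived entirely from plain files, any tool that can read or write markdown can participate in it.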

You can spawn coding agents on a node. The contents of that node become the agent's task, and the agent is also given all context within an adjustable distance around the node and can run semantic search against local embeddings. This means agents see what you see: you share the same memory, the same second brain. The graph structure lets context retrieval target only what is most relevant rather than dumping the entire conversation history, avoiding the 30-60% performance degradation from context rot¹.
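Semantic search over local embeddings can be pictured as ranking nodes by cosine similarity to a query vector. The sketch below is only an illustration: the three-dimensional vectors are made-up stand-ins for real embeddings, and the function and node names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_search(query_vector, node_vectors, top_k=2):
    """Return the names of the `top_k` nodes most similar to the query."""
    scored = [
        (cosine_similarity(query_vector, vector), name)
        for name, vector in node_vectors.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]


# Toy embeddings; a real setup would compute these with an embedding model.
node_vectors = {
    "auth-plan": [0.9, 0.1, 0.0],
    "ui-components": [0.0, 0.2, 0.9],
    "login-bug": [0.8, 0.3, 0.1],
}

results = semantic_search([1.0, 0.0, 0.0], node_vectors, top_k=2)
```

Here the query points along the first axis, so the two authentication-related nodes rank above the UI node.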

Agents can build their own subgraphs, decomposing their tasks into small connected chunks of work. You can glance at the high-level structure and progress of these, and zoom in to the details of what matters most. For example, ask a Voicetree agent to divide its plan into nodes of data-model, architecture, pure logic, edge logic, UI components, and integration. This lets you carefully track the path from planning to implementation for what matters most: the high-level changes and core logic.
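Since nodes are markdown files joined by wikilinks, a decomposition like the one above amounts to writing one note per part. The sketch below generates those notes in memory; the path scheme, the note template, and the choice to link each child back to its parent are all assumptions made for illustration.

```python
def plan_subgraph(parent_path, parts):
    """Return {path: markdown} with one child note per part, each linking to the parent."""
    stem = parent_path.removesuffix(".md")
    notes = {}
    for part in parts:
        child_path = f"{stem}/{part.replace(' ', '-')}.md"
        # The wikilink back to the parent is what places the child on the graph.
        notes[child_path] = f"# {part}\n\nPart of [[{parent_path}]].\n"
    return notes


parts = [
    "data-model",
    "architecture",
    "pure logic",
    "edge logic",
    "UI components",
    "integration",
]

subgraph = plan_subgraph("login.md", parts)
```

Writing each entry of `subgraph` to disk would yield six notes such as `login/pure-logic.md`, all connected to `login.md`.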

Agents can then spawn and orchestrate their own parallel subagents to work through these dependency graphs. In Voicetree, subagents are just native terminals, so you have full transparency and control over them, unlike with other CLI agents.
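One way to reason about working through a dependency graph in parallel is to group tasks into batches whose dependencies are already complete. The sketch below does this with Python's standard `graphlib.TopologicalSorter`; the task names and their dependencies are hypothetical, and this is not a description of how Voicetree itself schedules subagents.

```python
from graphlib import TopologicalSorter


def parallel_batches(dependencies):
    """Group tasks into batches; every task in a batch can run at the same time."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the dependencies contain a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose dependencies are all done
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished to unlock its dependents
    return batches


# Each task maps to the tasks it depends on (an assumed plan, for illustration).
dependencies = {
    "data-model": [],
    "architecture": ["data-model"],
    "pure logic": ["architecture"],
    "edge logic": ["architecture"],
    "UI components": ["architecture"],
    "integration": ["pure logic", "edge logic", "UI components"],
}

batches = parallel_batches(dependencies)
```

In this example the three middle tasks land in the same batch, so an orchestrating agent could hand each of them to a separate subagent terminal before starting integration.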

As your project and context grow, the Voicetree approach scales. You use your brain's most efficient form of memory: remembering where things are. Each node can represent any concept at any level of abstraction. You can see and reason about the structure between these concepts more easily because it is laid out exactly as you think about it. This lets you externalise your working memory, freeing up cognitive capacity for the real problem-solving.

Voice Mode

Capture ideas hands-free with speech-to-graph.

Why speaking works: Speaking activates deliberate (System 2) thinking, because verbalizing forces you to think about what you are doing. Japanese train conductors use "pointing and calling" (shisa kanko) to reduce errors by 85% for the same reason. Speech also engages different brain regions than writing, with lower cognitive load for idea generation. Spoken thought is usually messy and hard to store or retrieve, so Voicetree turns voice into a structured mindmap.

Backtracking without mental load: Go arbitrarily deep down a problem. The graph holds the chain of "why am I doing this?" so you don't have to.

Tangibility: Thought becomes visible and persistent. This isn't just documentation; making progress tangible is a prerequisite for flow states.

Development

Prerequisites: Node.js 18+, Python 3.13, uv

cd webapp && npm install && npm run electron  # App
uv sync && uv run pytest                               # Backend

License

BSL 1.1, converts to Apache 2.0 after 4 years. See LICENSE.

Contact

Questions? Join the Discord. Feedback is valuable - ping us with thoughts, criticisms, or feature requests.

¹ Chroma Research, "Context Rot: How Increasing Input Tokens Impacts LLM Performance" (July 2025). 30-60% performance gaps between focused (~300 token) and full (~113k token) prompts. https://research.trychroma.com/context-rot

