datoon is a pragmatic smart TOON gateway for LLM data workloads: it converts JSON payloads to TOON only when conversion is likely to improve token efficiency.
Raw JSON is often verbose in prompts. TOON can save tokens, but blind conversion can also make payloads worse. datoon adds a decision layer so pipelines can convert when savings are meaningful, skip when structure is a poor TOON fit, and always report exactly why the decision was made.
Before: JSON payload in the prompt
{"users":[{"id":1,"name":"Ada","role":"admin"},{"id":2,"name":"Lin","role":"analyst"},{"id":3,"name":"Grace","role":"viewer"}]}
After: datoon converts only because policy says it should
users[3]{id,name,role}:
1,Ada,admin
2,Lin,analyst
3,Grace,viewer
{
"decision": "convert",
"reason": "Estimated savings 44.19% (threshold 15.00%).",
"input_token_estimate": 43,
"output_token_estimate": 24
}
Same records, same keys, smaller prompt payload. If the structure is non-uniform, tiny, deeply nested, or conversion saves too little, datoon keeps JSON instead.
Auto mode returns compact JSON instead of TOON when the payload is not a good fit: no uniform object arrays, fewer than the configured minimum rows, nesting deeper than --max-depth, estimated savings below --min-savings, or a missing/unavailable TOON CLI dependency. Invalid JSON still fails loudly, and --force bypasses gating for experiments but does not hide conversion failures.
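The gating rules above can be read as a short decision function. What follows is a minimal sketch, not datoon's actual implementation: `Gate`, `savings_ratio`, and `decide` are hypothetical names, the order of the checks is an assumption, and the thresholds simply mirror the documented defaults (15% savings, depth 6, 3 uniform rows). Token estimates are passed in as plain numbers here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    """Hypothetical thresholds mirroring the documented defaults."""
    min_savings: float = 0.15
    max_depth: int = 6
    min_uniform_rows: int = 3


def savings_ratio(input_tokens: int, output_tokens: int) -> float:
    """Relative token savings; negative when the output is larger."""
    if input_tokens <= 0:
        return 0.0
    return (input_tokens - output_tokens) / input_tokens


def decide(uniform_rows: int, depth: int, input_tokens: int,
           output_tokens: int, gate: Gate = Gate()) -> tuple[str, str]:
    """Return (decision, reason) for one payload."""
    if uniform_rows == 0:
        return "skip", "No uniform object arrays."
    if uniform_rows < gate.min_uniform_rows:
        return "skip", f"Only {uniform_rows} uniform rows (minimum {gate.min_uniform_rows})."
    if depth > gate.max_depth:
        return "skip", f"Depth {depth} exceeds maximum {gate.max_depth}."
    ratio = savings_ratio(input_tokens, output_tokens)
    reason = f"Estimated savings {ratio:.2%} (threshold {gate.min_savings:.2%})."
    if ratio < gate.min_savings:
        return "skip", reason
    return "convert", reason


# The users example: 43 estimated input tokens, 24 after conversion.
print(decide(uniform_rows=3, depth=2, input_tokens=43, output_tokens=24))
# ('convert', 'Estimated savings 44.19% (threshold 15.00%).')
```

The reason string reproduces the one in the sample report because (43 - 24) / 43 is about 0.4419, which clears the 0.15 threshold.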
- Setup
- Install
- Quick Start
- Python API
- MCP Server
- Claude Code Plugin
- CLI Reference
- Benchmarks
- Development
- Docs
One-command setup: installs all dependencies and registers datoon globally so it's available in any terminal:
./setup.sh
What it does:
- Checks Python 3.12+ (fails fast if missing)
- Installs uv if not present
- Warns if Node.js is missing (required for TOON conversion)
- Runs uv sync --extra dev
- Registers the datoon CLI globally via uv tool install --editable .
- Adds ~/.local/bin to your shell profile if needed
After setup:
datoon --help
echo '{"users":[{"id":1,"name":"Ada"}]}' | datoon --report-stdout
Full install options for Python, CLI, MCP, Claude Code, and Codex live in INSTALL.md.
For integration installs, preview first with ./install.sh --dry-run, then use ./install.sh --install --target <claude|codex|mcp>.
The installer only changes selected local integration config: Claude Code plugin state through the claude CLI, Codex marketplace JSON, and/or an MCP JSON config. See INSTALL.md for exact paths, backups, privacy notes, and troubleshooting.
# via uv (recommended)
uv add datoon
# via pip
pip install datoon
# with optional tiktoken-based token counting
pip install "datoon[tokens]"
# with MCP server support
pip install "datoon[mcp]"
Requires Python 3.12+. TOON conversion requires Node.js with npx in PATH.
datoon itself processes payloads locally. The Python package has no required runtime dependencies; optional extras add token estimation (tiktoken) or MCP server support (mcp). The converter invokes npx --yes @toon-format/cli@2 only when auto policy reaches the conversion step or when --force is used.
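Because conversion depends on npx being on PATH, a caller may want to check for it before forcing a conversion. The sketch below uses only the standard library. `toon_cli_command` and `toon_cli_available` are hypothetical helpers, not part of datoon's API, and only the base command quoted above is assumed; no further CLI arguments are guessed.

```python
import shutil


def toon_cli_command() -> list[str]:
    """Base command quoted in this README; extra CLI arguments are not assumed."""
    return ["npx", "--yes", "@toon-format/cli@2"]


def toon_cli_available() -> bool:
    """True when an npx executable can be found on PATH."""
    return shutil.which(toon_cli_command()[0]) is not None


if not toon_cli_available():
    # Per the auto-mode rules, a missing TOON CLI means compact JSON is kept.
    print("npx not found: auto mode would keep compact JSON")
```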
stdin:
echo '{"users":[{"id":1,"name":"Ada"},{"id":2,"name":"Lin"}]}' | datoon --report-stdout
file:
datoon ./input.json -o ./output.toon --report ./report.json
force conversion:
datoon ./input.json --force --report-stdout
from datoon import convert_json_for_llm, ConversionConfig, DatoonError
config = ConversionConfig(
min_savings_ratio=0.15, # skip if savings below 15%
max_depth=6, # skip if nesting deeper than 6
min_uniform_rows=3, # require at least 3 uniform rows
toon_cli_timeout=30, # seconds before CLI call is aborted
force=False,
)
try:
outcome = convert_json_for_llm(raw_json, config)
except DatoonError as exc:
print(f"conversion failed: {exc}")
raise
# outcome.payload_text: TOON or original JSON depending on decision
# outcome.report.decision: "convert" | "skip"
# outcome.report.reason: human-readable explanation
# outcome.report.savings_ratio: float, e.g. 0.281
send_to_model(outcome.payload_text)
Structure-only analysis (no Node.js required):
from datoon.analyzer import analyze_payload
from datoon.models import ConversionConfig
analysis = analyze_payload(parsed_data, ConversionConfig())
print(analysis.is_candidate, analysis.reason)
datoon ships an MCP server with two tools:
| Tool | Description |
|---|---|
| convert_json | Full conversion with policy gating |
| analyze_json | Structure analysis only; no Node.js needed |
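Structure analysis needs no Node.js because it only inspects the parsed JSON. As a rough illustration of the kind of checks involved, the sketch below measures nesting depth and finds the largest uniform object array. It is not the code behind analyze_payload or analyze_json: `nesting_depth` and `uniform_rows` are hypothetical helpers, and both the depth-counting convention and the "same key set" definition of uniformity are assumptions.

```python
from typing import Any


def nesting_depth(value: Any) -> int:
    """Depth of nested containers; scalars count as 0."""
    if isinstance(value, dict):
        return 1 + max((nesting_depth(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return 1 + max((nesting_depth(v) for v in value), default=0)
    return 0


def uniform_rows(value: Any) -> int:
    """Row count of the largest list whose items are dicts sharing one key set."""
    best = 0
    if isinstance(value, list):
        if value and all(isinstance(item, dict) for item in value):
            if len({frozenset(item) for item in value}) == 1:
                best = len(value)
        children = value
    elif isinstance(value, dict):
        children = list(value.values())
    else:
        return 0
    # Recurse so that arrays nested under other keys are also considered.
    return max([best, *(uniform_rows(child) for child in children)])


payload = {"users": [
    {"id": 1, "name": "Ada", "role": "admin"},
    {"id": 2, "name": "Lin", "role": "analyst"},
    {"id": 3, "name": "Grace", "role": "viewer"},
]}
print(nesting_depth(payload), uniform_rows(payload))  # 3 3
```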
Run locally:
datoon-mcp
Claude Desktop / Cursor / Windsurf config:
{
"mcpServers": {
"datoon": {
"command": "uvx",
"args": ["datoon[mcp]", "datoon-mcp"]
}
}
}
Install directly from GitHub:
claude plugin marketplace add andrii-su/datoon
claude plugin install datoon@datoon
Trigger in-session:
/datoon
convert this JSON to TOON if it saves tokens
use datoon mode for structured data
| Flag | Default | Description |
|---|---|---|
| --force | false | Bypass gating and minimum savings threshold |
| --min-savings | 0.15 | Minimum relative token savings required |
| --max-depth | 6 | Maximum nesting depth for auto-conversion |
| --min-uniform-rows | 3 | Minimum rows in uniform object arrays |
| --timeout | 30 | Seconds before TOON CLI call is aborted |
| --report <path> | none | Write JSON conversion report to file |
| --report-stdout | none | Print JSON conversion report to stderr |
| -o <path> | stdout | Output file path |
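To see how the defaults in the table combine with overrides, the flags can be mirrored in a small argparse parser. This is a hypothetical reconstruction for illustration only, not datoon's real CLI code: `build_parser`, the positional `input` argument, and the attribute names are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flag table above."""
    parser = argparse.ArgumentParser(prog="datoon")
    parser.add_argument("input", nargs="?", help="JSON file; stdin when omitted")
    parser.add_argument("--force", action="store_true")
    parser.add_argument("--min-savings", type=float, default=0.15)
    parser.add_argument("--max-depth", type=int, default=6)
    parser.add_argument("--min-uniform-rows", type=int, default=3)
    parser.add_argument("--timeout", type=float, default=30)
    parser.add_argument("--report", metavar="PATH")
    parser.add_argument("--report-stdout", action="store_true")
    parser.add_argument("-o", dest="output", metavar="PATH")
    return parser


# Override one threshold; everything else keeps its documented default.
args = build_parser().parse_args(
    ["./input.json", "--min-savings", "0.25", "--report-stdout"]
)
print(args.min_savings, args.max_depth, args.min_uniform_rows, args.force)
# 0.25 6 3 False
```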
PYTHONPATH=src python benchmarks/run.py --dry-run
PYTHONPATH=src python benchmarks/run.py
PYTHONPATH=src python benchmarks/run.py --update-readme
python scripts/summarize_agent_skill_eval.py
Auto mode avoids low-benefit and high-risk payloads (orders-nested, mixed-non-uniform) while matching forced TOON's average token count on suitable ones. Every decision comes with a reasoned report.
| Scenario | JSON Baseline | Forced TOON | datoon Auto |
|---|---|---|---|
| Average tokens | 77 | 50 | 50 |
| Avg tokens saved | 0.0% | 26.8% | 28.1% |
| Decision quality | n/a | Converts all | Converts 3/5, skips harmful cases |
| Dataset | JSON | TOON (forced) | Raw Saved | Auto | Auto Tokens | Auto Saved |
|---|---|---|---|---|---|---|
| users-small | 42 | 23 | 45.2% | convert | 23 | 45.2% |
| events-medium | 148 | 84 | 43.2% | convert | 84 | 43.2% |
| orders-nested | 70 | 69 | 1.4% | skip | 70 | 0.0% |
| mixed-non-uniform | 26 | 28 | -7.7% | skip | 26 | 0.0% |
| metrics-wide | 100 | 48 | 52.0% | convert | 48 | 52.0% |
| Average | 77 | 50 | 26.8% | 3/5 convert | 50 | 28.1% |
Forced conversion succeeded for 5/5 payloads.
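The summary row can be re-derived from the per-dataset rows. The sketch below is a plain recomputation, not the benchmark script: it assumes "Avg saved" is the mean of the per-dataset percentages rather than the savings on summed tokens. That assumption would explain why auto mode shows a higher average saving (28.1%) than forced TOON (26.8%) even though both average to 50 tokens after rounding.

```python
# (dataset, JSON tokens, forced TOON tokens, auto decision) copied from the table above.
ROWS = [
    ("users-small", 42, 23, "convert"),
    ("events-medium", 148, 84, "convert"),
    ("orders-nested", 70, 69, "skip"),
    ("mixed-non-uniform", 26, 28, "skip"),
    ("metrics-wide", 100, 48, "convert"),
]


def saved(json_tokens: int, out_tokens: int) -> float:
    """Percentage of tokens saved relative to the JSON baseline."""
    return 100 * (json_tokens - out_tokens) / json_tokens


def mean(values: list[float]) -> float:
    return sum(values) / len(values)


# Auto mode emits TOON only for "convert" rows and keeps JSON otherwise.
auto_tokens = [toon if decision == "convert" else js for _, js, toon, decision in ROWS]

avg_json = mean([js for _, js, _, _ in ROWS])
avg_forced = mean([toon for _, _, toon, _ in ROWS])
avg_auto = mean(auto_tokens)
forced_saved = mean([saved(js, toon) for _, js, toon, _ in ROWS])
auto_saved = mean([saved(row[1], out) for row, out in zip(ROWS, auto_tokens)])

print(f"{avg_json:.1f} {avg_forced:.1f} {avg_auto:.1f}")  # 77.2 50.4 50.2
print(f"{forced_saved:.1f}% {auto_saved:.1f}%")           # 26.8% 28.1%
```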
We also ran an artifact-based subagent comparison with identical analysis tasks in two modes:
- with_skill: agent received the datoon skill and followed the conversion workflow.
- without_skill: agent used the JSON payload directly and was instructed not to use TOON or datoon.
The test covered 3 payload sizes (small, medium, large) across 3 deterministic iterations in each of the 2 modes, for 18 total agent runs. Both modes produced exact expected answers in every run.
| Scenario | Avg JSON Tokens | Avg TOON Tokens | Avg Payload Saved |
|---|---|---|---|
| small | 225.33 | 118.00 | 47.63% |
| medium | 2,972.00 | 1,138.00 | 61.71% |
| large | 17,757.00 | 6,673.00 | 62.42% |
| Mode | Runs | Correct | Accuracy |
|---|---|---|---|
| with skill | 9 | 9 | 100% |
| without skill | 9 | 9 | 100% |
Full report and raw outputs live in benchmarks/agent_skill_eval/. The subagent tool did not expose full model token usage for each run, so this report claims payload-token savings only, not total end-to-end model-token savings.
Contributor workflow is documented in CONTRIBUTING.md. Maintainer/source-of-truth notes for agents live in CLAUDE.md.
Setup:
uv sync --extra dev
uvx pre-commit install
Tests:
# unit only
pytest -m "not integration"
# with integration (requires Node.js + npx)
pytest
Pre-commit:
uvx pre-commit run --all-files
Skill sync check:
python scripts/validate_skill_sync.py
python scripts/validate_plugin_metadata.py
- Live site: andrii-su.github.io/datoon
- Source: docs/
See SECURITY.md for vulnerability reporting and response policy.