TraceWeft

TraceWeft is an open-source Rust-first observability and debugging toolkit for LLM agents. It captures model calls, tool calls, memory operations, retrievals, state transitions, checkpoints, handoffs, and errors as structured traces, then lets developers inspect, replay, diff, and export them through OpenTelemetry-compatible pipelines.

TraceWeft is local-first by default: run a Rust agent, open the local debugger, and inspect the full execution without sending prompts or tool outputs to a SaaS service.

Screenshots

The local trace workbench (trace-weft dev plus the web UI):

Span tree & inspector	Trace graph

Install

TraceWeft is published on crates.io. Add it to your project:

[dependencies]
trace-weft = "0.3"

The SDK is sqlite-by-default (a SQLite mirror alongside the JSONL stream). For a pure local-JSONL integrator that pulls no sqlx:

[dependencies]
trace-weft = { version = "0.3", default-features = false }

Requires Rust 1.96+ (edition 2024). The trace-weft CLI is not published to crates.io; install it from the repository:

cargo install --git https://github.com/kidoz/trace-weft trace-weft-cli

Quickstart

Initialize a recorder once at startup, then record spans. The recorder is a process-wide singleton; init_local wires the JSONL + SQLite recorder and the blob store for content capture.

use trace_weft::{CapturePolicy, LocalConfig, init_local};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    init_local(LocalConfig {
        database_path: "./.trace-weft/traces.jsonl".into(),
        sqlite_db_path: "./.trace-weft/traces.sqlite".into(),
        blob_dir: "./.trace-weft/blobs".into(),
        capture_content: CapturePolicy::RedactedPreview,
    })
    .await?;

    let answer = answer_question("why is the sky blue?".into()).await?;
    println!("{answer}");
    Ok(())
}

For tests and evaluation, swap in an in-memory store:

use std::sync::Arc;
use trace_weft::{eval::MemoryStore, init_custom};

let store = MemoryStore::new();
init_custom(Arc::new(store.clone()))?;
// ... run the agent, then assert over store.spans / store.events

To compile tracing in but disable it at runtime, use the no-op store: init_custom(Arc::new(trace_weft::NullStore)).

The builder — the primary API

SpanBuilder is the capable, imperative recorder. Build a span, set the rich fields you have on hand, and wrap the work in .run(...):

use trace_weft::{build_llm_call, CostEstimate, TokenUsage};

let answer = build_llm_call("chat_completion")
    .provider("anthropic")
    .model("claude-fable-5")
    .prompt_version("v3")
    .input_ref("prompt", &prompt)
    .token_usage(TokenUsage { input: 1200, output: 280, reasoning: None, breakdown: Default::default() })
    .cost(CostEstimate { currency: "USD".into(), amount: 0.012 })
    .cache_hit(false)
    .attribute("temperature", serde_json::json!(0.2))
    .run(|| async {
        // the real call
        client.complete(&prompt).await
    })
    .await?;

What `.run` captures

SpanBuilder::run(f) executes the closure and records:

latency — start_time/end_time/latency_ms around the closure;
status — Ok on Ok(_), Error on Err(e) (with error_type from Debug and a redacted message from Display);
parenting — auto-links to the ambient span context (see below) unless you set one explicitly with .with_parent(...);
replay — short-circuits to a mocked value when replay is configured.

It does not infer inputs/outputs by itself — set .input_ref(label, &value) or .output_ref(label, &value) for values you want captured under the active CapturePolicy (or use the macros, which capture arguments and successful returns for you). If you already have a content-addressed blob, use .input_blob_ref(blob_ref) / .output_blob_ref(blob_ref). For closures that don't return Result, use .run_infallible(|| async { ... }).

Builder setters cover the high-value SpanRecord fields: .provider(), .model(), .prompt_version(), .tool_name(), .input_ref(), .output_ref(), .token_usage(), .cost(), .cache_hit(), .retrieval(query_hash, doc_refs), .attribute(k, v), .attributes(map).

Response-derived fields: `run_with`

Token usage, cost, and cache status usually only exist on the provider's response — after the builder's setters have run. run_with passes the closure a SpanHandle; whatever the closure sets on it is merged into the span when the closure finishes (on success and error), taking precedence over the builder's values:

let response = build_llm_call("chat_completion")
    .provider("openrouter")
    .model(&model)
    .run_with(|span| async move {
        let response = client.complete(&request).await?;
        span.token_usage(TokenUsage {
            input: response.usage.prompt_tokens,
            output: response.usage.completion_tokens,
            reasoning: None,
            breakdown: Default::default(),
        });
        span.cost(CostEstimate { currency: "USD".into(), amount: response.usage.cost });
        Ok::<_, anyhow::Error>(response)
    })
    .await?;

See examples/openrouter-agent for this pattern against a live provider.

Wrapping an `LlmClient` / `Tool`

Wrap each trait method body in a builder call — the span kind matches the role:

#[async_trait::async_trait]
impl LlmClient for AnthropicClient {
    async fn complete(&self, req: Request) -> anyhow::Result<Response> {
        trace_weft::build_llm_call("complete")
            .provider("anthropic")
            .model(&req.model)
            .run(|| async { self.inner.complete(req).await })
            .await
    }
}

#[async_trait::async_trait]
impl Tool for KbSearch {
    async fn call(&self, query: String) -> anyhow::Result<Vec<Doc>> {
        trace_weft::build_tool("kb_search")
            .tool_name("kb_search")
            .run(|| async { self.search(query).await })
            .await
    }
}

Macros

#[agent], #[tool], and #[llm_call] instrument a function (free function or impl/trait-impl method, including &self) and record a span with the matching SpanKind. They:

set the correct span kind and Error status when the body returns Err;
auto-link to the ambient span context, so nested instrumented calls form a tree with no manual ID threading;
capture inputs/outputs under the configured CapturePolicy — every captured argument must be Serialize; opt an argument out with #[trace(skip)].

use trace_weft::{agent, llm_call, tool};

struct ResearchAgent { client: AnthropicClient }

impl ResearchAgent {
    #[agent]
    async fn run(&self, question: String) -> anyhow::Result<String> {
        let plan = self.plan(&question).await?;       // child span, auto-parented
        self.draft(plan).await
    }

    #[llm_call]
    async fn plan(&self, question: &str, #[trace(skip)] api_key: &str) -> anyhow::Result<Plan> {
        // `question` is captured to input_ref; `api_key` is skipped
        self.client.plan(question, api_key).await
    }
}

Trait definitions carry no body, so annotate the concrete impl. The instrumented function must be async.

Capture policy

CapturePolicy (set via LocalConfig.capture_content) governs content capture:

Policy	Behavior
`MetadataOnly`	No content captured (zero serialization cost).
`RedactedPreview`	Redacts content, stores it, sets a redacted preview.
`FullContentLocalOnly` / `FullContentExportable`	Stores full content while keeping the preview redacted.

Captured content is hashed, written to the blob store, and referenced from the span as input_ref / output_ref.

Events

Spans have duration; events are point-in-time occurrences inside a span (a retry, a budget check, a guardrail trip, an REPL step). Build one with event and .record() it — it auto-links to the ambient span and gets a monotonic ordering seq:

use trace_weft::{event, EventKind};

event(EventKind::Budget, "budget_check")
    .attribute("tokens_remaining", serde_json::json!(1500))
    .record()
    .await;

EventKind: LlmCall, ToolCall, ReplExec, Rpc, Budget, Guardrail, Retry, Termination, Log, Custom.

Ambient context

SpanBuilder::run and the macros install the current span as a task-local parent for the duration of the body. Child spans and events created inside it link automatically; with_parent(trace_id, run_id, span_id) is the explicit override for cross-task / cross-thread handoffs. Construct IDs with TraceId::new(), SpanId::new(), RunId::new() — no direct uuid dependency required.

Human-in-the-loop breakpoints

SpanBuilder::wait_for_approval() records the span as PendingApproval, blocks until the UI/server approves or rejects, then resumes — a debugger-style breakpoint for risky actions, and a differentiator over plain OpenTelemetry:

match build_tool("transfer_funds").wait_for_approval().await? {
    trace_weft::HitlResponse::Approved(args) => execute(args).await,
    trace_weft::HitlResponse::Rejected(reason) => bail!("rejected: {reason}"),
}

Local Dev Workflow

Once your application produces traces into .trace-weft/traces.sqlite, inspect them visually:

trace-weft dev          # starts the local axum API on :3000 (local-first)

trace-weft dev starts only the API server (port 3000 by default, local-first auth). To view the React UI in a browser, run it against that API:

npm --prefix apps/web install
npm --prefix apps/web run dev   # Vite dev server on :5173, proxies /api → :3000

Then open http://localhost:5173 for the Trace List, Span Tree, Waterfall, and Replay/Diff UI. Alternatively, the desktop app (apps/desktop) bundles the built UI and embeds the API server in one window.

The UI's API base is import.meta.env.VITE_API_BASE (empty by default → same -origin /api, which the Vite dev server proxies); the desktop build sets it to the embedded server's http://127.0.0.1:3000.

Where traces are written

The paths in LocalConfig are resolved by the traced application's process: a relative path like ./.trace-weft/traces.jsonl lands in whatever directory the agent runs from, so each project keeps its own isolated .trace-weft/ store. There is no global store.

To inspect traces produced by an agent in another project:

Run the workbench from that project's root:
```
cd ~/projects/my-bot && trace-weft dev
```
The default --db-path (./.trace-weft/traces.sqlite) resolves against the current working directory.
Point at the database explicitly:
```
trace-weft dev --db-path ~/projects/my-bot/.trace-weft/traces.sqlite
```
The blob directory defaults to blobs next to the database, so captured content (prompts, tool outputs) resolves without further flags. If the blobs live elsewhere, pass --blob-dir /path/to/blobs.

Server: storage backends and authentication

The server query endpoints (/api/traces, /api/traces/{id}, /api/evals) run against either backend, selected by the database URL:

SQLite (default, local-first) — any path or sqlite:// URL.
Postgres — a postgres:// / postgresql:// URL. The schema is created on first connect and the same endpoints serve it with output matching SQLite. A local instance for development and the Postgres-backed tests is provided via docker compose up -d postgres; run the parity suite with just test-pg.

API-key authentication and per-project tenant isolation are configured via environment variables:

TRACE_WEFT_API_KEYS — comma-separated raw_key:project_id pairs. Keys are hashed (SHA-256) at startup and never stored in the clear; requests authenticate with Authorization: Bearer <raw_key>. Trace queries are scoped to the key's project, and ingested spans are stamped with it server-side.
TRACE_WEFT_DEV_MODE=1 — enable the dev bypass (no key required; queries span all tenants). The embedded local server (trace-weft dev, desktop app) defaults this on when no keys are configured; production start_server use defaults it off, rejecting unauthenticated requests with 401.

OTLP/HTTP JSON ingestion is served at POST /v1/traces: payloads are decoded by the trace-weft-ingest crate (opentelemetry-proto types, preserving original trace/span/parent IDs, 400 for malformed bodies), then — like /api/v1/batch — the authenticated project is stamped onto every span before it is persisted. A 400 is returned for malformed bodies.

Development Checks

Workspace policy, API contract generation, CI checks, and UI test commands are documented in DEVELOPMENT.md. The short version:

cargo fmt --all -- --check
cargo clippy --workspace --all-targets --all-features -- -D warnings
cargo nextest run --workspace --all-features
cargo deny check
cargo machete
npm --prefix apps/web run generate:api
npm --prefix apps/web run lint
npm --prefix apps/web run test
npm --prefix apps/web run build

Crate Layout

crates/trace-weft - main user-facing SDK facade (builder, macros, events, capture, HITL, replay)
crates/trace-weft-core - IDs, schemas, span/event types, redaction traits
crates/trace-weft-macros - proc macros: #[agent], #[tool], #[llm_call]
crates/trace-weft-otel - OpenTelemetry export/import bridge
crates/trace-weft-openinference - OpenInference compatibility mapping
crates/trace-weft-recorder - local JSONL/SQLite/blob recorder (sqlite feature, on by default)
crates/trace-weft-ingest - OTLP ingestion via opentelemetry-proto, preserving original IDs
crates/trace-weft-server - axum API, SQLite + Postgres query layer, API-key auth and tenant scoping
crates/trace-weft-cli - CLI: dev (starts the local API server); import/export/replay are planned
apps/web - React / TypeScript / Vite UI

Name		Name	Last commit message	Last commit date
Latest commit History 121 Commits
.github/workflows		.github/workflows
apps		apps
crates		crates
docs/screenshots		docs/screenshots
examples		examples
CHANGELOG.md		CHANGELOG.md
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
DEVELOPMENT.md		DEVELOPMENT.md
LICENSE		LICENSE
README.md		README.md
deny.toml		deny.toml
docker-compose.yml		docker-compose.yml
justfile		justfile

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

TraceWeft

Screenshots

Install

Quickstart

The builder — the primary API

What `.run` captures

Response-derived fields: `run_with`

Wrapping an `LlmClient` / `Tool`

Macros

Capture policy

Events

Ambient context

Human-in-the-loop breakpoints

Local Dev Workflow

Where traces are written

Server: storage backends and authentication

Development Checks

Crate Layout

About

Uh oh!

Releases 7

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

TraceWeft

Screenshots

Install

Quickstart

The builder — the primary API

What .run captures

Response-derived fields: run_with

Wrapping an LlmClient / Tool

Macros

Capture policy

Events

Ambient context

Human-in-the-loop breakpoints

Local Dev Workflow

Where traces are written

Server: storage backends and authentication

Development Checks

Crate Layout

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 7

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

What `.run` captures

Response-derived fields: `run_with`

Wrapping an `LlmClient` / `Tool`

Packages