Ditto-LLM is a small Rust SDK that provides a unified interface for calling multiple LLM providers.
Goal: become a superset of LiteLLM Proxy + Vercel AI SDK Core via layering + Cargo feature gating.
See COMPARED_TO_LITELLM_AI_SDK.md and TODO.md for the parity notes and roadmap.
Layered product plan (L0/L1/L2):
- L0 (this repo): model adapters + protocol/shape conversion + direct SDK usage.
- L1 (this repo): gateway/proxy platform (API surface, routing, budgets, observability, admin).
- L2 (separate repo): enterprise closed-loop platform (prompt/eval/agent eval/org governance).
- Boundary: L2 depends on L1 contracts; L1 remains independently deployable for SMB/mid-market.
- Frozen L1 contract artifacts: contracts/gateway-contract-v0.1.openapi.yaml + crates/ditto-gateway-contract-types.
Current scope:
- Default build: generic OpenAI-compatible LLM core (provider-openai-compatible + cap-llm). This is the stable base and the only default capability promise.
- Unified LLM types + traits: LanguageModel, Message / ContentPart, Tool, StreamChunk, Warning.
- Text helpers: generate_text / stream_text (AI SDK-style generateText / streamText).
- Structured outputs: generate_object_json / stream_object (AI SDK-style generateObject / streamObject).
- Multi-modal inputs at the request shape level: images + PDF documents via ContentPart::Image / ContentPart::File (provider support varies; unsupported parts emit Warning).
- Parameter hygiene: temperature / top_p are clamped to provider ranges; non-finite values are dropped (with warnings).
- Default provider path: OpenAI-compatible Chat Completions (LiteLLM / DeepSeek / Qwen / OpenRouter / local gateways / etc.) with generate + SSE streaming + tools.
- Optional provider packs and capability packs add official OpenAI Responses, embeddings, images, audio, moderations, Google GenAI, Anthropic Messages, Cohere, Bedrock, Vertex, batches, rerank, and gateway translation surfaces.
- Provider profile config and model discovery (ProviderConfig / GET /models) remain available for routing use-cases, but the default examples and docs now assume a generic OpenAI-compatible upstream.
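The parameter-hygiene rule in the list above can be sketched in plain Rust. This is an illustration only, not Ditto's implementation: the helper name sanitize_param, the warning text, and the ranges used in main are assumptions, and real ranges are provider-specific.

```rust
/// Clamp a sampling parameter into `[min, max]` and drop non-finite input.
/// Returns the sanitized value (if any) plus an optional warning message.
/// Hypothetical helper: the name and warning text are illustrative only.
fn sanitize_param(name: &str, value: f32, min: f32, max: f32) -> (Option<f32>, Option<String>) {
    if !value.is_finite() {
        // NaN and +/-infinity cannot be clamped meaningfully, so drop them.
        return (None, Some(format!("{name}: non-finite value dropped")));
    }
    let clamped = value.clamp(min, max);
    if clamped != value {
        return (Some(clamped), Some(format!("{name}: {value} clamped to {clamped}")));
    }
    (Some(value), None)
}

fn main() {
    // The ranges below are assumptions made for this example.
    for (name, value, min, max) in [
        ("temperature", 0.7_f32, 0.0_f32, 2.0_f32),
        ("temperature", 3.5, 0.0, 2.0),
        ("top_p", f32::NAN, 0.0, 1.0),
    ] {
        let (kept, warning) = sanitize_param(name, value, min, max);
        println!("{name}: kept={kept:?} warning={warning:?}");
    }
}
```

Returning the warning next to the value, instead of failing, mirrors the "warn, do not error" behavior described above.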
Optional feature-gated modules:
- Agent tool loop: ToolLoopAgent + ToolExecutor (feature agent).
- Auth adapters: SigV4 signer + OAuth client-credentials flow (feature auth).
- Providers: Bedrock (SigV4) and Vertex (OAuth) adapters with generate + SSE streaming + tools (features provider-bedrock, provider-vertex).
- SDK utilities: stream protocol v1, HTTP adapters (SSE/NDJSON), telemetry sink, devtools JSONL logger, MCP tool adapter, cache middleware with streaming replay (feature sdk).
- SDK HTTP helpers: optional axum response builders for stream adapters (feature sdk-axum).
- Gateway control-plane: virtual keys, limits, cache, budget, routing, guardrails, passthrough, plus a ditto-gateway HTTP server (feature gateway). Includes LiteLLM-like conveniences such as /key/* endpoints, /a2a/* agent proxy, and /mcp* MCP tool gateway.
- Gateway token counting: tiktoken-based input token estimation for proxy budgets/guardrails/costing (feature gateway-tokenizer).
- Gateway translation proxy: OpenAI-compatible GET /v1/models, GET /v1/models/*, POST /v1/chat/completions, POST /v1/completions, POST /v1/responses, POST /v1/responses/compact, POST /v1/responses/input_tokens, GET /v1/responses/*, GET /v1/responses/*/input_items, DELETE /v1/responses/*, POST /v1/embeddings, POST /v1/moderations, POST /v1/images/generations, /v1/videos* (create/list/retrieve/delete/content/remix), POST /v1/audio/transcriptions, POST /v1/audio/translations, POST /v1/audio/speech, /v1/files*, POST /v1/rerank, and /v1/batches backed by Ditto providers (feature gateway-translation).
- Gateway proxy caching: in-memory cache for non-streaming OpenAI-compatible responses (feature gateway-proxy-cache).
- Gateway OpenTelemetry: OTLP tracing exporter + structured logs for gateway HTTP requests (feature gateway-otel).
Non-goals (for now):
- The default build is not an API gateway/proxy; the gateway feature adds a lightweight control-plane + HTTP service. The gateway-translation feature adds translation for GET /v1/models, GET /v1/models/*, POST /v1/chat/completions, POST /v1/completions, POST /v1/responses, POST /v1/responses/compact, POST /v1/responses/input_tokens, GET /v1/responses/*, GET /v1/responses/*/input_items, DELETE /v1/responses/*, POST /v1/embeddings, POST /v1/moderations, POST /v1/images/generations, /v1/videos*, POST /v1/audio/transcriptions, POST /v1/audio/translations, POST /v1/audio/speech, /v1/files*, POST /v1/rerank, and /v1/batches. Translation of the remaining OpenAI surface is tracked in TODO.md.
- Core helpers are single-step and return tool calls to the caller; the agent feature offers an opt-in tool loop, but it is not enabled by default.
- It is not a full UI SDK (no frontend hooks or middleware ecosystem); the sdk feature only provides protocol/telemetry/devtools/MCP utilities.
- Bedrock support targets Anthropic Messages-on-Bedrock; other Bedrock model families and Vertex service-account JWT flows are not covered yet.
See PROVIDERS.md for a pragmatic provider/capability matrix (native adapters + OpenAI-compatible
gateway coverage).
This repo includes an mdBook under docs/.
For the stable docs entrypoints, start with docs/README.md and docs/docs-system-map.md.
Use ./scripts/check-docs-system.sh to verify the repository-level docs skeleton.
cargo install mdbook
mdbook serve docs
If you don't want to install mdBook, you can still read the Markdown directly in docs/src.
Ditto now documents provider integration around three separate axes:
- Default core: provider-openai-compatible + cap-llm is the only out-of-the-box contract.
- Provider packs: provider-openai, provider-anthropic, provider-google, provider-cohere, provider-bedrock, provider-vertex, plus provider-specific packs such as provider-deepseek, provider-kimi, and provider-openrouter.
- Capability packs: cap-llm, cap-embedding, cap-image-generation, cap-image-edit, cap-audio-transcription, cap-audio-speech, cap-moderation, cap-rerank, cap-batch, cap-realtime.
The intended boundary is:
- provider selects the runtime adapter/provider pack.
- ProviderConfig configures one concrete upstream node for that runtime.
- GenerateRequest.provider_options stays request-scoped.
See PROVIDERS.md for the provider × capability × feature × status table.
For Google function calling, Ditto-LLM converts tool parameter JSON Schema into an OpenAPI-style schema.
Contract:
- Conversion is best-effort and lossy: unsupported keywords are dropped rather than treated as errors.
- Unsupported keywords may emit Warning::Compatibility(tool.parameters.unsupported_keywords) to avoid silent data loss.
- $ref is best-effort: local refs (#/...) are resolved; unresolvable refs are ignored and a Warning::Compatibility(tool.parameters.$ref) is emitted.
- Root empty-object schemas (no properties + additionalProperties missing/false) are treated as "no parameters" and omitted.
- Boolean schemas (true/false) are treated as unconstrained schemas; at the root they are omitted.
- Nullable unions:
  - type: ["string", "null"] becomes anyOf: [{ "type": "string" }] + nullable: true
  - anyOf: [{...}, {"type":"null"}] becomes the same shape (single branch is flattened)
- const becomes enum: [<const>].
- additionalProperties supports boolean and nested schemas.
Supported keywords (subset): type, title, description, properties, required, items,
additionalProperties, enum, const, format, allOf, anyOf, oneOf, default,
minLength/maxLength/pattern, minItems/maxItems/uniqueItems,
minProperties/maxProperties, minimum/maximum/multipleOf,
and exclusiveMinimum/exclusiveMaximum (number form → minimum/maximum + exclusive* = true).
Default-core examples expect a generic OpenAI-compatible upstream:
export OPENAI_COMPAT_BASE_URL="https://your-openai-compatible-endpoint/v1"
export OPENAI_COMPAT_MODEL="your-chat-model"
export OPENAI_COMPAT_API_KEY="sk-..." # optional for local gateways that do not require auth
cargo run --example basic
cargo run --example streaming
cargo run --example tool_calling
cargo run --example openai_compatible
Additional provider/capability examples stay opt-in:
cargo run --example openai_compatible_embeddings --features cap-embedding
cargo run --example embeddings --features "provider-openai cap-embedding"
cargo run --example multimodal --features "provider-openai cap-llm base64" -- ./image.png ./doc.pdf
cargo run --example batches --features "provider-openai-compatible cap-batch" -- ./requests.jsonl
Run the HTTP gateway (feature gateway):
cargo run -p ditto-server --features gateway --bin ditto-gateway -- ./gateway.json --listen 0.0.0.0:8080
YAML config is optional (feature gateway-config-yaml):
cargo run --features gateway-config-yaml --bin ditto-gateway -- ./gateway.yaml --listen 0.0.0.0:8080
Optional admin UI asset (React; outside the default core build/CI path):
pnpm install
pnpm run dev:admin-ui
Minimal multi-language gateway clients:
- Node (SSE streaming): examples/clients/node/stream_chat_completions.mjs
- Python: examples/clients/python/chat_completions.py
- Go: examples/clients/go/chat_completions.go
Backends are configured in gateway.json (OpenAI-compatible upstreams + injected headers/query params, e.g. Authorization and Azure-style api-version):
{
  "backends": [
    {
      "name": "primary",
      "base_url": "https://api.openai.com/v1",
      "max_in_flight": 64,
      "timeout_seconds": 60,
      "headers": { "authorization": "Bearer ${OPENAI_API_KEY}" },
      "query_params": {}
    }
  ],
  "virtual_keys": [
    {
      "id": "local-dev",
      "token": "${DITTO_VIRTUAL_KEY}",
      "enabled": true,
      "limits": {},
      "budget": {},
      "cache": {},
      "guardrails": {},
      "passthrough": {},
      "route": null
    }
  ],
  "router": { "default_backends": [{ "backend": "primary", "weight": 1.0 }], "rules": [] }
}
backends[].max_in_flight optionally caps concurrent in-flight proxy requests per backend (rejects with HTTP 429 + OpenAI-style error code inflight_limit_backend).
backends[].timeout_seconds optionally overrides the backend request timeout in seconds (default: 300s).
Gateway config supports ${ENV_VAR} interpolation in backend base_url/headers/query_params, backend provider_config node fields (for example base_url, default_model, http_headers, http_query_params, auth, upstream_api, normalize_to, normalize_endpoint), virtual_keys[].token, a2a_agents[] (agent url/headers/query), and mcp_servers[] (server url/headers/query) (expanded at startup via the process env or --dotenv).
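The ${ENV_VAR} expansion described above can be illustrated with a small std-only sketch. It is not the gateway's actual code: the function name interpolate and the choice to return an error for missing variables or unterminated placeholders are assumptions. A real caller could pass a lookup such as |name| std::env::var(name).ok().

```rust
/// Expand `${NAME}` placeholders in `input` using `lookup`.
/// Returns Err when a variable has no value or a placeholder is unterminated.
/// The error behavior is an assumption made for this sketch.
fn interpolate<F>(input: &str, lookup: F) -> Result<String, String>
where
    F: Fn(&str) -> Option<String>,
{
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(start) = rest.find("${") {
        // Copy the literal text before the placeholder.
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        let end = after
            .find('}')
            .ok_or_else(|| format!("unterminated placeholder in {input:?}"))?;
        let name = &after[..end];
        let value = lookup(name).ok_or_else(|| format!("missing env var {name}"))?;
        out.push_str(&value);
        rest = &after[end + 1..];
    }
    out.push_str(rest);
    Ok(out)
}

fn main() {
    // A fixed lookup stands in for the process env / --dotenv values.
    let lookup = |name: &str| match name {
        "OPENAI_API_KEY" => Some("sk-test".to_string()),
        _ => None,
    };
    println!("{:?}", interpolate("Bearer ${OPENAI_API_KEY}", lookup));
    println!("{:?}", interpolate("${DITTO_VIRTUAL_KEY}", lookup));
}
```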
Translation backends (feature gateway-translation) can be configured with provider + provider_config (same shape as ProviderConfig):
{
  "backends": [
    {
      "name": "anthropic",
      "provider": "anthropic",
      "provider_config": {
        "auth": { "type": "api_key_env", "keys": ["ANTHROPIC_API_KEY"] },
        "default_model": "claude-3-5-sonnet-20241022"
      }
    }
  ],
  "virtual_keys": [
    {
      "id": "local-dev",
      "token": "${DITTO_VIRTUAL_KEY}",
      "enabled": true,
      "limits": {},
      "budget": {},
      "cache": {},
      "guardrails": {},
      "passthrough": {},
      "route": null
    }
  ],
  "router": { "default_backends": [{ "backend": "anthropic", "weight": 1.0 }], "rules": [] }
}
provider selects the runtime adapter; provider_config only provides the concrete upstream node settings for that adapter.
For OpenAI-compatible upstreams, provider can be openai-compatible/openai_compatible or a LiteLLM-style alias (e.g. groq, mistral, deepseek, qwen, together, fireworks, xai, perplexity, openrouter, ollama, azure).
Routing (optional):
- router.default_backends: weighted primary selection (seeded by x-request-id when proxying)
- router.rules[].backends: per-model-prefix weighted backends (falls back to router.default_backends when empty)
- If multiple backends are selected, the OpenAI-compatible proxy will fall back to the next backend on network errors.
- With --features gateway-routing-advanced, proxying can also use typed retry/fallback policies for status/network/timeout failures, circuit breaker controls, and active health checks (--proxy-retry* / --proxy-fallback-status-codes / --proxy-network-error-action / --proxy-timeout-error-action / --proxy-circuit-breaker* / --proxy-cb-failure-status-codes / --proxy-health-check*).
- For non-safe HTTP methods, Ditto only continues to the next backend when the client explicitly supplies x-request-id; otherwise a safety guard stops cross-backend retry/fallback to reduce duplicate side effects.
- That guard is not a distributed dedup store. If you need true end-to-end idempotency, enforce it in the upstream application or add request-result dedup persistence at the gateway boundary.
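To make "weighted selection seeded by x-request-id" concrete, here is a minimal std-only sketch. The hashing scheme and the function name pick_backend are assumptions; Ditto's actual selection algorithm may differ. The property it illustrates is that the same request id maps to the same backend for a given weighted list.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Pick one backend by weight, using a hash of the request id as the seed.
/// Hypothetical helper: the hashing scheme is an assumption, not Ditto's algorithm.
fn pick_backend<'a>(backends: &[(&'a str, f64)], request_id: &str) -> Option<&'a str> {
    // Negative (or NaN) weights are treated as zero.
    let total: f64 = backends.iter().map(|&(_, w)| w.max(0.0)).sum();
    if total <= 0.0 {
        return None;
    }
    let mut hasher = DefaultHasher::new();
    request_id.hash(&mut hasher);
    // Keep the top 53 bits of the hash to get a float in [0, 1).
    let unit = (hasher.finish() >> 11) as f64 / (1u64 << 53) as f64;
    let mut point = unit * total;
    for &(name, weight) in backends {
        let w = weight.max(0.0);
        if point < w {
            return Some(name);
        }
        point -= w;
    }
    // Floating-point rounding can leave a tiny remainder; use the last weighted backend.
    backends.iter().rev().find(|(_, w)| *w > 0.0).map(|(name, _)| *name)
}

fn main() {
    let backends = [("primary", 3.0), ("secondary", 1.0)];
    for id in ["req-1", "req-2", "req-3"] {
        println!("{id} -> {:?}", pick_backend(&backends, id));
    }
}
```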
Endpoints:
- OpenAI-compatible proxy (passthrough): ANY /v1/* (e.g. POST /v1/responses, POST /v1/chat/completions, GET /v1/models).
  - LiteLLM-style aliases without a /v1 prefix are accepted (e.g. /chat/completions, /embeddings, /moderations, /files/*, /batches/*, /models/*, /responses/*).
  - OpenAI-compatible /v1/*, MCP /mcp*, and A2A /a2a/* surfaces are fail-closed: requests must include a configured virtual key via Authorization: Bearer <virtual_key> (or x-ditto-virtual-key / x-api-key).
  - The client Authorization header is treated as a virtual key and is not forwarded upstream; the backend headers are applied instead.
  - An empty virtual_keys set means there are no valid client credentials yet, so those surfaces will return 401 until keys are provisioned.
  - If the upstream does not implement POST /v1/responses (returns 404/405/501), Ditto will fall back to POST /v1/chat/completions and return a best-effort Responses-like response/stream (adds x-ditto-shim: responses_via_chat_completions).
- OpenAI-compatible translation (feature gateway-translation): GET /v1/models, GET /v1/models/*, POST /v1/chat/completions, POST /v1/completions, POST /v1/responses, POST /v1/responses/compact, POST /v1/responses/input_tokens, GET /v1/responses/*, GET /v1/responses/*/input_items, DELETE /v1/responses/*, POST /v1/embeddings, POST /v1/moderations, POST /v1/images/generations, /v1/videos* (create/list/retrieve/delete/content/remix), POST /v1/audio/transcriptions, POST /v1/audio/translations, POST /v1/audio/speech, /v1/files*, POST /v1/rerank, and /v1/batches can be served by a backend with provider configured (adds x-ditto-translation: <backend>; GET /v1/models only lists translation models routable for the current virtual key/router path; translated /v1/responses/* retrieve/delete are best-effort, require gateway-scoped ids created by the same running gateway instance, and currently live in a bounded in-memory LRU store).
- Control-plane demo endpoint: POST /v1/gateway (JSON GatewayRequest; accepts Authorization: Bearer <virtual_key>).
- GET /health
- GET /ready
- GET /metrics
- GET /admin/keys (admin token via Authorization or x-admin-token if configured). Defaults to redacted tokens; ?include_tokens=true requires a write or tenant-write admin token and is rejected after keys have been reloaded from one-way hashed persistence.
- GET /admin/config/version, GET /admin/config/versions, and GET /admin/config/versions/:version_id (current/process-local history/detail for control-plane virtual-key config versions; restart rebuilds history from the loaded config as a new bootstrap snapshot; detail supports ?include_tokens=true for secret-managing admins only while original secrets are still in memory).
- GET /admin/config/diff (read-only or write admin token; compares two config versions via from_version_id + to_version_id; include_tokens requires secret-managing admin access and is rejected once only hashed tokens remain).
- GET /admin/config/export (read-only or write admin token; exports current config by default, or a specific version via version_id; include_tokens requires secret-managing admin access and is rejected once only hashed tokens remain).
- POST /admin/config/validate (read-only or write admin token; validates virtual_keys plus optional router payloads with optional expected hashes, without mutating runtime state).
- PUT /admin/config/router (write admin token required; updates router config with backend-reference validation and creates a new config version; supports dry_run).
- MCP tool gateway: ANY /mcp* (JSON-RPC tools/list / tools/call + convenience endpoints), and MCP tool integration for POST /v1/chat/completions and POST /v1/responses via tools: [{"type":"mcp", ...}] (requires a valid virtual key).
- A2A agent gateway: GET /a2a/:agent_id/.well-known/agent-card.json and POST /a2a/* JSON-RPC proxying (requires a2a_agents configured and a valid virtual key).
- POST /admin/keys and PUT|DELETE /admin/keys/:id (requires the write admin token).
- POST /admin/config/rollback (requires the write admin token; restores virtual keys and router to a previous config version; supports dry_run).
- LiteLLM-style key management (requires admin auth): /key/generate, /key/update, /key/regenerate (or /key/:key/regenerate), /key/delete, /key/info, /key/list.
  - /key/list returns key aliases by default; include_tokens=true requires a write or tenant-write admin token and is rejected after keys have been reloaded from one-way hashed persistence.
  - /key/info accepts ?key=... (admin query) or defaults to the Authorization: Bearer <virtual_key> token when ?key is omitted (self lookup).
- POST /admin/proxy_cache/purge (requires the write admin token and --proxy-cache; body can be { "cache_key": "..." } or { "all": true }).
- GET /admin/backends and POST /admin/backends/:name/reset (reset requires the write admin token and --features gateway-routing-advanced).
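The fail-closed virtual-key check on the proxy surfaces can be pictured with the sketch below. The header names come from the endpoint list above; the lookup order (Authorization first, then x-ditto-virtual-key, then x-api-key), the handling of non-Bearer schemes, and the function names are assumptions made for illustration.

```rust
/// Find a header value by case-insensitive name.
fn header<'a>(headers: &[(&str, &'a str)], wanted: &str) -> Option<&'a str> {
    headers
        .iter()
        .find(|(name, _)| name.eq_ignore_ascii_case(wanted))
        .map(|&(_, value)| value)
}

/// Extract the presented virtual key, or None when the request carries none.
/// Hypothetical helper: the precedence between headers is an assumption.
fn presented_virtual_key<'a>(headers: &[(&str, &'a str)]) -> Option<&'a str> {
    if let Some(auth) = header(headers, "authorization") {
        // Only the Bearer scheme carries a virtual key in this sketch.
        if let Some(token) = auth.strip_prefix("Bearer ") {
            return Some(token.trim()).filter(|t| !t.is_empty());
        }
    }
    header(headers, "x-ditto-virtual-key")
        .or_else(|| header(headers, "x-api-key"))
        .filter(|t| !t.is_empty())
}

fn main() {
    let with_key = [("Authorization", "Bearer vk-local-dev")];
    let without_key: [(&str, &str); 0] = [];
    // A missing key means the gateway answers 401 instead of proxying.
    println!("{:?}", presented_virtual_key(&with_key));
    println!("{:?}", presented_virtual_key(&without_key));
}
```

The extracted value would then be matched against the configured virtual_keys; it is never forwarded upstream.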
CLI options:
- --listen HOST:PORT (or --addr HOST:PORT) sets the bind address (default: 127.0.0.1:8080).
- --dotenv PATH loads a dotenv file (KEY=VALUE) for ${ENV_VAR} interpolation and provider auth env lookups.
- --admin-token TOKEN enables /admin/* endpoints (write admin token).
- --admin-token-env ENV loads the write admin token from env (works with --dotenv).
- --admin-read-token TOKEN enables /admin/* read-only endpoints.
- --admin-read-token-env ENV loads the read-only admin token from env (works with --dotenv).
- --backend name=url adds/overrides a backend for POST /v1/gateway (the backend is a URL that accepts GatewayRequest JSON and returns GatewayResponse JSON).
- --upstream name=base_url adds/overrides an OpenAI-compatible upstream backend (in addition to gateway.json).
- --state PATH enables persistence for admin config mutations (virtual_keys + router in GatewayStateFile; loaded on startup; created from gateway.json when missing). Virtual-key tokens are written as one-way sha256: hashes.
- --sqlite PATH enables sqlite persistence for admin config mutations (virtual_keys + router; requires --features gateway-store-sqlite; loaded on startup). Virtual-key tokens are written as one-way sha256: hashes.
- --pg URL / --pg-env ENV enables postgres persistence for admin config mutations (virtual_keys + router) plus audit/budget/cost ledgers (/admin/audit*, /admin/budgets*, /admin/costs*; costs require gateway-costing; requires --features gateway-store-postgres; loaded on startup). Virtual-key tokens are written as one-way sha256: hashes.
- --mysql URL / --mysql-env ENV enables mysql persistence for admin config mutations (virtual_keys + router) plus audit/budget/cost ledgers (/admin/audit*, /admin/budgets*, /admin/costs*; costs require gateway-costing; requires --features gateway-store-mysql; loaded on startup). Virtual-key tokens are written as one-way sha256: hashes.
- --redis URL enables redis persistence for admin config mutations (virtual_keys + router; requires --features gateway-store-redis). Virtual-key tokens are written as one-way sha256: hashes.
- After a restart from any persisted sha256: state/store, Ditto can still authenticate presented virtual-key tokens, but include_tokens=true exports can no longer return the original secret material.
- --redis-env ENV loads the redis URL from env (works with --dotenv; requires --features gateway-store-redis).
- --redis-prefix PREFIX sets the redis key prefix (requires --features gateway-store-redis and --redis / --redis-env).
- --audit-retention-secs SECS sets audit retention for sqlite/pg/mysql/redis stores (0 disables retention; default is 30 days when any persistent store is configured).
- --db-doctor runs store schema checks and exits (startup also performs schema self-check and fails fast on mismatch).
- --json-logs emits JSON log records to stderr.
- --proxy-max-in-flight N limits concurrent in-flight proxy requests (rejects with 429 when exceeded). If omitted, default is 256.
- --proxy-cache enables a best-effort cache for non-streaming OpenAI-compatible responses (requires --features gateway-proxy-cache). When combined with --redis, responses are also cached in Redis (shared across instances).
- --proxy-cache-ttl SECS sets the proxy cache TTL (implies --proxy-cache).
- --proxy-cache-max-entries N sets the in-memory proxy cache capacity (implies --proxy-cache).
- --proxy-cache-max-body-bytes N sets the maximum cached body size per entry (implies --proxy-cache).
- --proxy-cache-max-total-body-bytes N sets the in-memory total cached body budget (implies --proxy-cache).
- --proxy-retry enables retry on retryable statuses (requires --features gateway-routing-advanced).
- --proxy-retry-status-codes CODES overrides retry status codes (comma-separated; implies --proxy-retry).
- --proxy-fallback-status-codes CODES falls back to the next backend when a response status matches (comma-separated; works even when retry is disabled).
- --proxy-network-error-action ACTION controls what to do on transport failures (none, fallback, retry; default: fallback).
- --proxy-timeout-error-action ACTION controls what to do on backend timeouts (none, fallback, retry; default: fallback).
- For non-safe methods (POST/PUT/PATCH/DELETE and similar), cross-backend retry/fallback is guarded unless the client provided x-request-id; when blocked, Ditto emits proxy.request_safety_guard in JSON/devtools logs.
- --proxy-retry-max-attempts N sets max retry attempts (implies --proxy-retry).
- --proxy-circuit-breaker enables a simple circuit breaker (requires --features gateway-routing-advanced).
- --proxy-cb-failure-threshold N sets circuit breaker failure threshold (implies --proxy-circuit-breaker).
- --proxy-cb-cooldown-secs SECS sets circuit breaker cooldown seconds (implies --proxy-circuit-breaker).
- --proxy-cb-failure-status-codes CODES adds extra status codes that should count toward the circuit breaker (for example 408,429).
- --proxy-cb-no-network-errors, --proxy-cb-no-timeout-errors, --proxy-cb-no-server-errors disable individual circuit-breaker failure buckets.
- --proxy-health-checks enables active health checks (requires --features gateway-routing-advanced).
- --proxy-health-check-path PATH overrides the health check request path (implies --proxy-health-checks; default: /v1/models).
- --proxy-health-check-interval-secs SECS sets health check interval seconds (implies --proxy-health-checks).
- --proxy-health-check-timeout-secs SECS sets health check timeout seconds (implies --proxy-health-checks).
- --pricing-litellm PATH loads LiteLLM-style pricing JSON for cost budgets (requires --features gateway-costing).
- --prometheus-metrics enables a Prometheus metrics endpoint (requires --features gateway-metrics-prometheus).
- --prometheus-max-key-series N limits per-key series cardinality (implies --prometheus-metrics).
- --prometheus-max-model-series N limits per-model series cardinality (implies --prometheus-metrics).
- --prometheus-max-backend-series N limits per-backend series cardinality (implies --prometheus-metrics).
- --prometheus-max-path-series N limits per-path series cardinality (implies --prometheus-metrics).
- --devtools PATH enables JSONL request/response logging (requires --features gateway-devtools).
- --otel enables OpenTelemetry tracing export via OTLP (requires --features gateway-otel).
- --otel-endpoint URL overrides the OTLP endpoint (implies --otel).
- --otel-json enables JSON formatted tracing logs (implies --otel).
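The request safety guard mentioned in the options above reduces to a small predicate. The sketch below is illustrative only: the function name is hypothetical, and the set of "safe" methods (GET, HEAD, OPTIONS, TRACE, per the usual HTTP definition) is an assumption about what the gateway treats as safe.

```rust
/// Decide whether a failed request may be retried on, or fall back to, another backend.
/// Safe methods may always move on; other methods need a client-supplied x-request-id.
/// Hypothetical helper; the safe-method set is an assumption.
fn may_try_next_backend(method: &str, client_request_id: Option<&str>) -> bool {
    let safe = matches!(
        method.to_ascii_uppercase().as_str(),
        "GET" | "HEAD" | "OPTIONS" | "TRACE"
    );
    // An empty or whitespace-only id does not count as "explicitly supplied".
    let has_id = client_request_id.map_or(false, |id| !id.trim().is_empty());
    safe || has_id
}

fn main() {
    for (method, id) in [("GET", None), ("POST", None), ("POST", Some("req-42"))] {
        println!("{method} id={id:?} -> {}", may_try_next_backend(method, id));
    }
}
```

As the routing notes say, this only limits duplicate side effects; it does not deduplicate requests.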
Response headers:
- x-ditto-backend: which backend handled the request
- x-ditto-request-id: request id (uses incoming x-request-id or generates one)
- x-ditto-cache: hit when served from the optional proxy cache
- x-ditto-cache-key: cache key for the optional proxy cache (when enabled and cacheable)
- x-ditto-cache-source: memory or redis when x-ditto-cache=hit
- x-ditto-shim: present when POST /v1/responses is shimmed via POST /v1/chat/completions
- x-ditto-translation: present when a translation backend handled the request
If you want to consume a streaming response but still produce a final unified GenerateResponse,
use collect_stream:
use ditto_core::contracts::GenerateRequest;
use ditto_core::llm_core::model::LanguageModel;
use ditto_core::llm_core::stream::collect_stream;
let stream = llm.stream(GenerateRequest::from(messages)).await?;
let collected = collect_stream(stream).await?;
println!("{}", collected.response.text());
Single-step text helpers (no tool execution loop):
use ditto_core::capabilities::text::LanguageModelTextExt;
use ditto_core::contracts::GenerateRequest;
let out = llm.generate_text(GenerateRequest::from(messages)).await?;
println!("{}", out.text);
Streaming:
use futures_util::StreamExt;
use ditto_core::capabilities::text::LanguageModelTextExt;
use ditto_core::contracts::GenerateRequest;
let (handle, mut text_stream) = llm
    .stream_text(GenerateRequest::from(messages))
    .await?
    .into_text_stream();
while let Some(delta) = text_stream.next().await {
    print!("{}", delta?);
}
let final_text = handle.final_text()?.unwrap();
println!("\nfinal={final_text}");
Use LanguageModelObjectExt to request structured output (AI SDK-style generateObject / streamObject).
Defaults (ObjectOptions::default()):
- strategy = Auto:
  - openai → JSON Schema via response_format (native)
  - other providers (incl. openai-compatible) → tool-call enforced JSON (wraps output under {"value": ...})
  - always falls back to extracting JSON from text if needed
- output = Object (top-level object)
use ditto_core::capabilities::object::LanguageModelObjectExt;
use ditto_core::contracts::{GenerateRequest, Message};
use ditto_core::provider_options::JsonSchemaFormat;
use serde_json::json;
let schema = JsonSchemaFormat {
    name: "recipe".to_string(),
    schema: json!({ "type": "object" }),
    strict: None,
};
let out = llm
    .generate_object_json(GenerateRequest::from(vec![Message::user("hi")]), schema)
    .await?;
println!("{}", out.object);
Streaming (partial objects):
use futures_util::StreamExt;
let (handle, mut partial_object_stream) = llm
    .stream_object(GenerateRequest::from(messages), schema)
    .await?
    .into_partial_stream();
while let Some(partial) = partial_object_stream.next().await {
    println!("{:?}", partial?);
}
let final_obj = handle.final_json()?.unwrap();
println!("{final_obj}");
Streaming arrays (AI SDK elementStream):
use ditto_core::capabilities::object::{ObjectOptions, ObjectOutput};
use futures_util::StreamExt;
let mut result = llm
    .stream_object_with(
        GenerateRequest::from(messages),
        schema, // schema for a single element; ditto wraps it as {type:"array", items: ...}
        ObjectOptions {
            output: ObjectOutput::Array,
            ..ObjectOptions::default()
        },
    )
    .await?;
while let Some(element) = result.element_stream.next().await {
    println!("element = {}", element?);
}
If you need an explicit abort handle (instead of relying on drop semantics), wrap the stream:
use ditto_core::contracts::GenerateRequest;
use ditto_core::llm_core::model::LanguageModel;
use ditto_core::llm_core::stream::abortable_stream;
let stream = llm.stream(GenerateRequest::from(messages)).await?;
let abortable = abortable_stream(stream);
abortable.handle.abort();
EmbeddingModelExt provides AI SDK-style aliases:
use ditto_core::capabilities::EmbeddingModelExt;
let vectors = embeddings.embed_many(vec!["hello".to_string(), "world".to_string()]).await?;
let one = embeddings.embed_one("hi".to_string()).await?;
Providers accept a custom reqwest::Client so you can configure timeouts, proxies, and default
headers (e.g. enterprise gateways):
let http = reqwest::Client::builder().build()?;
let llm = ditto_core::providers::OpenAI::new(api_key).with_http_client(http);
When building providers from config, you can also set per-node default headers via
ProviderConfig.http_headers.
Providers apply their standard auth headers by default (OpenAI/OpenAI-compatible: bearer token;
Anthropic: x-api-key; Google: x-goog-api-key).
If you need a non-standard auth header (e.g. Azure / enterprise gateways), use:
auth = { type = "http_header_env", header = "api-key", keys = ["AZURE_OPENAI_API_KEY"] }
If your gateway expects auth in a query param (e.g. ...?api_key=...), use:
auth = { type = "query_param_env", param = "api_key", keys = ["GATEWAY_API_KEY"] }
If you need to fetch a token dynamically (e.g. gcloud auth print-access-token, aws-vault, Vault CLI), use:
auth = { type = "command", command = ["gcloud", "auth", "print-access-token"] }
The command stdout may be a plain token, a JSON string ("sk-..."), or a JSON object with
api_key / token / access_token. Ditto enforces a 15s timeout (configurable via
DITTO_AUTH_COMMAND_TIMEOUT_MS/SECS) and a 64KiB stdout/stderr cap.
If your provider requires additional fixed query params on every request (e.g. Azure OpenAI
api-version), set ProviderConfig.http_query_params:
base_url = "https://{resource}.openai.azure.com/openai/deployments/{deployment}"
http_query_params = { "api-version" = "2024-02-01" }
auth = { type = "http_header_env", header = "api-key", keys = ["AZURE_OPENAI_API_KEY"] }
Requests that support provider_options accept either:
- Legacy (flat): a single JSON object applied to the current provider.
- Bucketed: a JSON object keyed by provider id (optionally with a "*" default bucket).
Bucketed example:
{
  "provider_options": {
    "*": { "parallel_tool_calls": false },
    "openai": { "reasoning_effort": "high" },
    "openai-compatible": { "response_format": { "type": "json_schema", "json_schema": { "name": "answer", "schema": { "type": "object" } } } }
  }
}
Precedence is "*" (base) → provider bucket (override). Provider ids are: openai,
openai-compatible (also accepts openai_compatible as an alias key), anthropic, google,
cohere, bedrock, vertex.
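The bucket precedence can be sketched as a two-step merge. This is a simplified illustration, not Ditto's resolver: option values are plain strings rather than JSON, the merge is a shallow key-level override (whether Ditto merges deeply is not stated here), alias handling for openai_compatible is omitted, and resolve_options is a hypothetical name.

```rust
use std::collections::BTreeMap;

type Options = BTreeMap<String, String>;

/// Resolve bucketed options for one provider: start from the "*" bucket,
/// then let the provider's own bucket override matching keys.
/// Hypothetical helper; shallow merge is an assumption.
fn resolve_options(buckets: &BTreeMap<String, Options>, provider: &str) -> Options {
    let mut merged = buckets.get("*").cloned().unwrap_or_default();
    if let Some(bucket) = buckets.get(provider) {
        // Later inserts win, so provider-specific keys replace base keys.
        merged.extend(bucket.clone());
    }
    merged
}

fn main() {
    let mut buckets: BTreeMap<String, Options> = BTreeMap::new();
    buckets.insert(
        "*".into(),
        Options::from([("parallel_tool_calls".into(), "false".into())]),
    );
    buckets.insert(
        "openai".into(),
        Options::from([("reasoning_effort".into(), "high".into())]),
    );
    // openai gets both keys; anthropic only sees the "*" defaults.
    println!("{:?}", resolve_options(&buckets, "openai"));
    println!("{:?}", resolve_options(&buckets, "anthropic"));
}
```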
If you want to send PDFs via file_id (instead of inlining base64), OpenAI and OpenAI-compatible
providers expose a small upload helper:
let file_id = llm.upload_file("doc.pdf", pdf_bytes).await?;
Enable repo-local git hooks:
git config core.hooksPath githooks
This enforces Conventional Commits and requires each commit to include CHANGELOG.md.
The default structural gate follows the Rust mainline. The goal is to keep the default core, all-features, no-default-features, and provider feature matrix builds continuously buildable and lintable. The corresponding minimal local command set:
cargo fmt --all -- --check
cargo run -p ditto-core --bin ditto-llms-txt -- --check
cargo check --workspace
cargo test --workspace --all-targets
cargo check -p ditto-core --examples
cargo clippy --workspace --all-targets --all-features -- -D warnings
cargo test --workspace --all-targets --all-features
cargo check -p ditto-core --no-default-features
cargo clippy -p ditto-core --no-default-features -- -D warnings
cargo check -p ditto-server --no-default-features
cargo clippy -p ditto-server --no-default-features -- -D warnings
By default, the Node checks only validate packages/*:
pnpm run typecheck
pnpm run build
The optional Admin UI asset is validated separately:
pnpm run typecheck:admin-ui
pnpm run build:admin-ui
Enable the integration feature and set real credentials:
- OpenAI Responses: OPENAI_API_KEY + OPENAI_MODEL
- OpenAI-compatible: OPENAI_COMPAT_BASE_URL + OPENAI_COMPAT_MODEL (+ OPENAI_COMPAT_API_KEY optional)
Then run:
cargo test --all-features