Skip to content

Cost summary silently undercounts unpriced models; pricing should be a provider hook, not just a static table #265

Description

@brrusino

Summary

conductor run reported a workflow total of $0.45 when real usage was ~$2.0 (4x undercount). Two independent issues, both provider-agnostic.

1. total_cost_usd silently drops unpriced agents (provider-independent)

engine/usage.py:

costs = [a.cost_usd for a in self.agents if a.cost_usd is not None]
return sum(costs) if costs else None

When a model isn't priced, calculate_cost returns None (correct), but the summary then sums only the priced subset and presents it as the total, with no signal that agents were excluded. In my run, 3 of 5 agents (the two heaviest) were unpriced -> 75% of spend silently vanished. A partial that looks complete is worse than "unavailable".

Fix: when any agent is unpriced, surface it -- return the partial plus an unpriced_agents/unpriced_models flag, and have the CLI/dashboard render e.g. ~$X (N agents unpriced) instead of a clean-looking wrong number.

2. Pricing is a single static table that goes stale; it should be a provider-supplied hook with the table as fallback

DEFAULT_PRICING (engine/pricing.py) is hardcoded and lags new releases (claude-opus-4.8, gpt-5.5 absent), and get_pricing's matcher correctly refuses cross-family inheritance (#137) -> None.

Each provider knows its own pricing far better than a shared static dict, and they differ:

  • copilot already fetches list_models(), whose entries carry a billing object (per-model credit cost / premium multiplier). It is used today only for get_max_prompt_tokens + reasoning-effort validation -- the authoritative, always-current pricing is already in hand and discarded for costing.
  • claude / claude-agent-sdk: the Anthropic API's models.list() exposes no pricing, so these legitimately fall back to a static table.

So a copilot-only "read billing" fix would be wrong -- it would leave claude/openai users on the same stale table.

Fix: add a provider hook mirroring the existing AgentProvider.get_max_prompt_tokens precedent (providers/base.py:273) -- e.g. async def get_model_pricing(model) -> ModelPricing | None, returning None when unavailable (never blocking). Resolution order in get_pricing becomes: workflow cost.pricing override -> provider hook -> DEFAULT_PRICING -> None.

  • copilot impl: derive from list_models().billing (credits -> USD; observed 100 credits = $1).
  • claude/openai impl: return None (fall back to the static table) until/unless they expose a source.

Workaround today

Workflow-level cost.pricing (WorkflowDef.cost) lets users hand-supply rates -- but it is per-workflow boilerplate re-derived on every model release, and because of (1) a user who does not know to add it gets a silently-wrong total with no warning.

Repro

Any workflow pinning a model newer than DEFAULT_PRICING (e.g. claude-opus-4.8); the end-of-run summary excludes it silently.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions