Summary
The end-of-run Token Usage Summary / Cost Breakdown under-reports badly for any workflow that uses type: workflow sub-workflow steps, because a child sub-workflow's UsageTracker is never merged back into the parent. The printed total only reflects agents the root engine recorded directly; every sub-workflow agent is invisible.
On a real run with 3 sub-workflow agents, the summary printed 116,529 input tokens / $0.43 when the actual figure (summed from the agent_completed events) was 462,844 input tokens — the analyze (119k), fix_code (135k), and checker (92k) agents were all missing from the aggregate.
A second, independent bug compounds it: two current models are absent from DEFAULT_PRICING, so even where their tokens are counted they're costed at $0.
Version: conductor-cli v0.1.19.
Bug 1 — sub-workflow usage is never rolled into the parent
type: workflow steps run in a child WorkflowEngine with its own UsageTracker. The child is run via _run_child_engine, which both sub-workflow entry points funnel through:
- _execute_subworkflow (≈ line 1367): return await self._run_child_engine(child_engine, sub_inputs, agent). Returns only the output; child_engine.usage_tracker is dropped.
- _execute_subworkflow_with_inputs (≈ line 1480): captures usage = child_engine.usage_tracker.get_summary(), but the for-each caller (≈ line 5073) uses it only for a for_each_item_completed event payload and never merges it into self.usage_tracker.
The final summary is built solely from the root engine's tracker:
# get_execution_summary(), ≈ line 5577
usage = self.usage_tracker.get_summary()
summary["usage"] = { "total_input_tokens": usage.total_input_tokens, ... }
Since no code path ever extends the parent tracker with child records, every sub-workflow agent is omitted from the printed total. (The live --web dashboard looks roughly right because each child engine shares the parent's event_emitter and emits its own agent_completed events — only the aggregate is wrong, which makes this easy to misdiagnose.)
Suggested fix
_run_child_engine is the single chokepoint both paths use, and it already receives the child engine. Merging there is enough and is transitive (a grandchild merges into its child, then that child — now carrying the grandchild's records — into the parent), with no double counting because the parent never records sub-workflow agents itself:
async def _run_child_engine(self, child_engine, sub_inputs, agent):
    try:
        return await self._orig_run_child_engine(child_engine, sub_inputs, agent)
    finally:
        # Roll the sub-workflow's per-agent usage into the parent so the
        # final summary + cost breakdown include sub-workflow agents.
        self.usage_tracker._agents.extend(
            child_engine.usage_tracker.get_summary().agents
        )
(Using finally means an expensive child that ultimately fails is still included in the cost total.)
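To make the transitivity and failure-path claims concrete, here is a minimal toy model of the chokepoint merge. ToyEngine, ToyTracker, and AgentUsage are hypothetical stand-ins, not conductor's classes, and the token counts are made up; the sketch only assumes that each engine records its own agents and merges a child's records in a finally block.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class AgentUsage:
    name: str
    input_tokens: int


@dataclass
class ToyTracker:
    # Hypothetical stand-in for UsageTracker: a flat list of per-agent records.
    agents: list = field(default_factory=list)

    def total_input_tokens(self) -> int:
        return sum(a.input_tokens for a in self.agents)


class ToyEngine:
    """Hypothetical stand-in for WorkflowEngine with optional child engines."""

    def __init__(self, own_agents, children=(), fail=False):
        self.usage_tracker = ToyTracker()
        self._own_agents = own_agents
        self._children = children
        self._fail = fail

    async def run(self):
        # An engine records only the agents it runs directly.
        self.usage_tracker.agents.extend(self._own_agents)
        for child in self._children:
            await self._run_child_engine(child)
        if self._fail:
            raise RuntimeError("sub-workflow failed")

    async def _run_child_engine(self, child):
        try:
            await child.run()
        finally:
            # Merge at the chokepoint, even when the child raised.
            self.usage_tracker.agents.extend(child.usage_tracker.agents)


def demo() -> ToyTracker:
    grandchild = ToyEngine([AgentUsage("checker", 40)], fail=True)
    child = ToyEngine([AgentUsage("fix_code", 20)], children=[grandchild])
    root = ToyEngine([AgentUsage("planner", 10)], children=[child])
    try:
        asyncio.run(root.run())
    except RuntimeError:
        pass  # the failing grandchild's usage has already been merged upward
    return root.usage_tracker
```

In this toy, the root ends up with one record per agent across all three levels, each counted once, even though the grandchild raised.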
Bug 2 — DEFAULT_PRICING is missing current models
DEFAULT_PRICING in conductor/engine/pricing.py has no entry for claude-opus-4.8 or gpt-5.5. get_pricing returns None for both (the hyphen-delimited versioned-suffix fallback doesn't match either), so calculate_cost returns None and those agents are costed at $0. Worth noting the fuzzy fallback can also mis-price: had the name been written claude-opus-4-... instead of claude-opus-4.8, it would have matched the older claude-opus-4 entry ($15/$75 per Mtok), i.e. ~3× the real Opus 4.x rate. Silent fuzzy matching across model families is a sharp edge in its own right.
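The two failure modes can be illustrated with a rough sketch of a suffix-stripping lookup. This is an assumption about how such a fallback might behave, not conductor's actual get_pricing; lookup_pricing and TABLE are hypothetical, and only the claude-opus-4 rate is taken from this report.

```python
def lookup_pricing(model, table):
    """Exact match first, then strip hyphen-delimited suffixes from the right."""
    if model in table:
        return table[model]
    name = model
    while "-" in name:
        name = name.rsplit("-", 1)[0]
        if name in table:
            return table[name]
    return None


# Illustrative table: (input, output) in USD per Mtok.
TABLE = {"claude-opus-4": (15.00, 75.00)}

# Dotted versions never reduce to a known key, so they come back unpriced.
unpriced = [lookup_pricing(m, TABLE) for m in ("claude-opus-4.8", "gpt-5.5")]

# A hyphenated spelling (hypothetical name) silently inherits the older rate.
mispriced = lookup_pricing("claude-opus-4-8", TABLE)
```

Under these assumptions the dotted names resolve to None (costed at $0), while the hyphenated spelling resolves to the $15/$75 entry.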
Suggested fix
Add current entries (and consider sourcing this table so it doesn't go stale):
"claude-opus-4.8": ModelPricing(input_per_mtok=5.00, output_per_mtok=25.00,
                                cache_read_per_mtok=0.50, cache_write_per_mtok=6.25),
"gpt-5.5": ModelPricing(input_per_mtok=2.00, output_per_mtok=8.00),
Repro
- Run any workflow with a type: workflow step whose sub-workflow invokes at least one agent.
- Compare the printed "Token Usage Summary" totals against the sum of total_input_tokens / cost_usd across all agent_completed events (e.g. from GET /api/state on the --web dashboard, or the --web logs export).
- The summary totals exclude every sub-workflow agent.
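The comparison in the second step can be scripted along these lines. The event layout here (a "type" key plus a "data" dict) is an assumption for illustration and may not match the real export; only the names agent_completed, total_input_tokens, and cost_usd come from this report, and the sample numbers are made up.

```python
def sum_agent_usage(events):
    """Sum input tokens and cost over agent_completed events only."""
    total_tokens = 0
    total_cost = 0.0
    for event in events:
        if event.get("type") != "agent_completed":
            continue
        data = event.get("data", {})
        # Treat a missing or None value (e.g. an unpriced model) as zero.
        total_tokens += data.get("total_input_tokens") or 0
        total_cost += data.get("cost_usd") or 0.0
    return total_tokens, total_cost


# Hypothetical sample events.
sample_events = [
    {"type": "agent_completed", "data": {"total_input_tokens": 100, "cost_usd": 0.5}},
    {"type": "agent_completed", "data": {"total_input_tokens": 200, "cost_usd": None}},
    {"type": "workflow_started", "data": {}},
]
tokens, cost = sum_agent_usage(sample_events)
```

If the totals computed this way exceed the printed summary, the difference should correspond to the sub-workflow agents.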
Workaround
For anyone hitting this before a fix lands: both can be fixed in-process without editing site-packages, via a sitecustomize.py on PYTHONPATH that (a) adds the missing DEFAULT_PRICING entries in place and (b) wraps WorkflowEngine._run_child_engine with the finally-merge above. Happy to send a PR if a maintainer confirms the chokepoint-merge approach is the preferred shape.
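A rough sketch of the two helpers such a sitecustomize.py could contain follows. install_usage_merge and add_missing_pricing are hypothetical names; the sketch assumes _run_child_engine takes (child_engine, sub_inputs, agent) and that the tracker exposes _agents and get_summary().agents as in the snippets above. The import path for WorkflowEngine is not given in this report, so it is left to the caller.

```python
import functools


def install_usage_merge(engine_cls):
    """Wrap engine_cls._run_child_engine so child usage is merged into the parent."""
    original = engine_cls._run_child_engine

    @functools.wraps(original)
    async def patched(self, child_engine, sub_inputs, agent):
        try:
            return await original(self, child_engine, sub_inputs, agent)
        finally:
            # Same finally-merge as the suggested fix; runs on failure too.
            self.usage_tracker._agents.extend(
                child_engine.usage_tracker.get_summary().agents
            )

    engine_cls._run_child_engine = patched


def add_missing_pricing(table, entries):
    """Add entries in place without overwriting keys that already exist."""
    for model, pricing in entries.items():
        table.setdefault(model, pricing)
```

In sitecustomize.py one would call install_usage_merge(WorkflowEngine) once and add_missing_pricing(DEFAULT_PRICING, {...}) with the ModelPricing entries from Bug 2. Using setdefault means the workaround steps aside once upstream ships its own entries.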