Sub-workflow agent usage/cost is omitted from the parent workflow's Token Usage Summary (+ missing pricing for claude-opus-4.8 / gpt-5.5)

## Summary

The end-of-run **Token Usage Summary / Cost Breakdown** under-reports badly for any workflow that uses `type: workflow` sub-workflow steps, because **a child sub-workflow's `UsageTracker` is never merged back into the parent**. The printed total only reflects agents the *root* engine recorded directly; every sub-workflow agent is invisible.

On a real run with 3 sub-workflow agents, the summary printed **116,529 input tokens / $0.43** when the actual figure (summed from the `agent_completed` events) was **462,844 input tokens** — the `analyze` (119k), `fix_code` (135k), and `checker` (92k) agents were all missing from the aggregate.

A second, independent bug compounds it: **two current models are absent from `DEFAULT_PRICING`**, so even where their tokens *are* counted they're costed at `$0`.

Version: `conductor-cli v0.1.19`.

## Bug 1 — sub-workflow usage is never rolled into the parent

`type: workflow` steps run in a child `WorkflowEngine` with its own `UsageTracker`. The child is run via `_run_child_engine`, which both sub-workflow entry points funnel through:

- `_execute_subworkflow` (≈ line 1367) — `return await self._run_child_engine(child_engine, sub_inputs, agent)`. Returns **only** the output; `child_engine.usage_tracker` is dropped.
- `_execute_subworkflow_with_inputs` (≈ line 1480) — captures `usage = child_engine.usage_tracker.get_summary()` but the for-each caller (≈ line 5073) uses it **only** for a `for_each_item_completed` event payload, never merging it into `self.usage_tracker`.

The final summary is built solely from the root engine's tracker:

```python
# get_execution_summary(), ≈ line 5577
usage = self.usage_tracker.get_summary()
summary["usage"] = { "total_input_tokens": usage.total_input_tokens, ... }
```

Since no code path ever extends the parent tracker with child records, every sub-workflow agent is omitted from the printed total. (The live `--web` dashboard looks roughly right because each child engine shares the parent's `event_emitter` and emits its own `agent_completed` events — only the *aggregate* is wrong, which makes this easy to misdiagnose.)

### Suggested fix

`_run_child_engine` is the single chokepoint both paths use, and it already receives the child engine. Merging there is enough and is transitive (a grandchild merges into its child, then that child — now carrying the grandchild's records — into the parent), with no double counting because the parent never records sub-workflow agents itself:

```python
async def _run_child_engine(self, child_engine, sub_inputs, agent):
    try:
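        # `_orig_run_child_engine` stands for the method's existing body
        # (or the saved original, when this is applied as a wrapper).
        return await self._orig_run_child_engine(child_engine, sub_inputs, agent)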
    finally:
        # Roll the sub-workflow's per-agent usage into the parent so the
        # final summary + cost breakdown include sub-workflow agents.
        self.usage_tracker._agents.extend(
            child_engine.usage_tracker.get_summary().agents
        )
```

(Using `finally` also counts the cost of an expensive child that ultimately fails.)
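
To sanity-check the transitivity and no-double-counting claim, here is a minimal toy model. `ToyTracker` and `ToyEngine` are hypothetical stand-ins, not conductor's classes, and the agent names and token counts are made up; only the merge-on-the-way-out shape mirrors the suggestion above.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class ToyTracker:
    """Hypothetical stand-in for UsageTracker: a flat list of (agent, tokens)."""
    records: list = field(default_factory=list)

    def total(self) -> int:
        return sum(tokens for _, tokens in self.records)


@dataclass
class ToyEngine:
    """Hypothetical stand-in for WorkflowEngine."""
    agents: dict                      # agent name -> tokens recorded directly
    children: list = field(default_factory=list)
    tracker: ToyTracker = field(default_factory=ToyTracker)

    async def run(self) -> None:
        # An engine records only the agents it runs itself.
        for name, tokens in self.agents.items():
            self.tracker.records.append((name, tokens))
        for child in self.children:
            await self._run_child_engine(child)

    async def _run_child_engine(self, child: "ToyEngine") -> None:
        try:
            await child.run()
        finally:
            # Merge on the way out, even if the child raised.
            self.tracker.records.extend(child.tracker.records)


# Three levels of nesting: root -> child -> grandchild.
grandchild = ToyEngine(agents={"agent_d": 40})
child = ToyEngine(agents={"agent_b": 20, "agent_c": 30}, children=[grandchild])
root = ToyEngine(agents={"agent_a": 10}, children=[child])
asyncio.run(root.run())

names = [name for name, _ in root.tracker.records]
print(names, root.tracker.total())
```

Each agent appears exactly once in the root tracker, because an engine only ever records its own agents plus whatever its direct children hand back.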

## Bug 2 — `DEFAULT_PRICING` is missing current models

`DEFAULT_PRICING` in `conductor/engine/pricing.py` has no entry for `claude-opus-4.8` or `gpt-5.5`. `get_pricing` returns `None` for both (the `-`-delimited versioned-suffix fallback doesn't match either), so `calculate_cost` returns `None` and those agents are costed at `$0`. The fuzzy fallback can also *mis*-price: had the name been spelled `claude-opus-4-...` rather than `claude-opus-4.8`, it would have matched the older `claude-opus-4` entry ($15/$75 per Mtok), roughly 3× the real Opus 4.x rate. Silent fuzzy matching across model families is a sharp edge in its own right.
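
The two failure modes can be illustrated with a small sketch. `toy_get_pricing` is a hypothetical reimplementation of a `-`-delimited suffix fallback, written only from the behaviour described above; it is not the code in `pricing.py`, and the `-example` suffix is invented for the demonstration.

```python
from typing import Optional

# Only the entry quoted in this issue; values are (input, output) USD per Mtok.
TOY_PRICING = {"claude-opus-4": (15.00, 75.00)}


def toy_get_pricing(model: str) -> Optional[tuple]:
    """Hypothetical lookup: exact match, then drop trailing '-' segments."""
    candidate = model
    while candidate:
        if candidate in TOY_PRICING:
            return TOY_PRICING[candidate]
        if "-" not in candidate:
            return None
        candidate = candidate.rsplit("-", 1)[0]
    return None


# A '.'-versioned name never reduces to a known key, so it is unpriced.
missed = toy_get_pricing("claude-opus-4.8")

# A '-'-suffixed name silently inherits the older entry's rates.
inherited = toy_get_pricing("claude-opus-4-example")

print(missed, inherited)
```

Under this assumed fallback, the first lookup yields no pricing at all and the second quietly reuses the `claude-opus-4` rates, which is why an explicit entry per model is the safer option.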

### Suggested fix

Add current entries (and consider sourcing this table from somewhere that can be updated independently, so it doesn't go stale):

```python
"claude-opus-4.8": ModelPricing(input_per_mtok=5.00, output_per_mtok=25.00,
                                cache_read_per_mtok=0.50, cache_write_per_mtok=6.25),
"gpt-5.5": ModelPricing(input_per_mtok=2.00, output_per_mtok=8.00),
```

## Repro

1. Run any workflow with a `type: workflow` step whose sub-workflow invokes at least one agent.
2. Compare the printed "Token Usage Summary" totals against the sum of `total_input_tokens` / `cost_usd` across all `agent_completed` events (e.g. from `GET /api/state` on the `--web` dashboard, or the `--web` logs export).
3. The summary totals exclude every sub-workflow agent.
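
For step 2, a small helper along these lines can do the summing. The event shape (a `type` field plus a `data` dict carrying `total_input_tokens` and `cost_usd`) is an assumption, so adjust the keys to whatever your export actually contains; the sample events are made up.

```python
def sum_agent_usage(events):
    """Sum input tokens and cost over agent_completed events (assumed shape)."""
    tokens = 0
    cost = 0.0
    for event in events:
        if event.get("type") != "agent_completed":
            continue
        data = event.get("data", {})
        tokens += data.get("total_input_tokens") or 0
        # Unpriced models may report no cost; treat that as zero here.
        cost += data.get("cost_usd") or 0.0
    return tokens, cost


# Illustrative events only, not taken from a real run.
sample_events = [
    {"type": "agent_completed", "data": {"total_input_tokens": 1000, "cost_usd": 0.25}},
    {"type": "agent_started", "data": {}},
    {"type": "agent_completed", "data": {"total_input_tokens": 2000, "cost_usd": None}},
    {"type": "agent_completed", "data": {"total_input_tokens": 500, "cost_usd": 0.5}},
]

event_tokens, event_cost = sum_agent_usage(sample_events)
print(event_tokens, event_cost)
```

If the event-derived token total is larger than the printed summary total, the difference should match the sub-workflow agents' usage.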

## Workaround

For anyone hitting this before a fix lands: both can be fixed in-process without editing site-packages, via a `sitecustomize.py` on `PYTHONPATH` that (a) adds the missing `DEFAULT_PRICING` entries in place and (b) wraps `WorkflowEngine._run_child_engine` with the `finally`-merge above. Happy to send a PR if a maintainer confirms the chokepoint-merge approach is the preferred shape.
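
A minimal sketch of part (b), for reference. `patch_run_child_engine` is a hypothetical helper name; it takes the engine class as an argument because the import path of `WorkflowEngine` is not given here, and it relies on the private `_agents` list exactly as the suggested fix does.

```python
import functools


def patch_run_child_engine(engine_cls):
    """Wrap engine_cls._run_child_engine so child usage merges into the parent."""
    original = engine_cls._run_child_engine
    if getattr(original, "_usage_merge_patched", False):
        return  # already wrapped; a second wrap would merge twice

    @functools.wraps(original)
    async def wrapper(self, child_engine, *args, **kwargs):
        try:
            return await original(self, child_engine, *args, **kwargs)
        finally:
            # Private attribute access: may break if the tracker changes.
            self.usage_tracker._agents.extend(
                child_engine.usage_tracker.get_summary().agents
            )

    wrapper._usage_merge_patched = True
    engine_cls._run_child_engine = wrapper
```

In `sitecustomize.py` this would be called once with the real `WorkflowEngine` class after importing it. The guard flag keeps repeated calls from stacking wrappers and double counting.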

