agent: document multicast device capacity and add capacity eval #654

Open
armcconnell wants to merge 1 commit into main from fix/agent-multicast-capacity

Resolves: #651

Summary of Changes

  • Add a "Device multicast capacity" section to the agent's `SQL_CONTEXT.md` so it can answer questions like "which DZDs have capacity for new subscribers". Previously the agent had no definition of device capacity and would reason over the stale on-chain `multicast_subscribers_count` / `multicast_publishers_count` columns on `dz_devices_current`, producing nonsensical answers.
  • The new guidance: marks the on-chain device count columns as unreliable; defines current subscribers/publishers as activated multicast users on the device with a non-empty `subscribers`/`publishers` array (from `dz_users_current`); defines available capacity as `max_multicast_subscribers - current`, only meaningful when `max_multicast_subscribers > 0` (0 means unknown, not full); and includes a canonical "which devices have capacity" query.
  • Add the `MulticastDeviceCapacity` eval. It seeds devices whose stale on-chain counts deliberately disagree with their attached users (a: on-chain 0 vs live 2; c: on-chain 8 vs live 0), so the eval fails if the agent reads the device count columns instead of counting live users.
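
For readers who want the capacity rule in executable form, here is a minimal Python sketch of the definition above. It is not the query added to `SQL_CONTEXT.md`. The dictionary keys (`device`, `status`, `subscribers`), the device code `dzd-x`, and the group name are hypothetical stand-ins for rows from `dz_users_current`.

```python
from typing import Optional


def live_subscriber_count(users: list[dict], device: str) -> int:
    """Count activated users on `device` whose subscribers array is non-empty."""
    return sum(
        1
        for user in users
        if user["device"] == device
        and user["status"] == "activated"
        and len(user["subscribers"]) > 0
    )


def available_capacity(max_subscribers: int, current: int) -> Optional[int]:
    """Return max - current, or None when the maximum is unknown (0)."""
    if max_subscribers <= 0:
        return None  # 0 means "unknown", not "full"
    return max_subscribers - current


# Hypothetical rows; field names and values are made up for illustration.
users = [
    {"device": "dzd-x", "status": "activated", "subscribers": ["group-1"]},
    {"device": "dzd-x", "status": "activated", "subscribers": []},  # publisher-only or idle
    {"device": "dzd-x", "status": "pending", "subscribers": ["group-1"]},  # not activated
]

current = live_subscriber_count(users, "dzd-x")
print(current, available_capacity(5, current), available_capacity(0, current))
```

Returning `None` for a zero maximum keeps "unknown" distinct from "full", so a caller cannot mistake an unconfigured device for one with no remaining capacity.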

This is the agent-facing half of the multicast count problem. The data/display half is #650 (PR #653).

Diff Breakdown

| Category | Files | Lines (+/-) | Net |
|---|---|---|---|
| Docs (prompt) | 1 | +~35 / -1 | +~34 |
| Tests (eval) | 1 | +~185 / -0 | +~185 |

Prompt guidance plus a new eval; no production code paths change.

Testing Verification

  • MulticastDeviceCapacity eval passes against the live agent (claude-haiku-4-5): the agent generated a query counting live subscribers via countIf(JSONLength(subscribers) > 0) and computing max - current, reported nyc-dzd-c (10 available) and nyc-dzd-a (3 available), correctly excluded the full nyc-dzd-b, and used live counts (a=2, c=0) rather than the stale on-chain values (a=0, c=8). The eval's LLM judge passed all three expectations.
  • The eval's ground-truth ClickHouse query is independently asserted (a=3, b=0, c=10 available) so the scenario is verified even in short mode.
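
To make the failure mode concrete, the sketch below reruns the arithmetic for the two seeded devices whose counts disagree. The maxima are not stated in this description; they are derived here as available plus live (3 + 2 = 5 for a, 10 + 0 = 10 for c), so treat them as an assumption. The dictionary layout is illustrative only and does not mirror the eval's actual seed code.

```python
# Values taken from the description above; "max" is derived, not quoted.
devices = {
    "nyc-dzd-a": {"max": 5, "on_chain": 0, "live": 2},
    "nyc-dzd-c": {"max": 10, "on_chain": 8, "live": 0},
}

# Correct answer: subtract the live subscriber count.
live_available = {name: d["max"] - d["live"] for name, d in devices.items()}

# Wrong answer: subtract the stale on-chain device column.
stale_available = {name: d["max"] - d["on_chain"] for name, d in devices.items()}

for name in devices:
    print(name, "live:", live_available[name], "stale:", stale_available[name])
```

Because the two computations disagree for both devices, an agent that reads the on-chain columns cannot land on the expected numbers by accident.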

The agent had no guidance on device-level multicast capacity and would reason
over the stale on-chain multicast_subscribers_count / multicast_publishers_count
columns on dz_devices_current, producing nonsensical answers to questions like
"which DZDs have capacity for new subscribers" (#651).

Add a "Device multicast capacity" section to SQL_CONTEXT.md that:
- marks the on-chain device count columns as stale and not to be used for counts
- defines current subscribers/publishers as activated multicast users on the
  device with a non-empty subscribers/publishers array (from dz_users_current)
- defines available capacity as max_multicast_subscribers - current, only when
  max_multicast_subscribers > 0 (0 means unknown, not full)
- includes a canonical "which devices have capacity" query

Add the MulticastDeviceCapacity eval, which seeds devices whose stale on-chain
counts deliberately disagree with their attached users, so the eval fails if the
agent reads the device count columns instead of counting live users.

Development

Successfully merging this pull request may close these issues.

agent: nonsensical answer when asked which DZDs have capacity for new subscribers