Skip to content

Add host capability probe so resolver actually runs in production#1075

Merged
joelteply merged 1 commit into
canaryfrom
feat/host-capability-probe
May 11, 2026
Merged

Add host capability probe so resolver actually runs in production#1075
joelteply merged 1 commit into
canaryfrom
feat/host-capability-probe

Conversation

@joelteply
Copy link
Copy Markdown
Contributor

Position 1 PR-2 — production HostCapability detection

Built on top of #1074. Without this, every caller of ModelRequirement::standard_persona(host) constructs HostCapability by hand — the resolver is callable but not used in production.

Scope

cognition::host_capability_probe::detect_host_capability(gpu_monitor, system_info) returns HostCapability or typed ProbeError:

  • metalUnifiedMemory + Apple-Silicon tier from CPU brand + memory bucket
  • cudaGpu + Sm70..Sm120 tier from device-name (RTX 5090 → Sm120, H100 → Sm90, A100 → Sm80, T4/RTX 20xx → Sm75, V100 → Sm70, B100 → Sm100)
  • vulkanGpu + VulkanAmd
  • mockM1Uma16Gb (tests)
  • unknown platform → ProbeError::UnsupportedPlatform
  • unknown CUDA SKU → ProbeError::UnknownGpuDevice (no silent CpuOnly fallback per Add sensory-bar requirement profile to model resolver (Position 1, PR #1072) #1074's NO COMPROMISE bar)

Pattern-ordering note: A100 must precede A10/A40 in the match chain — "A10" substring of "A100". Test nvidia_pattern_match_resolves_known_skus pins this regression.

Validation

  • cargo test --features metal,accelerate -p continuum-core --lib cognition::host_capability_probe: 6/6 pass
  • npx tsx scripts/build-with-loud-failure.ts: TypeScript clean
  • normal precommit + prepush passed (no --no-verify)

Out of scope (followup PRs)

  • Wire detect_host_capability() into server boot so HostCapability becomes a runtime singleton
  • Re-detect on hardware-change events (battery, thermal throttle)
  • Memory-share heuristic tuning (currently total / 2; right number needs adaptive_throughput lease coordination)

Stack ordering

Logically depends on #1074 (uses HostCapability/HwCapabilityTier types). HostCapability itself shipped earlier in #1066, so this PR compiles against canary today. When #1074 lands, standard_persona(detect_host_capability(...).unwrap()) becomes the natural production call.

🤖 Generated with Claude Code

Position 1 PR #1074 shipped the typed primitive (standard_persona(host)).
Without a probe, every caller has to construct HostCapability by hand —
the resolver is callable but not used. This is the production probe.

cognition/host_capability_probe.rs (pure, single file, ~270 lines):
- detect_host_capability(gpu_monitor: &dyn GpuMonitor, system_info: &System)
  -> Result<HostCapability, ProbeError>
- Maps GpuMonitor::platform to TargetSilicon and dispatches device-name
  pattern-matching:
  * metal → UnifiedMemory + Apple-Silicon tier (M1Uma8Gb, M1Uma16Gb,
    M2UmaProMax, M3UmaProMax) from CPU brand + total memory bucket
  * cuda → Gpu + Sm70..Sm120 tier from device-name (RTX 5090 → Sm120,
    H100 → Sm90, A100 → Sm80, T4/RTX 20xx → Sm75, V100 → Sm70, etc.)
  * vulkan → Gpu + VulkanAmd
  * mock → M1Uma16Gb (test fixture)
- ProbeError variants:
  * UnknownGpuDevice{platform, device_name} — pattern-match miss; loud
    fail per Joel's NO COMPROMISE rule (no silent CpuOnly fallback)
  * UnsupportedPlatform{platform} — fires when GpuMonitor reports an
    unrecognized platform string

Pattern-ordering is load-bearing in nvidia_sm_tier(): A100 must be
checked before A10/A40 because "A10" is a substring of "A100" — the
tests cover this regression vector explicitly. Comment in the source
calls it out.

Tests: 6/6 cognition::host_capability_probe pass:
- mock_platform_returns_test_fixture
- unsupported_platform_errors_loudly
- nvidia_pattern_match_resolves_known_skus (9 device fixtures)
- nvidia_unknown_sku_errors_no_silent_fallback
- apple_silicon_tier_mapping
- export_bindings_probeerror

Validation:
- cargo test --features metal,accelerate -p continuum-core --lib
  cognition::host_capability_probe: 6/6
- npx tsx scripts/build-with-loud-failure.ts: TypeScript clean

Out of scope (separate followups):
- Wiring detect_host_capability() into the actual server boot path so
  HostCapability becomes a runtime singleton callers can read
- Re-detect on hardware-change events (battery, thermal throttle)
- Memory-share heuristic (currently total_mem / 2; the right number
  needs adaptive_throughput integration to coordinate with leases)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@joelteply joelteply marked this pull request as ready for review May 11, 2026 17:34
@joelteply joelteply merged commit e93024d into canary May 11, 2026
2 checks passed
@joelteply joelteply deleted the feat/host-capability-probe branch May 11, 2026 21:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant