Add host capability probe so resolver actually runs in production#1075
Merged
Conversation
Position 1 PR #1074 shipped the typed primitive (standard_persona(host)). Without a probe, every caller has to construct HostCapability by hand — the resolver is callable but not used. This is the production probe. cognition/host_capability_probe.rs (pure, single file, ~270 lines): - detect_host_capability(gpu_monitor: &dyn GpuMonitor, system_info: &System) -> Result<HostCapability, ProbeError> - Maps GpuMonitor::platform to TargetSilicon and dispatches device-name pattern-matching: * metal → UnifiedMemory + Apple-Silicon tier (M1Uma8Gb, M1Uma16Gb, M2UmaProMax, M3UmaProMax) from CPU brand + total memory bucket * cuda → Gpu + Sm70..Sm120 tier from device-name (RTX 5090 → Sm120, H100 → Sm90, A100 → Sm80, T4/RTX 20xx → Sm75, V100 → Sm70, etc.) * vulkan → Gpu + VulkanAmd * mock → M1Uma16Gb (test fixture) - ProbeError variants: * UnknownGpuDevice{platform, device_name} — pattern-match miss; loud fail per Joel's NO COMPROMISE rule (no silent CpuOnly fallback) * UnsupportedPlatform{platform} — fires when GpuMonitor reports an unrecognized platform string Pattern-ordering is load-bearing in nvidia_sm_tier(): A100 must be checked before A10/A40 because "A10" is a substring of "A100" — the tests cover this regression vector explicitly. Comment in the source calls it out. Tests: 6/6 cognition::host_capability_probe pass: - mock_platform_returns_test_fixture - unsupported_platform_errors_loudly - nvidia_pattern_match_resolves_known_skus (9 device fixtures) - nvidia_unknown_sku_errors_no_silent_fallback - apple_silicon_tier_mapping - export_bindings_probeerror Validation: - cargo test --features metal,accelerate -p continuum-core --lib cognition::host_capability_probe: 6/6 - npx tsx scripts/build-with-loud-failure.ts: TypeScript clean Out of scope (separate followups): - Wiring detect_host_capability() into the actual server boot path so HostCapability becomes a runtime singleton callers can read - Re-detect on hardware-change events (battery, thermal throttle) - Memory-share heuristic (currently total_mem / 2; the right number needs adaptive_throughput integration to coordinate with leases) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Position 1 PR-2 — production HostCapability detection
Built on top of #1074. Without this, every caller of
ModelRequirement::standard_persona(host)constructsHostCapabilityby hand — the resolver is callable but not used in production.Scope
cognition::host_capability_probe::detect_host_capability(gpu_monitor, system_info)returnsHostCapabilityor typedProbeError:UnifiedMemory+ Apple-Silicon tier from CPU brand + memory bucketGpu+ Sm70..Sm120 tier from device-name (RTX 5090 → Sm120, H100 → Sm90, A100 → Sm80, T4/RTX 20xx → Sm75, V100 → Sm70, B100 → Sm100)Gpu+VulkanAmdM1Uma16Gb(tests)ProbeError::UnsupportedPlatformProbeError::UnknownGpuDevice(no silent CpuOnly fallback per Add sensory-bar requirement profile to model resolver (Position 1, PR #1072) #1074's NO COMPROMISE bar)Pattern-ordering note: A100 must precede A10/A40 in the match chain —
"A10"substring of"A100". Testnvidia_pattern_match_resolves_known_skuspins this regression.Validation
cargo test --features metal,accelerate -p continuum-core --lib cognition::host_capability_probe: 6/6 passnpx tsx scripts/build-with-loud-failure.ts: TypeScript clean--no-verify)Out of scope (followup PRs)
detect_host_capability()into server boot soHostCapabilitybecomes a runtime singletontotal / 2; right number needs adaptive_throughput lease coordination)Stack ordering
Logically depends on #1074 (uses
HostCapability/HwCapabilityTiertypes).HostCapabilityitself shipped earlier in #1066, so this PR compiles against canary today. When #1074 lands,standard_persona(detect_host_capability(...).unwrap())becomes the natural production call.🤖 Generated with Claude Code