T041 + T042: LLM selector, StubLLMProvider, and generate-schemas by likith1908 · Pull Request #40 · auropro-hyd/IRIS

likith1908 · 2026-06-17T07:01:46Z

Summary

Implements T041 (LLM provider selector with fallback) and T042 (StubLLMProvider for tests) as a combined PR on top of T040. Also ships the make generate-schemas target and the CLI improvement that surfaced during review.

iris_engine/llm/selector.py: select_llm_provider() returns the configured provider, optionally wrapped in _FallbackLLMProvider. Fallback fires only on LLMUnavailable; LLMRateLimited surfaces to the caller since rate limiting on the primary does not mean the fallback can absorb the load.
iris_engine/llm/in_memory.py: StubLLMProvider returns canned LLMResponse values, supports a structured_instances registry for schema-based responses, raises LLMSchemaViolation when a schema is requested but no instance is registered, and accepts raise_on_complete for error injection. Full OTEL instrumentation via instrument_complete.
iris_engine/llm/tracing.py: shared instrument_complete() async context manager and log_complete_success() structured log emitter. Mirrors ocr/tracing.py pattern exactly.
iris_engine/llm/__init__.py: public re-exports for the llm subsystem.
iris_config/schema/adapters.py: llm_fallback: LlmAdapter | None = None added to AdaptersSchema. Module docstring now documents the iris config schema regeneration step.
docs/schemas/product.schema.json: regenerated to include llm_fallback.
Makefile: make generate-schemas target regenerates all three docs/schemas/*.json files from the Pydantic models in one command.
iris config schema: only writes the file when content actually changed; output now prefixed with [model] and distinguishes written vs up-to-date.

Task reference

Task IDs: T041, T042
Workstream: 004-llm-adapter-set

Acceptance criteria

T041

select_llm_provider() returns primary when no fallback configured
Primary LLMUnavailable falls back to secondary
Both fail surfaces primary error
Unknown adapter or fallback raises KeyError

T042

Returns LLMResponse with correct adapter_id, usage math, and text
Schema provided + registered instance returns structured field populated
Schema provided + no registered instance raises LLMSchemaViolation
No schema returns structured = None
raise_on_complete injects typed errors for test scenarios

PR checklist

Every acceptance criterion in the task's tasks.md entry is satisfied.
docs-ci workflow passes (markdown lint + tasks structural check).
No em-dashes introduced in new prose.
All internal links resolve.
No secrets, credentials, or personal names introduced.

Notes for the reviewer

OTEL span attribute tests deferred to T048: The unit test file previously set a module-level set_tracer_provider() which overwrote the OCR contract conftest's provider, causing 5 false failures in test_c011_otel_span_carries_tenant_id. Span attribute verification belongs in the contract suite (T048), consistent with how C-OCR-011 is handled. The tracing code paths (success, LLMError, generic exception) are all covered structurally through the stub's complete() call - 100% coverage on tracing.py.

JSON schema regeneration had no enforced workflow: When AdaptersSchema gained llm_fallback, docs/schemas/product.schema.json was silently out of date - caught only by manual inspection. Fixed in this PR: the adapters.py docstring documents the command, make generate-schemas runs all three regenerations in one step, and the CLI now only writes and reports when content actually changed.

_FallbackLLMProvider exposes primary .id: The fallback wrapper's .id is the primary adapter's ID so callers and telemetry always see the configured adapter, not the fallback.

Rebase note

This PR is stacked on #39 (T040). Once #39 merges, rebase onto main before merging this one.

…041+T042) - iris_engine/llm/selector.py: select_llm_provider() with _FallbackLLMProvider that fires only on LLMUnavailable, re-raising the primary error when both fail - iris_engine/llm/in_memory.py: StubLLMProvider with structured-output registry, error injection, and full OTEL span instrumentation - iris_engine/llm/tracing.py: instrument_complete() async context manager that sets llm.* span attributes, and log_complete_success() structured log emitter - iris_engine/llm/__init__.py: public re-exports for the llm subsystem - iris_config/schema/adapters.py: add optional llm_fallback field to AdaptersSchema - docs/schemas/product.schema.json: regenerate to include llm_fallback - 20 unit tests, 394 total passing, 96.46% coverage, mypy --strict clean

When AdaptersSchema gains a new field the JSON schema in docs/schemas/ must be regenerated manually. The docstring now points to the exact command so the step is visible at the point of change.

- Makefile: generate-schemas target regenerates all three docs/schemas/*.json files from the Pydantic models in one command - iris config schema: only writes the file when content changed; output now prefixed with [model] and distinguishes written vs up-to-date

anmolg1997

Approved. The selector and stub are textbook — every pattern lesson from the OCR sprint is baked in from the first commit, and you closed two of my earlier non-blocking notes along the way. Same one tracking fix as #39 before merge.

What I verified

Selector (iris_engine/llm/selector.py, T041)

Mirrors ocr/selector.py exactly: plain-string registry lookup (engine never imports iris_config — layering docstring states it, lint confirms 3 contracts kept), _FallbackLLMProvider wrapper preserving primary id/version.
Fallback fires only on LLMUnavailable, surfaces the primary error if both fail, and the docstring documents why LLMRateLimited deliberately does NOT fall back — the exact reasoning from the OCR review, applied here without being asked. That's the pattern sticking.

Stub (iris_engine/llm/in_memory.py, T042)

structured_instances: dict[type[BaseModel], BaseModel] registry keyed by schema type → returns the registered instance when a schema is requested, raises LLMSchemaViolation with a clear message when none is registered. This is exactly the seam workstream 005's agent tests will lean on.
raise_on_complete: LLMError | None error-injection seam — lets 005's tests exercise every typed-error branch without standing up a real provider.
Full OTEL instrumentation via instrument_complete, structured_output_used attribute computed from the real outcome. usage math defaults satisfy the contract's total == input + output invariant (10+5=15).

Tracing (iris_engine/llm/tracing.py)

Faithful mirror of ocr/tracing.py: shared async CM, typed-error categorisation, structured success log. The two tracing modules are now consistent siblings — when the real LLM adapters land in T044-T047 they wrap instrument_complete the same way the OCR adapters wrap instrument_extract.

Two earlier notes closed (appreciated)

make test-cov is now pytest --cov --cov-report=... --cov-fail-under=95 — reads source from pyproject config instead of repeating --cov= per package. That's the dual-source-of-truth cleanup I flagged on #36. One place now.
make generate-schemas + idempotent iris config schema (writes only on change, [model] written vs up-to-date) addresses the schema-artifact drift note from #37. llm_fallback added to AdaptersSchema and the regenerated product.schema.json is in the diff — schema and artifact in sync in the same PR.

Gates: 21 selector/stub tests pass, full suite 394, mypy clean on 24 source files, lint clean.

Same fix before merge as #39

tasks/004-llm-adapter-set/tasks.md still shows - [ ] **T041** and - [ ] **T042**. Please flip both to [x].

This is now two PRs in a row (#39, #40) landing with the checkbox unflipped — you were consistent on this through all of WS001-003, so it's a small slip, not a habit, but worth catching before it becomes one. The tasks.md checkboxes are how the team (and Akhilesh, when he asks "what's done") reads progress at a glance; a merged-but-unchecked task makes main lie about its own state. Quick habit to re-anchor: flip the box in the same commit that implements the task, before opening the PR.

Both #39 and #40 are rebasing as the stack lands anyway, so the three flips (T040, T041, T042) cost one extra line each on a push you're already making.

Status

With this, WS004's foundation (Protocol + selector + stub + tracing) is in — same shape that made the OCR adapters fast to write. Remaining in WS004: the four real adapters (T044 Azure OpenAI, T045 OpenAI, T046 Anthropic, T047 local) on the shared OpenAI-compatible base (T043), the contract suite (T048), OTEL (T049 — though you've effectively pre-wired it), and READMEs (T050). The llm-shared base class (T043) is the natural next PR since three of the four adapters inherit from it.

likith1908 added 3 commits June 17, 2026 12:28

Document iris config schema regeneration step in adapters.py docstring

ffa2803

When AdaptersSchema gains a new field the JSON schema in docs/schemas/ must be regenerated manually. The docstring now points to the exact command so the step is visible at the point of change.

likith1908 changed the title ~~T041 + T042: LLM selector and StubLLMProvider~~ T041 + T042: LLM selector, StubLLMProvider, and generate-schemas Jun 17, 2026

likith1908 marked this pull request as ready for review June 17, 2026 07:34

likith1908 requested a review from anmolg1997 as a code owner June 17, 2026 07:34

This was referenced Jun 17, 2026

T038 + T039: OTEL span instrumentation and per-adapter READMEs #38

Open

T040: LLMProvider Protocol, request/response types, and error hierarchy #39

Open

anmolg1997 approved these changes Jun 17, 2026

View reviewed changes

Flip T041 and T042 checkboxes to done in tasks.md

a43d3a7

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

T041 + T042: LLM selector, StubLLMProvider, and generate-schemas#40

T041 + T042: LLM selector, StubLLMProvider, and generate-schemas#40
likith1908 wants to merge 4 commits into
T040-llm-provider-protocolfrom
T041-T042-llm-selector-stub

likith1908 commented Jun 17, 2026 •

edited

Loading

Uh oh!

anmolg1997 left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

likith1908 commented Jun 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Task reference

Acceptance criteria

PR checklist

Notes for the reviewer

Rebase note

Uh oh!

anmolg1997 left a comment

Choose a reason for hiding this comment

What I verified

Same fix before merge as #39

Status

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

likith1908 commented Jun 17, 2026 •

edited

Loading