Phase 5 scaffold: real-API integration tests + CI workflow #7
Merged
Phase 5 was blocked on user-provided keys. Ship the scaffold so the
moment a key lands in the env (locally or as a CI secret), every alpha
adapter gets verified against real endpoints with no further code work.
tests/integration/
- conftest.py: openai_key / anthropic_key fixtures that skip cleanly
when the env var is absent (so default `pytest tests/` stays free
and offline)
- test_openai_real.py: sync non-streaming, sync streaming (with
stream_options.include_usage), async non-streaming. Asserts
provider, model prefix, prompt_chars, input/output_tokens, cost_usd,
latency_ms, retry_count, plus token events on the streaming span.
- test_anthropic_real.py: same three modes against claude-haiku-4-5.
Each test uses the cheapest available model and max_tokens=10. A full
pass costs a fraction of a cent.
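The per-span assertions listed for the test files might look something like the following. This is only an illustrative sketch: the span is modeled as a plain mapping, `check_span` is a hypothetical helper, and the exact thresholds (for example, requiring positive token counts) are assumptions rather than the suite's actual checks.

```python
from typing import Any, Mapping

# Field names taken from the list above; treating them all as positive
# numbers is an assumption made for this sketch.
POSITIVE_FIELDS = (
    "prompt_chars",
    "input_tokens",
    "output_tokens",
    "cost_usd",
    "latency_ms",
)


def check_span(span: Mapping[str, Any], provider: str, model_prefix: str) -> None:
    """Assert that a recorded span carries the expected telemetry fields."""
    assert span["provider"] == provider
    # Providers may return a dated model ID, so only the prefix is compared.
    assert str(span["model"]).startswith(model_prefix)
    for field in POSITIVE_FIELDS:
        assert span[field] > 0, f"{field} should be positive"
    # Retries can legitimately be zero; only a negative count is invalid.
    assert span["retry_count"] >= 0
```

A test would call `check_span(span, "anthropic", "claude-haiku-4-5")` after a real request, where `span` is whatever record the adapter under test produces.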
.github/workflows/ci-real-api.yml
- Separate workflow triggered only by workflow_dispatch and a weekly cron (Tue 14:00 UTC)
- Two jobs: openai, anthropic — each reads its key from a repo secret
- If the secret isn't set, the job exits 0 cleanly (no false red)
- concurrency cancel-in-progress on the ref so a flaky upstream
doesn't queue retries
pyproject.toml
- Register the `integration` pytest marker so the suite stops warning
about unrecognized markers when the integration files are collected
README.md
- New "Tests" section with the three commands (default / Postgres /
real-API), positioned just before Pricing
Verified locally: `pytest -q tests/` exits 0 with 88 passed + 12
skipped (the 6 integration tests now skip cleanly without keys; the
existing skip count for Postgres unchanged).
To activate the real-API path:
Settings → Secrets → Actions → New repository secret
OPENAI_API_KEY, ANTHROPIC_API_KEY
Then trigger via the Actions tab → "ci-real-api" → "Run workflow"
or wait for the weekly cron.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Context
Phase 5 (real-API verification) was the only remaining pending item from the original plan, blocked on user-provided OpenAI/Anthropic keys. This PR ships the scaffold so the moment a key lands, every alpha adapter gets verified against real endpoints — no further code work needed.
What
tests/integration/: three files, all gated on env vars:
- test_openai_real.py: sync non-streaming, sync streaming (with stream_options.include_usage), async non-streaming. Asserts provider, model, prompt_chars, input_tokens, output_tokens, cost_usd, latency_ms, retry_count, plus token events on the streaming span.
- test_anthropic_real.py: same three modes against claude-haiku-4-5.
- conftest.py: openai_key / anthropic_key fixtures that skip cleanly when the env var is absent.
Each test uses the cheapest model and max_tokens=10. A full pass costs a fraction of a cent.
.github/workflows/ci-real-api.yml: separate workflow:
- workflow_dispatch + weekly cron (Tue 14:00 UTC)
- Two jobs (openai, anthropic) read their keys from repo secrets
- concurrency.cancel-in-progress on the ref so a flaky upstream doesn't queue retries
README.md: new "Tests" section explaining the three invocations (default / Postgres / real-API).
Verified locally
pytest -q tests/ → 88 passed, 12 skipped (6 new integration tests skip cleanly without keys; existing Postgres skips unchanged). No false failures.
To activate real-API verification
Settings → Secrets → Actions → New repository secret: OPENAI_API_KEY, ANTHROPIC_API_KEY
Then trigger via the Actions tab → "ci-real-api" → "Run workflow", or wait for the weekly cron.
Test plan
- Trigger ci-real-api and confirm both jobs go green
🤖 Generated with Claude Code