Native Gemini image generation through the TEE#81
Merged
Conversation
Register gemini-3.5-flash (Google) with $1.50/$9.00 per MTok input/output pricing. Adds lookup entry, a resolution test mirroring the other Gemini models, and updates the supported-provider list in CLAUDE.md.
Register gemini-2.5-flash-image (image-capable Gemini) in the model registry with an image_output flag and per-token pricing. The Google backend requests the IMAGE modality for these models. The chat controller splits generated image content blocks out of the text and surfaces them on the response message (non-streaming) and the SSE final frame (streaming). Images are carried out-of-band and are NOT folded into the signed output hash — the OHTTP/HPKE channel already guarantees end-to-end confidentiality and integrity, while the TEE signature continues to cover the request and any text output. https://claude.ai/code/session_01FfV7ArtyE3dr571hZQ4KXV
…egistry # Conflicts: # tee_gateway/model_registry.py
Contributor
There was a problem hiding this comment.
Pull request overview
Adds native Gemini image-generation support through the existing TEE chat pipeline, including model registration, Google backend modality selection, response image extraction, streaming/non-streaming surfacing, and billing regression coverage.
Changes:
- Registers Gemini image-output models and Gemini 3.5 Flash in the model registry/pricing tests.
- Requests image response modalities for Google image models.
- Splits generated images from text responses and carries them out-of-band in chat responses/SSE final frames, with billing tests for image token accounting.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
tee_gateway/model_registry.py |
Adds image-output model config support and new Gemini model entries. |
tee_gateway/llm_backend.py |
Configures Google image models with TEXT and IMAGE response modalities. |
tee_gateway/controllers/chat_controller.py |
Extracts generated image blocks and includes them in non-streaming/streaming responses. |
tee_gateway/test/test_image_billing.py |
Adds regression tests for Gemini image token folding and cost calculation. |
tests/test_pricing.py |
Adds pricing resolution tests for new Gemini models. |
CLAUDE.md |
Updates documented Google model routing list. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Adds native, provider-side image generation routed through the TEE, so image output rides the existing OHTTP/HPKE + signing pipeline instead of the privacy-breaking DALL-E tool.
Model registry
gemini-3.1-flash-image("nano banana 2") — the latest Gemini image model and the one the chat app now routes to — withimage_output=Trueand per-token pricing ($0.50/MTok in, $60/MTok out; an image is ~1120 output tokens ≈ $0.067).gemini-2.5-flash-image(image_output=True, ~1290 tokens/image ≈ $0.039) as the prior image model.image_outputmodels.Chat controller
Billing
Image generation is charged exactly like any other completion: Gemini reports the image as output tokens in
candidates_token_count, whichlangchain-google-genaifolds intousage_metadata.output_tokens→completion_tokens→output_price_usd. The image bytes themselves are never metered by size. This path (and the no-usage fail-open caveat, where a missingusage_metadatameans the client isn't charged) is documented inline in the chat controller and covered by regression tests intest_image_billing.py.Also folded in
Merges
claude/nice-brown-d8XRm, which registersgemini-3.5-flashas a text model ($1.50/$9.00 per MTok). Note: 3.5 Flash is text-output only — it is not an image model, despite the name; image generation uses the*-imagemodels above.PCR note: changes here require an enclave rebuild +
measurements.txtrefresh;uv.lockis untouched.