Native Gemini image generation through the TEE #81

Merged: adambalogh merged 8 commits into main from claude/elegant-shannon-wRp7J on May 31, 2026

@adambalogh adambalogh commented May 30, 2026

Adds native, provider-side image generation routed through the TEE, so image output rides the existing OHTTP/HPKE + signing pipeline instead of the privacy-breaking DALL-E tool.

Model registry

  • Register gemini-3.1-flash-image ("nano banana 2") — the latest Gemini image model and the one the chat app now routes to — with image_output=True and per-token pricing ($0.50/MTok in, $60/MTok out; an image is ~1120 output tokens ≈ $0.067).
  • Also register gemini-2.5-flash-image (image_output=True, ~1290 tokens/image ≈ $0.039) as the prior image model.
  • The Google backend requests the IMAGE modality for image_output models.
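
A minimal sketch of how a registry flag could drive the modality request. The `ModelConfig` class, its pricing field names, `REGISTRY`, and `response_modalities` are hypothetical names chosen for illustration; only `image_output` and the prices quoted in this description come from the PR. Returning `None` for text models is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical registry entry; only image_output is named in the PR."""
    name: str
    input_price_per_mtok: float   # USD per million input tokens
    output_price_per_mtok: float  # USD per million output tokens
    image_output: bool = False


# Prices as quoted in this PR description.
REGISTRY = {
    "gemini-3.1-flash-image": ModelConfig(
        "gemini-3.1-flash-image", 0.50, 60.00, image_output=True
    ),
    "gemini-3.5-flash": ModelConfig("gemini-3.5-flash", 1.50, 9.00),
}


def response_modalities(model: str) -> Optional[list[str]]:
    """Return the modalities to request, or None to keep the provider default."""
    cfg = REGISTRY[model]
    # Assumption: image models ask for both TEXT and IMAGE output.
    return ["TEXT", "IMAGE"] if cfg.image_output else None
```

Keeping the decision on a single boolean in the registry means the backend does not need to pattern-match on model names such as `*-image`.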

Chat controller

  • Split generated image content blocks out of the text and surface them on the response message (non-streaming) and the SSE final frame (streaming).
  • Images are carried out-of-band and are NOT folded into the signed output hash — the OHTTP/HPKE channel already guarantees end-to-end confidentiality and integrity, while the TEE signature continues to cover the request and any text output.
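
The split described above might look roughly like the following. This is a sketch under stated assumptions, not the controller's actual code: the block shapes (`{"type": "text"}` and `{"type": "image_url"}`), the use of SHA-256, the `images` key in the final frame, and all three function names are hypothetical.

```python
import hashlib
import json


def split_images(content):
    """Separate text from image blocks.

    Assumes content is a plain string or a list whose items are strings
    or dict blocks tagged "text" or "image_url" (hypothetical shapes).
    """
    if isinstance(content, str):
        return content, []
    text_parts, images = [], []
    for block in content:
        if isinstance(block, str):
            text_parts.append(block)
        elif block.get("type") == "text":
            text_parts.append(block.get("text", ""))
        elif block.get("type") == "image_url":
            images.append(block)
    return "".join(text_parts), images


def output_hash(text):
    """Hash only the text; images stay outside the signed digest."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def final_sse_frame(images):
    """Build a hypothetical final SSE frame that carries the images."""
    return "data: " + json.dumps({"images": images}) + "\n\n"
```

For example, a response with text on either side of one image yields the joined text for hashing and a one-element image list for the out-of-band field:

```python
content = [
    {"type": "text", "text": "Here is "},
    {"type": "image_url", "image_url": {"url": "data:image/png;base64,AAAA"}},
    {"type": "text", "text": "your cat."},
]
text, images = split_images(content)
digest = output_hash(text)  # covers "Here is your cat." only
```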

Billing

Image generation is charged exactly like any other completion: Gemini reports the image as output tokens in candidates_token_count, which langchain-google-genai folds into usage_metadata.output_tokens → completion_tokens → output_price_usd. The image bytes themselves are never metered by size. This path (and the no-usage fail-open caveat, where a missing usage_metadata means the client isn't charged) is documented inline in the chat controller and covered by regression tests in test_image_billing.py.
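
As a rough illustration of that path, the sketch below prices output tokens and shows the fail-open branch. The function name `output_cost_usd` and its signature are hypothetical; the `output_tokens` key and the $60/MTok and ~1120-token figures come from this description.

```python
from decimal import Decimal
from typing import Optional


def output_cost_usd(
    usage_metadata: Optional[dict], output_price_per_mtok: Decimal
) -> Decimal:
    """Price a completion's output tokens, image tokens included."""
    if not usage_metadata:
        # Fail-open caveat: no usage reported means nothing is charged.
        return Decimal("0")
    completion_tokens = usage_metadata.get("output_tokens", 0)
    return Decimal(completion_tokens) * output_price_per_mtok / Decimal(1_000_000)


# One ~1120-token image on gemini-3.1-flash-image at $60/MTok out.
cost = output_cost_usd({"output_tokens": 1120}, Decimal("60"))
print(cost)  # 0.0672
```

Because the image arrives as ordinary output tokens, no separate per-image or per-byte pricing branch is needed.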

Also folded in

Merges claude/nice-brown-d8XRm, which registers gemini-3.5-flash as a text model ($1.50/$9.00 per MTok). Note: 3.5 Flash is text-output only — it is not an image model, despite the name; image generation uses the *-image models above.

PCR note: changes here require an enclave rebuild + measurements.txt refresh; uv.lock is untouched.

claude and others added 3 commits May 30, 2026 01:19

Register gemini-3.5-flash (Google) with $1.50/$9.00 per MTok input/output
pricing. Adds lookup entry, a resolution test mirroring the other Gemini
models, and updates the supported-provider list in CLAUDE.md.

Register gemini-2.5-flash-image (image-capable Gemini) in the model
registry with an image_output flag and per-token pricing. The Google
backend requests the IMAGE modality for these models.

The chat controller splits generated image content blocks out of the
text and surfaces them on the response message (non-streaming) and the
SSE final frame (streaming). Images are carried out-of-band and are NOT
folded into the signed output hash — the OHTTP/HPKE channel already
guarantees end-to-end confidentiality and integrity, while the TEE
signature continues to cover the request and any text output.

https://claude.ai/code/session_01FfV7ArtyE3dr571hZQ4KXV
@adambalogh adambalogh marked this pull request as ready for review May 30, 2026 17:41
@adambalogh adambalogh changed the title from "Add native Gemini image generation through the TEE" to "Native Gemini image generation through the TEE" May 30, 2026
@adambalogh adambalogh requested a review from Copilot May 30, 2026 23:41
Copilot AI left a comment

Pull request overview

Adds native Gemini image-generation support through the existing TEE chat pipeline, including model registration, Google backend modality selection, response image extraction, streaming/non-streaming surfacing, and billing regression coverage.

Changes:

  • Registers Gemini image-output models and Gemini 3.5 Flash in the model registry/pricing tests.
  • Requests image response modalities for Google image models.
  • Splits generated images from text responses and carries them out-of-band in chat responses/SSE final frames, with billing tests for image token accounting.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| tee_gateway/model_registry.py | Adds image-output model config support and new Gemini model entries. |
| tee_gateway/llm_backend.py | Configures Google image models with TEXT and IMAGE response modalities. |
| tee_gateway/controllers/chat_controller.py | Extracts generated image blocks and includes them in non-streaming/streaming responses. |
| tee_gateway/test/test_image_billing.py | Adds regression tests for Gemini image token folding and cost calculation. |
| tests/test_pricing.py | Adds pricing resolution tests for new Gemini models. |
| CLAUDE.md | Updates documented Google model routing list. |

Comment thread tee_gateway/controllers/chat_controller.py Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@adambalogh adambalogh merged commit 65c6adf into main May 31, 2026
7 checks passed
3 participants