Skip to content

Add native web search with per-search billing#83

Merged
adambalogh merged 2 commits into
mainfrom
claude/web-search-all-models-WUZcG
May 31, 2026
Merged

Add native web search with per-search billing#83
adambalogh merged 2 commits into
mainfrom
claude/web-search-all-models-WUZcG

Conversation

@adambalogh
Copy link
Copy Markdown
Contributor

Add an opt-in web_search flag to chat and text completion requests that
enables each provider's built-in web search and bills per search on top of
token usage.

  • model_registry: per-search USD pricing (provider list prices) via
    get_web_search_price_usd + provider_supports_web_search predicate
  • llm_backend: get_chat_model_cached gains a web_search arg (OpenAI uses the
    Responses API, xAI sets search_parameters); get_web_search_tool returns the
    bind-time tool spec for OpenAI/Anthropic/Google; extract_web_search_count
    counts the billable unit per provider (calls/requests/sources/grounded req)
  • controllers: parse and forward the flag, bind the web search tool alongside
    user tools, cover the flag in the signed request hash, and pass the detected
    search count into compute_session_cost (streaming + non-streaming)
  • pricing: compute_session_cost adds the per-search surcharge
  • models + openapi: add the web_search request field
  • tests: provider pricing, tool specs, count extraction, surcharge math, and
    controller binding/billing integration

Providers without native web search (ByteDance/ModelArk) ignore the flag and
are not charged.

Add an opt-in `web_search` flag to chat and text completion requests that
enables each provider's built-in web search and bills per search on top of
token usage.

- model_registry: per-search USD pricing (provider list prices) via
  get_web_search_price_usd + provider_supports_web_search predicate
- llm_backend: get_chat_model_cached gains a web_search arg (OpenAI uses the
  Responses API, xAI sets search_parameters); get_web_search_tool returns the
  bind-time tool spec for OpenAI/Anthropic/Google; extract_web_search_count
  counts the billable unit per provider (calls/requests/sources/grounded req)
- controllers: parse and forward the flag, bind the web search tool alongside
  user tools, cover the flag in the signed request hash, and pass the detected
  search count into compute_session_cost (streaming + non-streaming)
- pricing: compute_session_cost adds the per-search surcharge
- models + openapi: add the web_search request field
- tests: provider pricing, tool specs, count extraction, surcharge math, and
  controller binding/billing integration

Providers without native web search (ByteDance/ModelArk) ignore the flag and
are not charged.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

…-models-WUZcG

# Conflicts:
#	tee_gateway/controllers/chat_controller.py
#	tee_gateway/model_registry.py
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated no new comments.

@adambalogh adambalogh merged commit 265cfad into main May 31, 2026
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants