Skip to content

feat: add Baidu AI Search backend for web_search#2371

Open
jimmyzhuu wants to merge 4 commits into
Hmbown:mainfrom
jimmyzhuu:feat/baidu-search-provider
Open

feat: add Baidu AI Search backend for web_search#2371
jimmyzhuu wants to merge 4 commits into
Hmbown:mainfrom
jimmyzhuu:feat/baidu-search-provider

Conversation

@jimmyzhuu
Copy link
Copy Markdown

@jimmyzhuu jimmyzhuu commented May 30, 2026

Summary

This PR adds Baidu AI Search as an explicit web_search backend so users in mainland China can choose a first-party, China-accessible search API instead of relying only on HTML scraping or providers that may be unreliable from Chinese networks.

What changed:

  • Adds SearchProvider::Baidu with config/env selection through:
    • [search] provider = "baidu"
    • DEEPSEEK_SEARCH_PROVIDER=baidu
    • aliases such as baidu-search and baidu_ai_search
  • Uses Baidu AI Search at https://qianfan.baidubce.com/v2/ai_search/web_search.
  • Sends the official Baidu web-search payload shape, including search_source: "baidu_search_v2" and resource_type_filter[].top_k.
  • Reads the API key from the normal search config key path, with Baidu-specific fallback support:
    • [search] api_key
    • DEEPSEEK_SEARCH_API_KEY
    • BAIDU_SEARCH_API_KEY
  • Normalizes Baidu references[] items into the existing ranked WebSearchResponse shape.
  • Adds the required network-policy host for qianfan.baidubce.com.
  • Keeps missing-key behavior explicit: selecting provider = "baidu" without a key returns a clear error instead of silently falling back to another provider and sending the query elsewhere.
  • Redacts bearer tokens from HTTP error previews before surfacing provider failures.
  • Documents Baidu search setup in the sample config and tool/config docs.

Related context:

Testing

  • cargo fmt --all -- --check
  • cargo clippy --workspace --all-targets --all-features
  • cargo test --workspace --all-features
  • cargo test -p codewhale-tui baidu
    • 7 passed, including the request-payload test that locks search_source to baidu_search_v2.
  • cargo test -p codewhale-tui sanitize_error_body_redacts_bearer_tokens
  • cargo test -p codewhale-tui tools::web_search::tests
    • 33 passed.
  • cargo check --workspace --all-features
  • Live smoke against Baidu AI Search with BAIDU_SEARCH_API_KEY
    • Endpoint: /v2/ai_search/web_search
    • Payload included search_source: "baidu_search_v2"
    • Query: Rust cargo workspace
    • Result: HTTP 200, 3 references[] returned.
  • Secret scan for accidentally committed Baidu credentials:
    • rg -n 'bce-v3|ALTAK|6edb0bd1c|BAIDU_SEARCH_API_KEY=.*[A-Za-z0-9]' . -S
    • No matches.

Known baseline issue:

  • cargo clippy --workspace --all-targets --all-features -- -D warnings currently fails on pre-existing warnings outside this PR:
    • crates/tui/src/commands/config.rs:476 useless format!
    • crates/tui/src/runtime_log.rs:177 redundant closure

Checklist

  • Updated docs or comments as needed
  • Added or updated tests where relevant
  • Verified TUI behavior manually if UI changes
    • No UI behavior changed; verified the backend with a live Baidu API smoke test instead.

Greptile Summary

Adds SearchProvider::Baidu as a selectable web_search backend targeting the Baidu AI Search API at qianfan.baidubce.com, with API-key resolution via config, DEEPSEEK_SEARCH_API_KEY, and the Baidu-specific BAIDU_SEARCH_API_KEY fallback. The PR also backfills the previously documented-but-unimplemented DEEPSEEK_SEARCH_API_KEY env override in apply_env_overrides, fixes the pre-existing missing "metaso" arm in SearchProvider::parse(), and applies bearer-token redaction to error bodies globally across all providers.

  • New Baidu backend (run_baidu_search): builds the official search_source: "baidu_search_v2" payload, normalises references[] into the common WebSearchEntry shape, distinguishes 401/403/429/generic error codes, and checks the network policy before issuing the request.
  • Config / env additions: SearchProvider::Baidu deserialises from "baidu" (snake_case), "baidu-search", and "baidu_ai_search" via serde; parse() covers additional aliases used in DEEPSEEK_SEARCH_PROVIDER env var resolution.
  • Security improvement: sanitize_error_body now redacts Bearer <token> patterns from all provider error previews, not just Baidu.

Confidence Score: 5/5

The change is additive and self-contained — all existing provider paths are untouched, the new Baidu path only activates when explicitly selected, and the bearer-token redaction is applied as a pure enhancement to the shared sanitisation helper.

The Baidu implementation closely mirrors the Metaso path, the key-resolution chain matches the documented priority order, network policy is gated before the HTTP call, and the PR ships solid unit test coverage including payload shape, missing-key error, result parsing, and token redaction. The only finding is an asymmetry between a few undocumented parse() aliases and serde aliases that does not affect any documented usage.

The serde aliases on SearchProvider::Baidu in config.rs are slightly narrower than the parse() aliases; worth a one-line fix but does not block merging.

Important Files Changed

Filename Overview
crates/tui/src/tools/web_search.rs Adds Baidu AI Search backend with API key resolution, payload construction, result parsing, and bearer-token redaction in error bodies; minor alias inconsistency between parse() and serde for undocumented variants.
crates/tui/src/config.rs Adds SearchProvider::Baidu with serde aliases and parse() aliases; implements DEEPSEEK_SEARCH_API_KEY env override in apply_env_overrides (previously documented but not applied); serde aliases and parse() aliases are slightly asymmetric for undocumented variants.
crates/tui/src/core/engine.rs Documentation-only update to field comments reflecting the new Baidu provider; no logic changes.
crates/tui/src/tools/spec.rs Documentation-only update to field comments to mention Baidu; no logic changes.
config.example.toml Adds Baidu to the web search provider comments and documents BAIDU_SEARCH_API_KEY and METASO_API_KEY env var overrides.
docs/CONFIGURATION.md Adds Baidu provider documentation; TOML snippet updated to show baidu as an example provider.
docs/TOOL_SURFACE.md Single-line update adding Baidu (and Metaso) to the web_search tool description row.

Sequence Diagram

sequenceDiagram
    participant Agent
    participant WebSearchTool
    participant NetworkPolicy
    participant BaiduAPI as Baidu AI Search (qianfan.baidubce.com)

    Agent->>WebSearchTool: "execute({query, max_results})"
    WebSearchTool->>WebSearchTool: resolve provider (config / DEEPSEEK_SEARCH_PROVIDER)
    alt "provider == Baidu"
        WebSearchTool->>NetworkPolicy: check_policy(qianfan.baidubce.com)
        NetworkPolicy-->>WebSearchTool: Allow / Deny
        WebSearchTool->>WebSearchTool: resolve api_key (search_api_key → BAIDU_SEARCH_API_KEY)
        alt api_key missing
            WebSearchTool-->>Agent: ToolError Baidu search requires an API key
        end
        WebSearchTool->>BaiduAPI: POST /v2/ai_search/web_search Authorization Bearer key
        BaiduAPI-->>WebSearchTool: "HTTP 200 {references[]}"
        alt HTTP 4xx/5xx
            WebSearchTool-->>Agent: ToolError sanitized token redacted
        end
        alt "error_code != 0 in body"
            WebSearchTool-->>Agent: ToolError Baidu search API error
        end
        WebSearchTool->>WebSearchTool: parse_baidu_results(references)
        WebSearchTool-->>Agent: ToolResult ranked WebSearchResponse
    else "provider == DuckDuckGo / Bing / Tavily / Bocha / Metaso"
        WebSearchTool->>WebSearchTool: existing provider path
    end
Loading

Comments Outside Diff (1)

  1. crates/tui/src/tools/web_search.rs, line 9-10 (link)

    P2 The module-level docstring comment on line 9 still lists only tavily/bocha/metaso and omits baidu. All other surfaces (tool description, config docs, TOOL_SURFACE.md) were updated — this one was missed.

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

    Fix in Codex Fix in Claude Code Fix in Cursor

Fix All in Codex Fix All in Claude Code Fix All in Cursor

Reviews (2): Last reviewed commit: "fix: honor search api key env override" | Re-trigger Greptile

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds Baidu AI Search as a new web search provider option. It implements the API client, result parsing, error handling, token redaction, configuration parsing, and comprehensive unit tests, alongside updating the relevant documentation. A review comment points out a discrepancy where DEEPSEEK_SEARCH_API_KEY is documented as an environment variable override but is not actually implemented in the configuration loading logic.

Comment thread crates/tui/src/config.rs
Comment thread crates/tui/src/tools/web_search.rs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant