feat: add Tavily as optional search provider for GeeksforGeeks in SmartWebSearcher #2

Open
mani2001 wants to merge 2 commits into gonicolas12:main from mani2001:feat/tavily-migration/smart-code-searcher

@mani2001

Summary

  • Added Tavily as a configurable alternative to Google HTML scraping for GeeksforGeeks code searches in SmartWebSearcher
  • When TAVILY_API_KEY is set (via config.yaml or environment variable), Tavily's search() API with include_domains=["geeksforgeeks.org"] is used instead of fragile Google scraping
  • Existing Google scraping path is preserved as fallback when Tavily is not configured or when a Tavily request fails
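The routing described above can be sketched roughly as follows. This is a minimal illustration, not the code from the diff: `search_geeksforgeeks`, `tavily_search`, and `google_scrape` are hypothetical names, and the two search functions are passed in as parameters so the example runs without network access.

```python
import asyncio
import logging
from typing import Awaitable, Callable, List, Optional

logger = logging.getLogger(__name__)

# Hypothetical type alias: an async function that maps a query to result dicts.
SearchFn = Callable[[str], Awaitable[List[dict]]]


async def search_geeksforgeeks(
    query: str,
    tavily_search: Optional[SearchFn],
    google_scrape: SearchFn,
) -> List[dict]:
    """Prefer Tavily when it is configured; otherwise use Google scraping."""
    if tavily_search is not None:
        try:
            return await tavily_search(query)
        except Exception as exc:
            # A failed Tavily request falls through to the scraping path.
            logger.warning("Tavily search failed, falling back: %s", exc)
    return await google_scrape(query)
```

Passing `None` for `tavily_search` models the "Tavily is not configured" case, so the scraping path stays the default behavior.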

Files changed

  • models/smart_web_searcher.py — Added Tavily client initialization, _search_geeksforgeeks_tavily() method, and routing logic to prefer Tavily when available
  • requirements.txt — Added tavily-python>=0.5.0
  • config.yaml — Added tavily.api_key configuration entry

Dependency changes

  • Added tavily-python>=0.5.0 to requirements.txt

Environment variable changes

  • Added TAVILY_API_KEY — read from config.yaml (tavily.api_key) or directly from environment
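One way the key lookup could work is sketched below. The helper name `resolve_tavily_api_key` is hypothetical, and the precedence (config.yaml value first, then the environment variable) is an assumption; the PR description does not state which source wins. The config is taken as an already-parsed mapping so the sketch does not depend on a YAML library.

```python
import os
from typing import Mapping, Optional


def resolve_tavily_api_key(
    config: Mapping,
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the Tavily API key, or None if Tavily is not configured."""
    # Assumed precedence: tavily.api_key from config.yaml, then TAVILY_API_KEY.
    section = config.get("tavily") or {}
    key = section.get("api_key") or environ.get("TAVILY_API_KEY")
    # Treat empty strings as "not configured".
    return key or None
```

A `None` result would leave the Tavily client uninitialized, which keeps the Google scraping path active.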

Notes for reviewers

  • This is an additive change: the existing Google scraping path is fully preserved and used when Tavily is not configured
  • GitHub and Stack Overflow searches are unchanged (they use official APIs)
  • The Tavily method uses search_depth="advanced" for highest relevance and include_domains to scope results to geeksforgeeks.org
  • run_in_executor is used to call the synchronous TavilyClient.search() from async context without blocking the event loop
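For reviewers unfamiliar with the pattern, the executor call might look roughly like this. It is a sketch under assumptions, not the method from the diff: the function name and the `max_results=5` default are illustrative, and `client` is any object exposing a synchronous `search(**kwargs)` method (in the PR, a `TavilyClient`). `include_raw_content=True` is taken from the automated review below.

```python
import asyncio
from functools import partial
from typing import Any, List


async def search_geeksforgeeks_tavily(
    client: Any, query: str, max_results: int = 5
) -> List[dict]:
    """Run the blocking client.search() call in the default thread pool."""
    loop = asyncio.get_running_loop()
    # run_in_executor only forwards positional arguments, so the keyword
    # arguments are bound up front with functools.partial.
    call = partial(
        client.search,
        query=query,
        search_depth="advanced",
        include_domains=["geeksforgeeks.org"],
        include_raw_content=True,
        max_results=max_results,  # assumed value, not stated in the PR
    )
    response = await loop.run_in_executor(None, call)
    # The response is assumed to be a dict with a "results" list.
    return response.get("results", [])
```

Because the call runs in a worker thread, other coroutines on the event loop keep running while the HTTP request is in flight.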

🤖 Generated with Claude Code

Automated Review

  • Passed after 2 attempts
  • Final review: All three fixes from attempt 2 are correctly applied and verifiable in the diff. (1) include_raw_content=True is present in the Tavily search() call, enabling the BeautifulSoup code-extraction path. (2) import os is placed alphabetically within the stdlib block (between json and re), satisfying PEP 8. (3) No GITHUB_TOKEN env-var resolution was added — the change stays in scope. The additive migration preserves the original _search_geeksforgeeks scraper as a fallback, tavily-python>=0.5.0 is added to requirements.txt, and TAVILY_API_KEY is documented in config.yaml. Tavily SDK call parameters (search_depth, include_domains, include_raw_content, max_results) are all valid. No broken imports, dead code, or unrelated changes detected.
