fix: batch #74, #75 — explicit timeouts on OpenAI and Gemini API calls by JFK · Pull Request #84 · JFK/voicesrt

JFK · 2026-05-18T14:29:45Z

Closes #74
Closes #75

Summary

Add explicit timeouts to every OpenAI and Gemini external API call (CLAUDE.md / .claude/rules/async-first.md compliance)
New helper call_gemini_with_timeout() consolidates the asyncio.wait_for(asyncio.to_thread(...)) pattern at 7+ call sites
New constants OPENAI_TIMEOUT_SEC=600, OPENAI_CONNECT_TIMEOUT_SEC=10, SHORT_RPC_TIMEOUT_SEC=30; GEMINI_TIMEOUT_SEC moved from gemini.py to utils.py
Extends create_openai_compatible_client() with an optional timeout kwarg, used by /api/settings/.../test_key for a 30s validation budget
Side fix: src/api/settings.py:206 was calling client.models.list() synchronously inside an async handler (event-loop blocking) — now wrapped in call_gemini_with_timeout properly

Implementation notes

All 7 Gemini call sites (catchphrase.py, quiz.py, refine.py x3, metadata.py x2, gemini.py) migrated from inline asyncio.wait_for(asyncio.to_thread(...)) boilerplate to the new call_gemini_with_timeout(fn, *args, **kwargs) helper
gemini.transcribe_with_gemini uses SHORT_RPC_TIMEOUT_SEC (30s) for the files.delete cleanup so a stuck delete cannot stall job teardown
whisper.transcribe_with_whisper now routes through create_openai_compatible_client("openai", ...) for consistency with the new timeout policy
8 genai.Client(api_key=...) constructions remain inline — intentionally not consolidated (scope discipline; no current need beyond cosmetic dedup)
New tests/test_timeout.py — 10 unit tests covering constants policy, OpenAI client timeout (default + ollama + custom override), and the Gemini helper (return value / args / timeout firing / exception propagation / default-value behavioral spy)

Pre-PR review summary

gate2 mode: advisor-only
audit: skipped
binary_gate: (none)
cso: green
qa-lead: green
cto: green
review provider: code-review

Full reviews are saved in the plugin cache:

~/.claude/cache/gh-issue-driven/74-75-batch_explicit-timeouts.gate2.md

🤖 Generated via /gh-issue-driven:ship

- OpenAI: httpx.Timeout on AsyncOpenAI / openai-compatible client (read=600s, connect=10s) - Gemini: call_gemini_with_timeout() wraps asyncio.to_thread with wait_for - utils.py: OPENAI_TIMEOUT_SEC / OPENAI_CONNECT_TIMEOUT_SEC constants aligned with GEMINI_TIMEOUT_SEC - All refine/verify/suggest/metadata/catchphrase/quiz call sites updated for both providers - tests/test_timeout.py covers constants, client init, and the new helpers Closes #74, #75. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- gemini.py: short timeout (30s) for files.delete; a stuck delete must not stall job teardown - api/settings.py: route OpenAI key-validation through create_openai_compatible_client with short timeout; wrap Gemini models.list with call_gemini_with_timeout (was blocking the event loop) - utils.py: add SHORT_RPC_TIMEOUT_SEC constant for cheap RPCs - metadata.py: drop redundant what-comment - test_timeout.py: replace inspect-based default-timeout test with monkeypatch-based behavioral test Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Adds an explicit timeout policy for all outbound OpenAI and Gemini LLM calls to comply with the repo’s async-first rule and prevent indefinite hangs, while also consolidating repeated Gemini to_thread patterns into a shared helper.

Changes:

Introduces centralized timeout constants and a call_gemini_with_timeout() helper to enforce asyncio.wait_for(...to_thread...) across Gemini call sites.
Extends create_openai_compatible_client() to apply explicit httpx.Timeout defaults (and allow overrides) for OpenAI/OpenAI-compatible clients; migrates Whisper to use this helper.
Updates API key validation to use short, explicit timeouts and avoids event-loop blocking for Gemini model listing; adds unit tests for timeout behavior.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.

Show a summary per file

File	Description
tests/test_timeout.py	Adds unit tests covering timeout constants, OpenAI client timeout defaults/overrides, and Gemini timeout helper behavior.
src/services/utils.py	Defines timeout policy constants, extends OpenAI-compatible client creation with explicit timeouts, and adds `call_gemini_with_timeout()`.
src/services/whisper.py	Routes Whisper transcription client creation through the timeout-configured OpenAI-compatible client helper.
src/services/refine.py	Switches Gemini calls from raw `to_thread` to `call_gemini_with_timeout()` for consistent timeout enforcement.
src/services/quiz.py	Switches Gemini calls to `call_gemini_with_timeout()` for consistent timeout enforcement.
src/services/metadata.py	Switches Gemini calls to `call_gemini_with_timeout()` and hardens `response.text` handling.
src/services/gemini.py	Removes inline timeout boilerplate in favor of `call_gemini_with_timeout()` and uses a shorter timeout for cleanup deletes.
src/services/catchphrase.py	Switches Gemini calls to `call_gemini_with_timeout()` for consistent timeout enforcement.
src/api/settings.py	Applies short explicit timeouts during key validation and avoids blocking the event loop for Gemini model listing.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

JFK · 2026-05-18T14:41:29Z

+async def call_gemini_with_timeout(
+    fn: Callable[..., T],
+    *args: Any,
+    timeout: float = GEMINI_TIMEOUT_SEC,
+    **kwargs: Any,
+) -> T:
+    """Run a synchronous Gemini SDK call in a worker thread with an asyncio timeout.
+
+    The Gemini Python SDK is blocking, so the project wraps every call in
+    ``asyncio.to_thread``. Without ``asyncio.wait_for`` the thread can wait
+    indefinitely if the API hangs (.claude/rules/async-first.md). Use this
+    helper for every Gemini call site so the timeout policy stays uniform.
+    """
+    return await asyncio.wait_for(
+        asyncio.to_thread(fn, *args, **kwargs),
+        timeout=timeout,
+    )


Thanks — the analysis is correct on both points. Addressed and deferred respectively in 10b820f:

Adopted: timeout logging. Wrapped asyncio.wait_for in try/except TimeoutError and added logger.warning(...) with the timeout budget and fn.__qualname__ so repeated hangs are surface-able from server logs. Re-raise unchanged so existing caller try/except contracts (e.g. gemini.transcribe_with_gemini's best-effort files.delete) keep working.

Deferred: dedicated bounded executor. The thread-leak risk is real but bounded by this app's operational shape:

VoiceSRT is a single-user-ish desktop tool, not a throughput service. Typical concurrency is 1 job at a time; 5–10 is the realistic upper bound.

The default ThreadPoolExecutor has min(32, os.cpu_count() + 4) slots — ~36 on a typical machine. Exhaustion needs ≥36 concurrent hung Gemini calls.

A dedicated executor adds lifecycle (create/shutdown), config (pool size), and error paths for a hypothetical-but-not-likely scale.

Per the repo's scope-discipline (CLAUDE.md / system-level guidance: "don't introduce abstractions beyond what the task requires"), we'll revisit if/when concurrent job runners land or telemetry shows actual exhaustion. The warning log added here is the operational tripwire that would surface that signal.

Reference: src/services/utils.py:193-208 @ 10b820f

JFK · 2026-05-18T14:35:17Z

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

🤖 Generated with Claude Code

Address Copilot PR #84 feedback on src/services/utils.py:196 — the suggestion to add explicit timeout logging is adopted. The companion suggestion to introduce a dedicated bounded executor is deferred (see PR thread): the default thread pool has ~36 slots and exhaustion would require dozens of concurrent hung Gemini calls, which the single-user operational shape of this app does not realistically reach. - Wrap asyncio.wait_for in try/except TimeoutError - logger.warning includes timeout budget + fn qualname so repeated hangs are diagnosable from server logs - Re-raise unchanged so existing caller try/except contracts are preserved Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

JFK · 2026-05-18T14:53:15Z

+    # Best-effort cleanup; a stuck delete must not stall job teardown.
    try:
-        await asyncio.to_thread(client.files.delete, name=uploaded.name)
+        await call_gemini_with_timeout(client.files.delete, name=uploaded.name, timeout=SHORT_RPC_TIMEOUT_SEC)
    except Exception:
        logger.warning("Failed to delete uploaded file: %s", uploaded.name)


Good catch — fixed in 1ccbb12. Restructured into try/finally so the delete runs even on generate_content / parse_json_response failure. CLAUDE.md mandates this exact pattern ("Temp files: Always clean up in finally blocks").

The prompt build + generate-content path moved to a helper _transcribe_uploaded() so the outer function holds a single, narrow finally around just the cleanup — keeps the upload→use→delete lifecycle visible at the function-top level.

src/services/gemini.py:28-40 @ 1ccbb12

JFK · 2026-05-18T14:53:17Z

+from collections.abc import Callable
+from typing import Any, TypeVar
+
+import httpx


Fair point — fixed in 1ccbb12. Added httpx>=0.27.0 to [project].dependencies. Production no longer depends on openai's transitive httpx to satisfy our module-level import.

pyproject.toml:9 @ 1ccbb12

JFK · 2026-05-18T14:53:19Z

+    The Gemini Python SDK is blocking, so the project wraps every call in
+    ``asyncio.to_thread``. Without ``asyncio.wait_for`` the thread can wait
+    indefinitely if the API hangs (.claude/rules/async-first.md). Use this
+    helper for every Gemini call site so the timeout policy stays uniform.


Correct — fixed in 1ccbb12. Migrated model_validator._check_gemini's await asyncio.to_thread(_retrieve) to await call_gemini_with_timeout(_retrieve, timeout=SHORT_RPC_TIMEOUT_SEC). Validation is a cheap one-shot lookup so it gets the 30s short budget rather than the full 600s.

Now issue #75's "all Gemini calls have explicit asyncio.wait_for timeouts" goal is fully met across src/services/.

src/services/model_validator.py:124-130 @ 1ccbb12

Three findings from Copilot: 1. src/services/model_validator.py:125 — bare `asyncio.to_thread(_retrieve)` for Gemini model validation had no timeout, contradicting the helper docstring claim of "every Gemini call site". Migrate to `call_gemini_with_timeout(_retrieve, timeout=SHORT_RPC_TIMEOUT_SEC)` so issue #75's "all Gemini calls timeout-wrapped" goal is fully met. 2. src/services/utils.py imports httpx at module level but pyproject.toml only listed it under [project.optional-dependencies].dev — runtime relied on openai's transitive httpx dep. Promote httpx>=0.27.0 to [project].dependencies so production installs cannot break from an openai dependency-graph change. 3. src/services/gemini.py:transcribe_with_gemini — file cleanup ran only on the success path; a raise from generate_content / parse_json_response leaked the uploaded file on Gemini's side. Restructure into try/finally per CLAUDE.md "Temp files: Always clean up in `finally` blocks" — split the prompt-build + generate-content into _transcribe_uploaded() so the outer function holds a single, narrow finally block around it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.

JFK · 2026-05-18T15:00:35Z

 dependencies = [
    "fastapi>=0.115.0,<1.0.0",
+    "httpx>=0.27.0",
    "uvicorn[standard]>=0.34.0,<1.0.0",
    "jinja2>=3.1.0,<4.0.0",


Both points adopted in 39cb4bc:

Upper bound: httpx>=0.27.0,<1.0.0 — matches the project-wide convention (every other [project].dependencies entry constrains the major version).

Dedup: removed httpx>=0.27.0 from [optional-dependencies].dev since runtime deps now cover dev installs too. Prevents drift between the two lists.

pyproject.toml:7-22 @ 39cb4bc

Address Copilot PR #84 round-3 feedback on pyproject.toml: - Add `<1.0.0` upper bound on `httpx>=0.27.0` to match the project-wide convention (every other [project].dependencies entry constrains the major version) — guards against an httpx 1.x release silently breaking prod installs. - Remove the now-redundant `httpx>=0.27.0` from [optional-dependencies].dev since runtime deps cover dev installs too. Prevents version drift between the two lists. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

src/services/model_validator.py:136

call_gemini_with_timeout() emits a warning log when it times out; this handler also logs a warning for all exceptions, so a timeout will produce duplicate warnings for the same event. Consider special-casing TimeoutError/asyncio.TimeoutError here (e.g., skip logging or log at debug), while keeping the warning for other transport errors you still want surfaced.

    try:
        await call_gemini_with_timeout(_retrieve, timeout=SHORT_RPC_TIMEOUT_SEC)
    except Exception as e:
        # google-genai does not expose structured 404 — fall back to string match.
        msg = str(e)
        if "404" in msg or "NOT_FOUND" in msg or "not found" in msg.lower():
            raise ModelNotAvailableError("gemini", model, msg) from e
        logger.warning("Gemini model validation skipped (transport): %s", e)

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

JFK and others added 2 commits May 18, 2026 23:12

Copilot AI review requested due to automatic review settings May 18, 2026 14:29

Copilot started reviewing on behalf of JFK May 18, 2026 14:30 View session

Copilot AI reviewed May 18, 2026

View reviewed changes

JFK requested a review from Copilot May 18, 2026 14:43

Copilot started reviewing on behalf of JFK May 18, 2026 14:44 View session

Copilot AI reviewed May 18, 2026

View reviewed changes

JFK requested a review from Copilot May 18, 2026 14:54

Copilot started reviewing on behalf of JFK May 18, 2026 14:55 View session

Copilot AI reviewed May 18, 2026

View reviewed changes

JFK requested a review from Copilot May 18, 2026 15:04

Copilot started reviewing on behalf of JFK May 18, 2026 15:04 View session

Copilot AI reviewed May 18, 2026

View reviewed changes

Comment thread src/services/gemini.py

Potential fix for pull request finding

5044ab8

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

JFK merged commit 57aa7ab into main May 18, 2026
2 checks passed

JFK deleted the 74-75-batch/explicit-timeouts branch May 18, 2026 15:16

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: batch #74, #75 — explicit timeouts on OpenAI and Gemini API calls#84

fix: batch #74, #75 — explicit timeouts on OpenAI and Gemini API calls#84
JFK merged 6 commits into
mainfrom
74-75-batch/explicit-timeouts

JFK commented May 18, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

JFK May 18, 2026 •

edited

Loading

Uh oh!

JFK commented May 18, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

JFK May 18, 2026

Uh oh!

JFK May 18, 2026

Uh oh!

JFK May 18, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

JFK May 18, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

JFK commented May 18, 2026

Summary

Implementation notes

Pre-PR review summary

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

JFK May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

JFK commented May 18, 2026

Code review

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

JFK May 18, 2026

Choose a reason for hiding this comment

Uh oh!

JFK May 18, 2026

Choose a reason for hiding this comment

Uh oh!

JFK May 18, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

JFK May 18, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

JFK May 18, 2026 •

edited

Loading