Skip to content

feat(db): configurable connection pool tuning for MySQL / PostgreSQL#72

Merged
shirshanka merged 3 commits into
datahub-project:mainfrom
pushiwuhua7:feat/db-pool-config
Jun 21, 2026
Merged

feat(db): configurable connection pool tuning for MySQL / PostgreSQL#72
shirshanka merged 3 commits into
datahub-project:mainfrom
pushiwuhua7:feat/db-pool-config

Conversation

@pushiwuhua7

Copy link
Copy Markdown
Contributor

Why

Without pool_pre_ping=True, asyncpg leaves connections that were
interrupted by task cancellation (CancelledError) in a half-broken
state inside the SQLAlchemy pool. The next checkout of such a
connection hangs indefinitely, which under concurrent load presents
as the whole app stalling — even though no obvious error surfaces.

The default AsyncAdaptedQueuePool (pool_size=5, max_overflow=10,
pool_timeout=30s) is also too small for production deployments and
provides no recycle behavior to survive managed-PG idle timeouts.

What

Expose the standard SQLAlchemy pool knobs via env vars, applied to
non-SQLite engines only:

Env var Default Purpose
DB_POOL_SIZE 10 Persistent pool size
DB_MAX_OVERFLOW 20 Burst capacity
DB_POOL_PRE_PING true Detect broken connections on checkout (asyncpg critical)
DB_POOL_TIMEOUT 10 Fail-fast under load (SQLA default 30s "hangs")

SQLite path is unchanged.

Test

  • Unit: settings load with defaults, env-var override verified
  • Manual: previously-reproducible asyncpg hang under concurrent SSE
    agent execution no longer occurs after enabling pool_pre_ping

pushiwuhua7 and others added 3 commits June 4, 2026 18:45
Without pool_pre_ping=True, asyncpg leaves cancelled-task connections
in a broken state inside the pool, causing the next checkout to hang
indefinitely under concurrent load. Expose the standard SQLAlchemy
pool knobs (size / max_overflow / recycle / pre_ping / timeout) via
env vars with sensible defaults, applied to non-SQLite engines only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Backs the unit-test claim in the PR description:
- Settings defaults and env-var override (int + bool coercion)
- _get_engine omits pool kwargs for SQLite, applies all five for
  MySQL / PostgreSQL

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ed connections

pool_pre_ping issues SELECT 1 on checkout, but on an asyncpg connection
wedged by task cancellation that ping can itself hang forever without a
per-statement timeout. Add DB_COMMAND_TIMEOUT (default 30s, 0 disables),
applied via connect_args for asyncpg only — SQLite and MySQL are untouched.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@shirshanka shirshanka left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution! Made a few small tweaks - before approving.

@shirshanka shirshanka merged commit 187d0e2 into datahub-project:main Jun 21, 2026
7 checks passed

@shirshanka shirshanka left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving. Core change is correct and well-scoped — pydantic coerces the int/bool pool env vars properly, the SQLite path is preserved unchanged, and AsyncAdaptedQueuePool (default for async MySQL/PostgreSQL) accepts all five kwargs.

Two follow-up commits were pushed to this branch (maintainer edits):

  • Unit tests for the pool settings + _get_engine wiring (SQLite omits pool kwargs; MySQL/Postgres apply them).
  • DB_COMMAND_TIMEOUT (asyncpg-only, default 30s, 0 disables) applied via connect_args, so pool_pre_ping's SELECT 1 on a connection wedged by task cancellation fails fast instead of hanging forever — making the hang fix robust rather than probabilistic.

ruff check/format + mypy clean; 17 unit tests pass.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants