refactor(api): migrate to httpx/Hishel, simplify modules, and extend client config #7

Merged
AN0DA merged 27 commits into main from api-layer-cleanup on Mar 28, 2026

AN0DA commented on Mar 22, 2026

Owner

Summary

This branch modernizes the pyBDL API layer and public client: HTTP traffic moves from requests and requests-cache to httpx with optional Hishel caching, configuration grows structured retry, cache-backend, and rate-limit options, and API modules are flattened behind shared helpers and clearer return typing. The top-level client gains context-manager usage and typed namespaces; the access layer adds a DataFrame enrichment decorator and tighter typing on data paths. Documentation (README, Sphinx, examples notebook), root changelog, and tests are updated throughout, including replacing responses with respx and adding cache lifecycle and enrichment coverage. Release automation and versioning were adjusted (workflow triggers, 0.1.0, .worktrees in .gitignore).

Purpose & Context

The goal is a maintainable HTTP stack that aligns sync and async usage, gives callers explicit control over caching and backoff (including HTTP 429), and keeps rate limiting accurate when work is cancelled or retried—hence acquire returning a timestamp and a release path for quota refund. API modules had duplicated parameter wiring; consolidating that reduces drift and makes endpoint modules easier to review. The release workflow change removes push and pull_request triggers so releases follow the intended manual or tag-driven process.

Changes Made

  • Client & HTTP: Migrate to httpx + Hishel; BDLConfig supports cache backend selection, retry settings, and rate-limit-related options; BaseAPIClient uses per-instance limiters and raises BDLHTTPError / BDLResponseError where appropriate.
  • API surface: Large refactors in pybdl/api/ (notably data, subjects, units, and shared patterns across aggregates, attributes, levels, measures, variables, years, version) with type aliases and casts for data API returns.
  • Rate limiting & quota cache: Acquire returns a timestamp; sync/async limiters support release; quota cache uses resolve_cache_file_path and remove_last_if_matches; shared resolve_cache_file_path in pybdl/utils/cache.py.
  • Access layer: New enrichment module and decorator; updates to base, data, aggregates, units, variables.
  • Dependencies: platformdirs, optional viz extra; lockfile refresh; dev dependency swap to respx.
  • Docs & repo hygiene: Expanded Sphinx sections, notebook and README updates, CHANGELOG.md; .gitignore adds .worktrees.
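
As a rough illustration of how the retry settings could combine, the sketch below assumes plain exponential backoff capped at a maximum delay. The function names `retry_delay` and `should_retry` are hypothetical, and the formula is an assumption rather than a description of what `BaseAPIClient` does. The parameter names mirror the config fields named in the commits (`request_retries`, `retry_backoff_factor`, `max_retry_delay`, `retry_status_codes`).

```python
def retry_delay(attempt: int, retry_backoff_factor: float, max_retry_delay: float) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    Assumed formula: factor * 2**attempt, capped at max_retry_delay.
    """
    return min(retry_backoff_factor * (2 ** attempt), max_retry_delay)


def should_retry(status_code: int, attempt: int, request_retries: int,
                 retry_status_codes: set[int]) -> bool:
    """Retry only while attempts remain and the status code is retryable."""
    return attempt < request_retries and status_code in retry_status_codes


# Example: the delays a caller would see with factor 0.5 and a 10 s cap.
delays = [retry_delay(n, 0.5, 10.0) for n in range(6)]
```

A separate 429 policy (`http_429_max_retries`, `http_429_max_delay`) could reuse the same helpers with different arguments.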

Testing

This repository typically runs make all (Ruff format/check, Bandit, Mypy, pytest with coverage) and make docs for a strict Sphinx HTML build. This description was generated from git history only; the full suite was not re-run in that session. Reviewers should rely on CI and, locally, uv run pytest or make test plus make docs if validating before merge.

Review Focus Areas

  • Correctness of Hishel cache configuration and cache file resolution across platforms.
  • Semantics of rate-limit acquire/release and interaction with retries and 429 handling.
  • Any downstream code that assumed requests-specific types or prior config field names.

Dependencies & Side Effects

  • Runtime: httpx, hishel (async extra), platformdirs; visualization stack is optional under viz.
  • Dev/tests: respx replaces responses for HTTP mocking.

Deployment Notes

  • Confirm .github/workflows/release.yml behavior after removing push and pull_request triggers matches the team’s release policy.

AN0DA added 22 commits March 22, 2026 13:57
- Add request_retries, retry_backoff_factor, max_retry_delay, retry_status_codes
- Refactor __init__ with explicit env var handling
- Add quota_cache_file and use_global_cache support
- Add tests for explicit override and retry env vars
- BDLHTTPError for HTTP failures with status_code, response_body, url
- BDLResponseError for invalid/unexpected payloads
- Used by BaseAPIClient for structured error handling
- Use per-instance rate limiters instead of global shared limiters
- Extract _build_proxy_url and _build_default_headers helpers
- Add httpx.AsyncClient for async requests with proxy support
- Add close() and aclose() for resource cleanup
- Support quota_cache_file and use_global_cache via PersistentQuotaCache
- Extract _list_params, _search_params static helpers where applicable
- Trim docstrings for brevity
- Reduce duplication across aggregates, attributes, levels, measures,
  subjects, units, variables, years, version
- Consolidate parameter building and request logic
- Reduce code duplication in sync/async paths
- Align with other API module patterns
- Add enrichment.py with with_enrichment decorator and EnrichmentSpec
- Support levels, measures, subjects, units, attributes, aggregates lookups
- Add _enrichment_cache to BaseAccess for deduplication
- Refactor BaseAccess._get_calling_function_name (use sys._getframe)
- Apply enrichment to aggregates, units, variables, data access methods
- Add variable_ids param support and get_data_by_variable_with_metadata
- Replace SimpleNamespace with typed APINamespace dataclass
- Add close() and aclose() for resource cleanup
- Add __enter__/__exit__ and __aenter__/__aexit__ for sync/async context managers
- Move matplotlib, numpy, seaborn to optional [viz] extra
- Add platformdirs for quota cache paths
- Remove the dataclasses backport dependency (part of the stdlib since Python 3.7)
- Add installation, quick start, configuration, API layers sections
- Document optional [viz] extra and env vars
- Update examples.ipynb for enrichment and context manager usage
- Add test_enrichment.py for with_enrichment decorator
- Add test_enrichment_integration.py for access layer
- Add test_enrichment_e2e.py for live API
- Add enrich_levels tests to aggregates, data, units, variables integration
- Adapt test_api_client for per-instance limiters, close methods
- Adapt test_api_data for simplified data API
- Add test_client tests for context manager and aclose
- Support custom_file to use explicit path instead of default cache dir
- Create parent dirs when custom_file is used
- Used by PersistentQuotaCache for flexible cache location
- cache_backend: 'memory' | 'file' | None (replaces use_cache boolean)
- raise_on_rate_limit: raise RateLimitError vs wait when quota exhausted
- http_429_max_retries, http_429_max_delay: separate 429 retry policy
- Add QuotaMap type alias for quota configuration
…tches

- Use pybdl.utils.cache.resolve_cache_file_path for cache location
- Add remove_last_if_matches for atomic slot refund (used by rate limiter release)
…refund

- acquire() now returns float | None (monotonic time) for later release
- Add release(recorded_at) to refund a slot (HTTP errors, retries)
- Add raise_on_limit param passed from config
- Empty quotas: acquire returns None immediately
…ith respx

- hishel[async] for HTTP caching (memory/file sqlite backends)
- respx for httpx-compatible request mocking in tests
- Remove requests, requests-cache, responses, types-requests
- Add real_rate_limiting pytest marker
…mit release

- Replace requests/requests-cache with hishel SyncCacheClient/AsyncCacheClient
- Support cache_backend: memory (sqlite :memory:), file (sqlite db), or disabled
- Pass raise_on_limit to rate limiters from config
- Call rate limiter release() on HTTP errors for quota refund
- Parse Retry-After header (seconds or HTTP-date) for 429/5xx retries
- Use httpx exclusively (remove Response | httpx.Response unions)
- Add _DataJsonPayload, _DataWithMetadata, _DataCollectionResult
- Use cast for get_*_with_metadata return type narrowing
- Simplify variable_ids resolution (inline ternary)
- Simplify _normalize_variable_ids ternary in data.py
- Add return type annotations to with_enrichment decorator
- Reorder imports in enrichment.py
- Add CHANGELOG.md (Keep a Changelog format)
- Add docs/changelog.md for Sphinx changelog page
- Expand access_layer, config, main_client, rate_limiting
- Update index and appendix
- Add changelog to conf.py
- Update examples.ipynb for new features
- README: document cache_backend, rate limit env vars
- Replace responses with respx in conftest (paginated_mock uses MockRouter)
- Add real_rate_limiting marker skip in api conftest
- Add test_api_client_cache.py for hishel cache backend
- Add test_rate_limiter_lifecycle.py for acquire/release
- Update API tests for hishel/httpx
- Update config tests for cache_backend, raise_on_rate_limit, HTTP 429
- Update client, enrichment, data access tests
- Add samples_raw_subjects.json entries
@github-actions

Contributor

Test Results (Python 3.11)

538 tests (+112): 527 passed ✅ (+110), 11 skipped 💤 (+2), 0 failed ❌ (±0)
1 suite (±0), 1 file (±0), 7s ⏱️ (+2s)

Results for commit 7982371. ± Comparison against base commit c804b19.

@github-actions

Contributor

Test Results (Python 3.13)

538 tests (+112): 527 passed ✅ (+110), 11 skipped 💤 (+2), 0 failed ❌ (±0)
1 suite (±0), 1 file (±0), 7s ⏱️ (+2s)

Results for commit 7982371. ± Comparison against base commit c804b19.

@github-actions

Contributor

Test Results (Python 3.12)

538 tests (+112): 527 passed ✅ (+110), 11 skipped 💤 (+2), 0 failed ❌ (±0)
1 suite (±0), 1 file (±0), 7s ⏱️ (+2s)

Results for commit 7982371. ± Comparison against base commit c804b19.

AN0DA added 3 commits March 28, 2026 21:14
…rkflow

- Simplified the release workflow by removing the push and pull_request triggers, retaining only the workflow_dispatch event.
- Adjusted the example formatting in the BaseAccess class to use a block-style example for clarity.
- Bump project version in pyproject.toml from 0.0.1 to 0.1.0.
- Include .worktrees directory in .gitignore to prevent tracking of worktree files.
AN0DA changed the title from "Api layer cleanup" to "refactor(api): migrate to httpx/Hishel, simplify modules, and extend client config" on Mar 28, 2026
@github-actions

Contributor

Test Results

538 tests: 519 passed ✅, 11 skipped 💤, 8 failed ❌
1 suite, 1 file, 7s ⏱️

For more details on these failures, see this check.

Results for commit 78259d7.

@github-actions

github-actions Bot commented Mar 28, 2026

Contributor

Test Results

538 tests: 527 passed ✅, 11 skipped 💤, 0 failed ❌
1 suite, 1 file, 7s ⏱️

Results for commit 654b0df.

♻️ This comment has been updated with latest results.

AN0DA added 2 commits March 28, 2026 21:25
- Streamlined the type conversion process for DataFrame columns in the BaseAccess class.
- Enhanced checks for object and string dtypes before attempting conversion to numeric or boolean types.
- Improved handling of non-null values to ensure accurate type casting.
- Refactored the numeric conversion checks for DataFrame columns to improve readability and efficiency.
- Consolidated conditional statements for dtype checks and non-null value handling.
- Ensured accurate type casting for integer-like values in the DataFrame.
AN0DA merged commit 74b2942 into main on Mar 28, 2026
11 checks passed
AN0DA deleted the api-layer-cleanup branch on March 28, 2026 20:28