Runbook: Financial Report Processing API

Operational guidance for running, monitoring, and troubleshooting the service.

Pre-flight

  • Validate .env contains at least:
    • LAST_MODIFIED
    • Optionally GEMINI_API_KEY and other tuning vars
  • Ensure required files/dirs:
    • config/values.json
    • reports/, preprocessing/, processed/
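
The pre-flight items above can be checked with a small script before starting the service. This is a minimal sketch, not part of the service: the function names are illustrative, and the .env parsing assumes plain KEY=VALUE lines (no `export` prefix or quoting rules).

```python
import json
from pathlib import Path

REQUIRED_ENV_KEYS = ["LAST_MODIFIED"]
REQUIRED_DIRS = ["reports", "preprocessing", "processed"]


def read_env_keys(env_path: Path) -> set:
    """Collect variable names from simple KEY=VALUE lines, skipping comments."""
    keys = set()
    for line in env_path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        keys.add(line.split("=", 1)[0].strip())
    return keys


def preflight(root: Path) -> list:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []

    env_path = root / ".env"
    if not env_path.is_file():
        problems.append("missing .env")
    else:
        present = read_env_keys(env_path)
        for key in REQUIRED_ENV_KEYS:
            if key not in present:
                problems.append(f".env is missing {key}")

    values_path = root / "config" / "values.json"
    try:
        json.loads(values_path.read_text())
    except OSError:
        problems.append("cannot read config/values.json")
    except json.JSONDecodeError as exc:
        problems.append(f"config/values.json is not valid JSON: {exc}")

    for name in REQUIRED_DIRS:
        if not (root / name).is_dir():
            problems.append(f"missing directory {name}/")

    return problems


if __name__ == "__main__":
    for problem in preflight(Path(".")):
        print("PRE-FLIGHT:", problem)
```

Run it from the project root; no output means every listed check passed.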

Start/Stop

  • Local/dev start:
    make run
    # uvicorn app.main:app --host 0.0.0.0 --port 8000
  • Health: there is no dedicated health endpoint; the service exposes a single endpoint, GET /process.

Operating procedures

  • Process current batch:
    • GET /process
  • Force reprocess ignoring cache:
    • GET /process?force=true
  • Trigger reprocess via timestamp change:
    • Update .env LAST_MODIFIED and call /process
  • Adjust throughput vs. cost/limits:
    • Lower PIPELINE_MAX_WORKERS and GEMINI_CONCURRENCY to reduce API pressure
    • Switch GEMINI_MODEL (e.g., gemini-1.5-flash) for cheaper/faster runs
  • Handle quota/rate limits:
    • Set GEMINI_FALLBACK_ON_QUOTA=true to stub quota-hit files and continue batch
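
The two /process calls above can be scripted with the standard library. This is a sketch under stated assumptions: the base URL matches the uvicorn command in Start/Stop, the 600-second timeout is an arbitrary placeholder, and `build_process_url` and `trigger_processing` are illustrative names rather than project code.

```python
import urllib.error
import urllib.parse
import urllib.request
from typing import Optional, Tuple

BASE_URL = "http://localhost:8000"  # assumption: local dev port from the uvicorn command


def build_process_url(base_url: str = BASE_URL, force: bool = False) -> str:
    """Build the GET /process URL, adding force=true to bypass the cache."""
    url = base_url.rstrip("/") + "/process"
    if force:
        url += "?" + urllib.parse.urlencode({"force": "true"})
    return url


def trigger_processing(
    force: bool = False,
    request_id: Optional[str] = None,
    timeout: float = 600.0,  # assumption: batches may run for minutes
) -> Tuple[int, str]:
    """Call GET /process and return (status_code, body_text)."""
    headers = {"X-Request-Id": request_id} if request_id else {}
    request = urllib.request.Request(build_process_url(force=force), headers=headers)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as exc:
        # Non-2xx responses (for example a 500) raise HTTPError; keep the body for diagnosis.
        return exc.code, exc.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    status, body = trigger_processing(force=True, request_id="manual-run-1")
    print(status, body[:200])
```

Passing a request ID of your own makes it easier to find the matching log lines afterwards.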

Artifacts

  • Per-file outputs: processed/[report].json
  • Consolidated: processed/result.json
  • Preprocessed appendix PDFs: preprocessing/[report].pdf

Logs and Metrics

  • Structured JSON logs via structlog, correlated by X-Request-Id. Clients may supply X-Request-Id; otherwise the service generates one per request.
  • Metrics (logged):
    • Counters: reports.total, reports.processed, reports.failed
    • Timers: pipeline.total, pipeline.per_file, pipeline.appendix_detection, pipeline.pdf_extract, pipeline.fields_extract, pipeline.stub_task
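
Because the logs are one JSON object per line, a single request can be traced by filtering on its request ID. The sketch below assumes the ID is stored under a `request_id` key; the actual key depends on how structlog is configured in this service, so adjust `REQUEST_ID_KEY` to match the real log output.

```python
import json
from typing import Iterable, Iterator

REQUEST_ID_KEY = "request_id"  # assumption: check the real log output for the actual key


def events_for_request(lines: Iterable[str], request_id: str) -> Iterator[dict]:
    """Yield parsed JSON log events whose request-ID field equals request_id."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON, such as server startup messages
        if isinstance(event, dict) and event.get(REQUEST_ID_KEY) == request_id:
            yield event


if __name__ == "__main__":
    import sys

    # Usage: python trace_request.py <request-id> < service.log
    for event in events_for_request(sys.stdin, sys.argv[1]):
        print(json.dumps(event))
```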

Failure modes (500)

  • File system errors: missing values.json, permission issues, disk full
  • AI processing errors: Gemini SDK unavailable, quota/rate limits (unless GEMINI_FALLBACK_ON_QUOTA is enabled), timeouts, invalid responses
  • Data processing errors: JSON parsing/invalid config

Remediation checklist:

  • Confirm config/values.json is present and valid JSON
  • Confirm .env has required keys; restart with updated env
  • Use /process?force=true after .env changes
  • Reduce concurrency and enable fallback when hitting 429s
  • Inspect processed/ for partially generated outputs
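
For the last checklist item, one way to spot partially generated outputs is to look for JSON files in processed/ that fail to parse. This sketch assumes an interrupted write leaves an empty or truncated file, which this runbook does not confirm; `find_unreadable_outputs` is an illustrative helper, not project code.

```python
import json
from pathlib import Path


def find_unreadable_outputs(processed_dir: Path) -> list:
    """Return the .json files in processed_dir that are empty or not valid JSON."""
    unreadable = []
    for path in sorted(processed_dir.glob("*.json")):
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError):
            unreadable.append(path)
    return unreadable


if __name__ == "__main__":
    for path in find_unreadable_outputs(Path("processed")):
        print("unreadable output:", path)
```

Files it reports are candidates for deletion followed by a call to /process?force=true.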

Manual preprocessing note (2-page PDFs)

  • When a report is exactly 2 pages and an appendix preprocessing issue is detected, the system logs a manual preprocessing resolution event. Maintain a JSONL log in troubleshooting/ per the template.
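
Appending to a JSONL log means writing one JSON object per line. The sketch below shows only the mechanics: the file name and the fields (`timestamp`, `report`, `resolution`) are placeholders, and the template in troubleshooting/ remains the authority on what each record must contain.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_resolution(log_path: Path, report: str, resolution: str) -> dict:
    """Append one record as a single JSON line and return the record written."""
    record = {
        # Placeholder fields; replace with the fields the template requires.
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "report": report,
        "resolution": resolution,
    }
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
    return record


if __name__ == "__main__":
    append_resolution(
        Path("troubleshooting") / "manual_preprocessing.jsonl",  # hypothetical file name
        report="example_report.pdf",
        resolution="appendix page identified manually",
    )
```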

Testing

  • Run test suite:
    make test

Deployment tips

  • Run behind a reverse proxy (optional)
  • Persist reports/, preprocessing/, and processed/ volumes
  • Secure .env and Gemini credentials