Python implementation of the CCASS and Quotes data ingestion system, replacing the VB.NET scrapers.
This project contains:
- shared/ - Common utilities (config, database, logging, calendar)
- quotes_ingest/ - Quotes data ingestion from HKEX DQS
- ccass_ingest/ - CCASS holdings data ingestion from HKEX SDW
- ccass_api/ - FastAPI public API for CCASS data
- ccass-web/ - Next.js frontend (TypeScript)
- tools/ - Utility scripts (parity checks, schema export)
- tests/ - Test suites
# Create virtual environment
python3.11 -m venv venv
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# For development
pip install -e ".[dev]"

Copy .env.example to .env and configure:
cp .env.example .env
# Edit .env with your database credentials and settings

# Run daily quotes update
quotes-ingest
# With specific date range
quotes-ingest --from-date 2024-01-01 --to-date 2024-01-31
# Dry run (no database writes)
quotes-ingest --dry-run

# Run daily CCASS update
ccass-ingest
# With specific date range
ccass-ingest --from-date 2024-01-01 --to-date 2024-01-31
# Resume from specific stock code
ccass-ingest --resume-at 00001
# Rebuild history for specific issue
ccass-ingest --rebuild-hist 12345 --from-date 2024-01-01
# Dry run (no database writes)
ccass-ingest --dry-run

# Run API server
ccass-api
# With custom settings
uvicorn ccass_api.main:app --host 0.0.0.0 --port 8000 --workers 4

| Endpoint | Description |
|---|---|
| GET /api/v1/bigchanges | Big holding changes |
| GET /api/v1/issues/{id}/holdings | Holdings for an issue |
| GET /api/v1/participants | Participant list |
| GET /api/v1/participants/{id}/holdings | Holdings by participant |
| GET /api/v1/history | Holding history |
| GET /api/v1/concentration | Concentration analysis |
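
As an illustration of how the endpoints in the table might be addressed from a client, here is a minimal sketch that builds request URLs. It assumes the API is reachable at `http://localhost:8000` (the port from the uvicorn example); the helper names and the `date` query parameter are hypothetical and not taken from the API itself.

```python
from urllib.parse import quote, urlencode

# Assumption: the API runs locally on port 8000, as in the uvicorn example.
BASE_URL = "http://localhost:8000"


def endpoint_url(path: str, **params: object) -> str:
    """Build a full URL for an /api/v1 path, with optional query parameters."""
    # Drop parameters that were left as None so they do not appear in the URL.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{BASE_URL}/api/v1/{path.lstrip('/')}"
    return f"{url}?{query}" if query else url


def issue_holdings_url(issue_id: int, **params: object) -> str:
    """URL for GET /api/v1/issues/{id}/holdings."""
    return endpoint_url(f"issues/{quote(str(issue_id), safe='')}/holdings", **params)


def participant_holdings_url(participant_id: str, **params: object) -> str:
    """URL for GET /api/v1/participants/{id}/holdings."""
    return endpoint_url(
        f"participants/{quote(str(participant_id), safe='')}/holdings", **params
    )


if __name__ == "__main__":
    print(endpoint_url("bigchanges"))
    print(issue_holdings_url(12345))
    # "date" is a hypothetical query parameter, shown only to illustrate encoding.
    print(issue_holdings_url(12345, date="2024-01-31"))
```

The resulting strings can be passed to any HTTP client, for example `httpx.get(url)`. Check the running server's generated FastAPI docs for the query parameters each endpoint actually accepts.
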
# Run all tests
pytest
# With coverage
pytest --cov=.
# Specific module
pytest tests/test_ccass/

Compare Python implementation output with VB.NET:
# CCASS parity check
python -m tools.ccass_parity_check --days 20 --output /tmp/ccass-parity.json
# Quotes parity check
python -m tools.quotes_parity_check --days 30 --output /tmp/quotes-parity.json

External Sources            Python Ingestion           Database
────────────────            ────────────────           ────────
HKEX DQS (HTML)   ────────▶ quotes_ingest/   ────────▶ ccass.quotes
HKEX SDW (HTML)   ────────▶ ccass_ingest/    ────────▶ ccass.holdings
HKEX Parts (JSON)                            ────────▶ ccass.participants
Yahoo Finance                                ────────▶ enigma.forexrates
                                                            │
                                                            ▼
                                                       ccass_api/
                                                            │
                                                            ▼
                                                       Public API
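
The source-to-database flow in the diagram, combined with the `--from-date`, `--to-date` and `--dry-run` flags shown earlier, can be sketched roughly as follows. This is not the project's actual code: `run_ingest`, `InMemorySink` and the fake fetch/parse functions are hypothetical, and the sketch walks every calendar day, whereas the real ingesters presumably consult the shared calendar utilities to skip non-trading days.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Callable, Iterator


def date_range(from_date: date, to_date: date) -> Iterator[date]:
    """Yield each calendar day from from_date to to_date, inclusive."""
    current = from_date
    while current <= to_date:
        yield current
        current += timedelta(days=1)


@dataclass
class InMemorySink:
    """Hypothetical stand-in for a database table such as ccass.quotes."""

    rows: list[dict] = field(default_factory=list)

    def write(self, rows: list[dict]) -> None:
        self.rows.extend(rows)


def run_ingest(
    from_date: date,
    to_date: date,
    fetch: Callable[[date], str],
    parse: Callable[[str], list[dict]],
    sink: InMemorySink,
    dry_run: bool = False,
) -> int:
    """Fetch and parse one day at a time; write unless dry_run. Returns rows parsed."""
    total = 0
    for day in date_range(from_date, to_date):
        rows = parse(fetch(day))
        total += len(rows)
        if not dry_run:  # a dry run parses everything but performs no writes
            sink.write(rows)
    return total


def fake_fetch(day: date) -> str:
    """Pretend to download one day's page; returns a single CSV-like record."""
    return f"{day.isoformat()},00001,100"


def fake_parse(raw: str) -> list[dict]:
    """Turn the fake record into one row dictionary."""
    return [dict(zip(("date", "code", "value"), raw.split(",")))]


if __name__ == "__main__":
    sink = InMemorySink()
    parsed = run_ingest(date(2024, 1, 1), date(2024, 1, 3), fake_fetch, fake_parse, sink)
    print(parsed, len(sink.rows))
```

Passing `fetch`, `parse` and the sink as arguments keeps the loop testable without network or database access, which is also what makes a dry-run mode cheap to support.
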
See pyproject.toml for the full list. Key dependencies:
- httpx - Async HTTP client
- lxml - HTML parsing
- sqlalchemy - Database ORM
- fastapi - Web framework
- structlog - Structured logging
- tenacity - Retry logic
- typer - CLI framework
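
To illustrate what the retry logic listed above is for, here is a small standard-library sketch of retrying a flaky call with exponential backoff. It is a stand-in written for this README, not tenacity's API and not the project's code; `retry_call` and its parameters are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_call(
    func: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: tuple[type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on the given exceptions with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error to the caller
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky() -> str:
        """Fail twice with a transient error, then succeed."""
        state["calls"] += 1
        if state["calls"] < 3:
            raise ConnectionError("temporary failure")
        return "ok"

    delays: list[float] = []
    # Record the delays instead of sleeping so the demo finishes instantly.
    print(retry_call(flaky, sleep=delays.append), delays)
```

In the project itself, tenacity provides this behaviour through its own decorators and wait/stop strategies, so a helper like this would not be needed.
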