ShrishDhuria/macro_regime

Macro Regime & Cross-Asset Risk Intelligence Platform

A regime-aware macro-financial research framework for European markets. Built end-to-end in Python across data infrastructure, regime detection, risk analysis, tactical allocation backtesting, stress testing, and an interactive dashboard.


Overview

This platform detects macroeconomic regimes across European cross-asset markets, computes regime-conditional risk metrics on a multi-asset portfolio, and tests tactical allocation strategies under realistic walk-forward backtest conditions. It is structured around six phases — data spine, feature engineering, regime detection, risk engine, tactical allocation, stress testing + deliverables — each producing concrete artifacts.

The project's defining characteristic is its commitment to walk-forward methodology, lookahead-bias enforcement, and honest reporting of both positive and negative findings. The backtest results contain a deliberately documented negative finding: at weekly horizon, regime-based and ML-based tactical timing strategies underperform equal-weight, while a risk-parity baseline reduces volatility by ~30% with similar Sharpe. The institutional payoff of the regime detection layer is in risk monitoring, not return timing.


Key findings

Headline results — Sharpe by strategy and regime-conditional Expected Shortfall

Left: walk-forward Sharpe by strategy (2011–2026, net of 5bp costs) — every tactical-timing strategy lands below the equal-weight benchmark (dashed line). Right: 99% historical Expected Shortfall by regime — the institutionally useful result. Re-run python scripts/build_strategies.py for the full equity-curve and drawdown plots, or python make_figures.py to regenerate this summary.

Regime detection works. A 3-state Gaussian Hidden Markov Model on five macro features flags each major European macro stress episode of 2008-2026 (GFC, sovereign crisis, 2015 Bund tantrum, Q1 2016 China/oil, COVID, 2022 inflation/Russia) as well as the 2007-08 run-up to Lehman. A cross-tab against a 2-state baseline shows the third state earns its keep by surfacing transition periods (wider IT-DE spreads, calm equity vol) that pure equity-vol classifiers miss.

Regime-conditional risk metrics work. Historical Expected Shortfall at 99% confidence is 6.9% in calm regimes vs 17.5% in crisis — a 2.5x scaling that an unconditional VaR would mis-price. This is the institutionally credible application of the regime layer.
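
As a minimal sketch of how a regime-conditional historical Expected Shortfall can be computed, the snippet below averages the worst (1 - alpha) share of losses within each regime. The function names and the tail-size convention (round up, at least one observation) are assumptions for illustration and are not taken from risk/expected_shortfall.py.

```python
from collections import defaultdict
from math import ceil


def historical_es(returns, alpha=0.99):
    """Mean of the worst (1 - alpha) share of losses, reported as a positive number."""
    losses = sorted((-r for r in returns), reverse=True)  # largest loss first
    # round() guards against float noise such as 100 * (1 - 0.99) = 1.0000000000000009
    n_tail = max(1, ceil(round(len(losses) * (1 - alpha), 9)))
    tail = losses[:n_tail]
    return sum(tail) / len(tail)


def es_by_regime(returns, regimes, alpha=0.99):
    """Group returns by regime label, then compute historical ES per group."""
    buckets = defaultdict(list)
    for ret, regime in zip(returns, regimes):
        buckets[regime].append(ret)
    return {regime: historical_es(rets, alpha) for regime, rets in buckets.items()}
```

Calling es_by_regime(portfolio_returns, viterbi_labels) returns one ES figure per regime label, which is the shape of the calm-versus-crisis comparison quoted above.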

99% Cornish-Fisher VaR is 2x the parametric Gaussian VaR. With unconditional portfolio skew of -0.80 and excess kurtosis of 4.4, a Gaussian VaR understates 99% tail risk by about half.
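
The adjustment can be sketched with the standard Cornish-Fisher expansion, which shifts the Gaussian quantile using skew and excess kurtosis. Whether risk/var.py uses exactly this form is an assumption; the function names and the sign convention (VaR as a positive loss) are illustrative.

```python
from statistics import NormalDist


def cornish_fisher_z(alpha, skew, excess_kurt):
    """Left-tail quantile multiplier adjusted for skew and excess kurtosis."""
    z = NormalDist().inv_cdf(1 - alpha)  # Gaussian left-tail quantile (negative)
    return (
        z
        + (z**2 - 1) * skew / 6
        + (z**3 - 3 * z) * excess_kurt / 24
        - (2 * z**3 - 5 * z) * skew**2 / 36
    )


def parametric_var(mu, sigma, alpha=0.99, skew=0.0, excess_kurt=0.0):
    """VaR as a positive loss; with zero skew and kurtosis this is the Gaussian VaR."""
    return -(mu + cornish_fisher_z(alpha, skew, excess_kurt) * sigma)
```

With skew and excess kurtosis set to zero the adjusted quantile equals the Gaussian one, which is the same collapse property the test suite checks. The size of the gap between the two VaR figures depends on the sample mean, volatility, and moments.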

Tactical allocation underperformed equal-weight, and we explain why. Six strategies were walk-forward backtested with 5bp transaction costs and ESTR-financed leverage, scored on a common 2011-2026 window:

| Strategy | Sharpe | Max DD | Ann. turnover |
|---|---|---|---|
| Equal Weight | 0.29 | -44.7% | 0% |
| ERC (risk parity) | 0.26 | -36.7% | 16% |
| VT-ERC | 0.21 | -40.2% | 24% |
| Regime-Tilt | 0.17 | -43.0% | 30% |
| Vol-Forecast (LightGBM) | 0.21 | -40.2% | 37% |
| DD-Predict (LightGBM classifier) | 0.16 | -40.2% | 206% |

(Figures above are the original full-period run. The backtest now (a) classifies regimes out-of-sample with an annually-refit walk-forward HMM, (b) charges the spliced ESTR rate on leverage instead of assuming it free, and (c) scores all six strategies on the common post-2011 window so the timing strategies are not credited with the pre-signal VT-ERC path through the 2008 GFC. These shift the levels but not the ordering or the conclusion; re-run build_strategies.py to regenerate.)
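
For readers who want to reproduce the table's columns from a weekly return series, here is a minimal sketch of the two headline metrics. It assumes 52 periods per year and that the Sharpe input is already an excess return over the overnight rate; neither convention is confirmed by backtest/metrics.py.

```python
from math import sqrt
from statistics import mean, stdev

WEEKS_PER_YEAR = 52  # assumed annualization factor for weekly data


def annualized_sharpe(weekly_excess_returns):
    """Mean over sample standard deviation, scaled to annual terms."""
    return mean(weekly_excess_returns) / stdev(weekly_excess_returns) * sqrt(WEEKS_PER_YEAR)


def max_drawdown(weekly_returns):
    """Largest peak-to-trough fall of the compounded equity curve (a negative number)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in weekly_returns:
        equity *= 1 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1)
    return worst
```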

The regime-tilted strategy failed because the HMM crisis state captures both crash and recovery weeks; crisis-regime annualized return is +20.8% for EW, so cutting equity in crisis sells low and buys high. The drawdown classifier achieved AUC 0.51 — essentially no predictive signal at weekly horizon. Conclusion: regime detection is a risk-monitoring tool, not a return-timing signal.


Architecture

macro_regime/
├── config/                    # ticker registry
├── data/
│   ├── ingestion/             # yfinance, FRED, ECB fetchers
│   ├── alignment.py           # weekly Friday-close harmonization
│   └── storage.py             # parquet I/O
├── features/                  # Phase 2 — 63-feature library
│   ├── returns.py
│   ├── volatility.py
│   ├── spreads.py
│   ├── correlations.py
│   ├── momentum.py
│   ├── macro_lag.py           # publication-lag enforcement
│   ├── freshness.py           # staleness check
│   └── short_rate.py          # EONIA/ESTR splice
├── regimes/                   # Phase 3 — HMM core
│   ├── data_prep.py
│   ├── hmm_model.py           # multi-seed Gaussian HMM (full-sample)
│   ├── walk_forward_hmm.py    # annually-refit, out-of-sample HMM for tactical use
│   └── diagnostics.py
├── risk/                      # Phase 4 — risk engine
│   ├── portfolio.py
│   ├── var.py                 # historical, parametric, Cornish-Fisher
│   ├── expected_shortfall.py
│   ├── drawdown.py
│   ├── beta.py
│   ├── regime_conditional.py
│   └── excel_export.py        # parallel openpyxl workbook
├── forecast/                  # Phase 5 / 5b — LightGBM models
│   ├── vol_model.py           # walk-forward vol forecast
│   └── drawdown_model.py      # walk-forward classifier
├── portfolio/                 # Phase 5 — allocation rules
│   ├── risk_parity.py         # ERC solver + vol targeting
│   └── weights.py             # tilting functions
├── backtest/                  # Phase 5 — walk-forward engine
│   ├── engine.py
│   └── metrics.py
├── stress/                    # Phase 6 — stress engine
│   ├── scenarios.py
│   └── transmission.py        # empirical-beta-based
├── dashboard/
│   └── app.py                 # Streamlit 4-tab dashboard
├── scripts/                   # orchestrators (one per phase)
│   ├── build_panel.py
│   ├── build_features.py
│   ├── build_regimes.py
│   ├── build_risk.py
│   ├── build_strategies.py
│   ├── build_stress.py
│   └── build_deck.py
├── reports/                   # generated artifacts (PNG, XLSX, PPTX)
├── data_store/                # gitignored cache (raw + processed parquet)
├── requirements.txt
├── sources.md                 # data provenance audit trail
└── README.md

Setup

Python 3.12+ recommended (3.10+ should work; the codebase avoids deprecated APIs).

python -m venv .venv
source .venv/bin/activate          # Windows Git Bash: source .venv/Scripts/activate
pip install -r requirements.txt

Running the platform

Each phase has one orchestrator. Run in order; later phases depend on earlier outputs.

py scripts/build_panel.py        # Phase 1 — fetch and align the data spine
py scripts/build_features.py     # Phase 2 — build the 63-feature library
py scripts/build_regimes.py      # Phase 3 — fit 2- and 3-state HMMs with diagnostics
py scripts/build_risk.py         # Phase 4 — risk metrics + Excel workbook
py scripts/build_strategies.py   # Phase 5/5b — six strategies, walk-forward backtest
py scripts/build_stress.py       # Phase 6 — stress testing
py scripts/build_deck.py         # Phase 6 — generate the methodology PowerPoint
streamlit run dashboard/app.py   # Phase 6 — launch the interactive dashboard

Total fresh-install runtime: approximately 5-7 minutes on a typical laptop. (Use python instead of py on macOS/Linux.)


Phase-by-phase outputs

| Phase | Deliverable | Output |
|---|---|---|
| 1 | Data spine | data_store/panel/master_panel.parquet (~1,114 weekly rows, 2005-2026) |
| 2 | Feature library | data_store/panel/master_features.parquet (63 features) |
| 3 | HMM regime detection | regime_overlay_3state.png, regime_probabilities_3state.png, persisted viterbi/probabilities/transitions/emissions |
| 4 | Risk engine | risk_workbook.xlsx (4 sheets with named ranges), regime-conditional metrics |
| 5/5b | Tactical backtest | strategy_comparison.png, strategy_drawdowns.png, drawdown_predictions.png, per-strategy weights + returns |
| 6 | Stress + dashboard + deck | stress_results.parquet, Streamlit dashboard, macro_regime_methodology_deck.pptx (16 slides) |

Methodology highlights

Lookahead-bias enforcement. Macro releases (HICP) are forward-shifted by their publication lag before entering any model. LightGBM training at refit date d uses only rows whose forecast target was observable by d.
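
The publication-lag shift can be sketched as follows: each observation is re-stamped with the date it became public, and the model only ever looks up the latest value released on or before the decision date. The helper names are hypothetical, and the 30-day HICP lag is the approximation quoted in the Limitations section.

```python
from bisect import bisect_right
from datetime import date, timedelta


def release_dated(observations, lag_days):
    """Re-stamp (reference_date, value) pairs with their assumed publication date."""
    return sorted((ref + timedelta(days=lag_days), val) for ref, val in observations)


def as_of(released, when):
    """Latest value published on or before `when`, or None if nothing is out yet."""
    dates = [d for d, _ in released]
    i = bisect_right(dates, when)
    return released[i - 1][1] if i else None


# Illustrative values only, not actual HICP prints.
hicp = [(date(2024, 1, 31), 2.8), (date(2024, 2, 29), 2.6)]
released = release_dated(hicp, lag_days=30)
friday = date(2024, 2, 23)
print(friday, as_of(released, friday))  # January's figure is not yet visible here
```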

Walk-forward. LightGBM models are refit every 13 weeks on an expanding window. The covariance for ERC is recomputed at each weekly rebalance on a trailing 156-week window. The regime HMM used for tactical tilting is refit annually on an expanding window and standardized with training-window statistics only; each week is classified by Viterbi-decoding the expanding prefix, so no future observation informs that week's regime label.
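
An expanding-window refit schedule of this kind can be sketched as a generator of (train, test) index ranges. The 13-week step matches the text; the function name and the minimum training length are hypothetical.

```python
def walk_forward_splits(n_obs, min_train, refit_every):
    """Yield (train, test) index ranges; train always starts at 0 and ends before test."""
    start = min_train
    while start < n_obs:
        stop = min(start + refit_every, n_obs)
        yield range(0, start), range(start, stop)
        start = stop


# min_train=104 is an assumed warm-up length, not a value from the repository.
for train, test in walk_forward_splits(n_obs=150, min_train=104, refit_every=13):
    print(f"fit on weeks 0..{train[-1]}, predict weeks {test[0]}..{test[-1]}")
```

Because every training range ends strictly before its test range begins, a model fit on `train` never sees the weeks it is asked to predict.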

Funding cost. Vol-targeted leverage is financed at the spliced ESTR/EONIA overnight rate rather than assumed free; idle cash earns the same rate.
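
One way to express this funding rule is a single formula that covers both cases: leverage above 1 borrows at the overnight rate, and leverage below 1 leaves cash earning it. The sketch assumes simple weekly accrual of an annualized rate, which may differ from the engine's day-count convention.

```python
def levered_return(asset_return, leverage, overnight_rate_annual, periods_per_year=52):
    """Period return of a position scaled by `leverage` and funded at the overnight rate."""
    cash_return = overnight_rate_annual / periods_per_year  # assumed simple accrual
    # (1 - leverage) is negative when borrowing and positive when holding idle cash.
    return leverage * asset_return + (1 - leverage) * cash_return
```

With leverage of 1.5 the second term is a financing drag; with leverage of 0.5 it is interest earned on the unused half of the capital.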

Short-rate splice. ESTR was first published in October 2019; for earlier dates, EONIA (its predecessor) is spliced in with the 8.5bp adjustment (EONIA = ESTR + 8.5bp) per ECB Recommendation 2019/C 295/02, producing a level-consistent overnight rate back to 2005.
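
A minimal sketch of the splice, assuming both series are held as date-keyed dictionaries of rates in percent. The function and constant names are hypothetical; the spread direction follows the fixed relation EONIA = ESTR + 8.5bp that applied from ESTR's first publication on 2 October 2019.

```python
from datetime import date

ESTR_START = date(2019, 10, 2)  # first ESTR publication date
EONIA_SPREAD_PCT = 0.085        # 8.5bp expressed in percentage points


def spliced_overnight_rate(day, eonia, estr):
    """ESTR where it exists; before that, EONIA lowered by the fixed spread."""
    if day >= ESTR_START:
        return estr[day]
    return eonia[day] - EONIA_SPREAD_PCT
```

Subtracting the spread from the older series, rather than adding it to the newer one, keeps the whole history on the ESTR level so there is no artificial jump at the switch date.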

HMM stability. Best-of-5 random-seed initialization for the 3-state model defends against local optima.
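
The selection logic behind a best-of-N initialization can be sketched independently of the model. Here `toy_fit` is a stand-in, not an HMM: it runs gradient ascent on a two-peaked objective from a seeded start, which mimics how EM can settle in different local optima. Using log-likelihood as the selection criterion is an assumption about hmm_model.py.

```python
import random


def best_of_seeds(fit, seeds):
    """Run fit(seed) -> (model, log_likelihood) per seed and keep the highest score."""
    best = None
    for seed in seeds:
        model, loglik = fit(seed)
        if best is None or loglik > best[2]:
            best = (seed, model, loglik)
    return best


def toy_fit(seed):
    """Stand-in for an EM fit: climb a two-peaked objective from a seeded start."""
    x = random.Random(seed).uniform(-2, 2)
    for _ in range(2000):
        x += 0.01 * (-4 * x * (x * x - 1) + 0.3)  # gradient of the objective below
    return x, -(x * x - 1) ** 2 + 0.3 * x


seed, model, loglik = best_of_seeds(toy_fit, range(5))
```

In the HMM setting, `fit` would plausibly wrap hmmlearn's GaussianHMM with `random_state=seed`, call `.fit(X)`, and return the model together with `.score(X)`.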

Realistic frictions. 5bp one-way transaction cost charged on actual turnover.
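
The cost charge can be sketched as below. It assumes turnover is the sum of absolute weight changes between the drifted pre-trade holdings and the new targets, with 5bp charged on every unit traded; the repository may define turnover differently (for example, half that sum).

```python
def turnover(prev_weights, new_weights):
    """Sum of absolute weight changes across all assets held before or after the trade."""
    assets = set(prev_weights) | set(new_weights)
    return sum(abs(new_weights.get(a, 0.0) - prev_weights.get(a, 0.0)) for a in assets)


def net_return(gross_return, prev_weights, new_weights, cost_bps=5.0):
    """Gross period return less the cost of the rebalancing trade."""
    return gross_return - turnover(prev_weights, new_weights) * cost_bps / 10_000
```

A strategy that does not trade pays nothing, which is why the equal-weight benchmark's cost drag stays small while high-turnover strategies give up more of their gross return.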


Acknowledged limitations

  1. HMM regime labels for tactical allocation are now fitted walk-forward. The tilt consumes an annually-refit, expanding-window HMM (regimes/walk_forward_hmm.py) with per-window standardization and prefix-decoded, no-lookahead inference. Worth noting for interpretation: the earlier full-sample fit handed the regime-tilt strategy a lookahead advantage and it still lost to equal-weight — so removing that advantage can only reinforce the null finding, not overturn it. The full-sample labels are retained solely for the in-sample regime-conditional risk table, where they are a descriptive, not predictive, statistic.
  2. Stress transmission uses linear empirical betas. Tail co-movement is non-linear in practice; a more sophisticated version would use conditional copulas or a regime-switching factor model.
  3. Weekly frequency limits ML training data. The drawdown classifier achieved AUC 0.51, partly because the weekly horizon yields few positive examples per refit window.
  4. Italian and French 10Y yields can run ~90 days stale due to FRED's OECD update cadence. Flagged at build time by the freshness check.
  5. VSTOXX (V2X) is unavailable on Yahoo Finance, so it is replaced by SX5E rolling realized volatility.
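
To make the linear-beta transmission in item 2 concrete, here is a minimal sketch: estimate each asset's beta to a shocked factor by ordinary least squares, then scale the scenario shock by that beta. The function names are hypothetical and the single-factor setup is a simplification.

```python
def ols_beta(asset_returns, factor_returns):
    """Slope of asset returns on factor returns: covariance over factor variance."""
    n = len(factor_returns)
    mean_f = sum(factor_returns) / n
    mean_a = sum(asset_returns) / n
    cov = sum((f - mean_f) * (a - mean_a) for f, a in zip(factor_returns, asset_returns))
    var = sum((f - mean_f) ** 2 for f in factor_returns)
    return cov / var


def transmit_shock(betas, factor_shock):
    """Linear transmission: each asset moves by its beta times the factor shock."""
    return {name: beta * factor_shock for name, beta in betas.items()}
```

The limitation is visible in the code: the same beta applies to a 1% move and a 10% move, so any tail co-movement beyond the average linear relationship is missed.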

Tech stack

  • Modeling: hmmlearn (HMM), LightGBM (forecasting), SciPy (optimization)
  • Data: pandas, numpy, pyarrow (parquet), requests, yfinance
  • Visualization: matplotlib, plotly (Streamlit)
  • Deliverables: openpyxl (Excel), python-pptx (PowerPoint), Streamlit (dashboard)
  • All code: Python 3.12, modular architecture

Author

Shrish Dhuria · ESSEC Master in Finance · May 2026

Built as a research framework demonstrating institutional methodology in regime detection, multi-asset risk analysis, and walk-forward tactical allocation. The negative finding on tactical regime-tilting is the deliberate intellectual contribution — research that documents what doesn't work, and why, is more useful than work that papers over null results.


Limitations

Scope conditions on the headline finding:

  • The null result is conditional on the design. It holds for this feature set, a 3-state Gaussian HMM, weekly horizon, and a European cross-asset universe; a different model, horizon, or market could differ — the contribution is the rigorous demonstration, not a universal law.
  • HMM assumptions are strong. Gaussian emissions and a first-order Markov structure miss fat tails and longer-memory regime persistence.
  • Limited regime switches in the OOS sample. Honest walk-forward leaves only a handful of true regime transitions post-2011, so statistical power on the timing claim is modest (the risk-monitoring claim is far better supported).
  • Publication lags are approximated (e.g. 30 days for HICP) rather than reconstructed from true data vintages.
  • Costs strengthen, not weaken, the finding. Transaction costs and ESTR-financed leverage are modelled but simplified; more conservative cost assumptions would push the higher-turnover timing strategies further below equal-weight.

Testing

A pytest suite under tests/ validates the two claims a reviewer would challenge first — that the risk maths is correct and that the walk-forward labelling is genuinely out-of-sample. Fixtures are synthetic two-regime panels, so the suite is self-contained.

  • Cornish-Fisher VaR — collapses to the Gaussian VaR when skew and excess kurtosis vanish, and inflates the left tail under negative skew / fat tails.
  • HMM regime detection — recovers a known two-regime synthetic series (valid stochastic transition matrix; the "stress" state carries the higher fitted equity-vol mean; Viterbi accuracy > 0.9).
  • No look-ahead (the headline guarantee) — truncating the panel at any date t* leaves every walk-forward label dated ≤ t* unchanged, proving labels never use future information.

pip install -r requirements-dev.txt    # includes hmmlearn
pytest tests/ -q                       # 5 tests

Tests run automatically on every push via GitHub Actions (.github/workflows/tests.yml).
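
The no-look-ahead property can be illustrated with a prefix-invariance check: a labeller is causal if truncating the input never changes any label already issued. The two toy labellers below are stand-ins for illustration, not the walk-forward HMM, and the helper names are hypothetical.

```python
def expanding_labels(series):
    """Causal toy labeller: week t is 'stress' if it exceeds the mean of weeks 0..t."""
    labels, total = [], 0.0
    for i, x in enumerate(series, start=1):
        total += x
        labels.append("stress" if x > total / i else "calm")
    return labels


def full_sample_labels(series):
    """Non-causal toy labeller: compares every week with the full-sample mean."""
    m = sum(series) / len(series)
    return ["stress" if x > m else "calm" for x in series]


def is_prefix_invariant(labeller, series):
    """True if labels for weeks 0..t-1 are identical whether or not later data exists."""
    full = labeller(series)
    return all(labeller(series[:t]) == full[:t] for t in range(1, len(series) + 1))
```

On a series with a late spike, `expanding_labels` passes the check while `full_sample_labels` fails it, because the spike raises the full-sample mean and retroactively relabels earlier weeks.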

License

Use freely for research and educational purposes. No warranty.
