Composable, physics-informed digital twins with calibrated uncertainty and leakage-free validation by default — lightweight, CPU-first.
Otwin builds AI-powered digital twins that stay physics-informed: you bring the physical model structure you know, Otwin estimates the rest from data, attaches calibrated uncertainty, and validates without leakage.
In engineering we have three modelling options: white-box (first-principles), grey-box (physics plus data), and black-box (data only).
Otwin works on the physics-informed side, from white-box to grey-box. The black-box end is where current latent-state world models live; Otwin is their observable-state, physics-grounded counterpart, and it targets the intersection of digital twins and world models.
Physical Models or World Models?
Black-box models are accurate at interpolation (predicting inside the range of data they have seen), but pushed beyond that data (long horizons, or operating conditions never seen in training) they drift and break physical laws. The physical structure keeps long-range forecasts physically valid; the calibrated interval says how much to trust each one: a stated 90% interval is checked to actually hold about 90% of the time.
You choose how much first-principles structure you can write down. Everything after that choice is the same workflow.
Otwin builds white-box and grey-box digital twins: choose a model structure (first-principles or empirical), then one workflow — estimate, quantify uncertainty, validate.
| If you can... | Model structure | Example |
|---|---|---|
| write the system's dynamics from physics | First-principles (port-Hamiltonian) | water tank, DC motor, pumped-hydro storage |
| only state a coarse trend it follows | Empirical law + estimated residual | battery State-of-Health, fatigue |
The first-principles models describe a system as components that exchange energy through ports — so conservation and passivity hold by construction, not by hope. Battery State-of-Health sits at the empirical end: a degradation curve, not an energy-conserving system. Confusing the two is the most common conceptual error.
Core (numpy + scipy only):
# For now, until first release:
pip install git+https://github.com/Javihaus/otwin.git@v2
With optional extras:
pip install "otwin[torch]" # Learned first-principles model (PortHamiltonianNN)
pip install "otwin[gp]" # GP-PHS (Gaussian-process uncertainty)
pip install "otwin[viz]" # matplotlib + seaborn + plotly visualization
pip install "otwin[examples]" # Everything to run examples/ (cvxpy, pandas, sklearn, seaborn)
pip install "otwin[dev]" # Testing / linting / typing / docs
Requirements:
- Python ≥ 3.10
- Runs on CPU (no GPU required)
- Works on a laptop
import numpy as np
from otwin import DigitalTwin, evaluate
from otwin.systems import water_tank # a ready-made first-principles model
# ── First-principles catalog ──────────────────────────────────────────────
# Available now:
# water_tank fluid system (draining tank)
# mass_spring_damper mechanical oscillator
# Build your own from energy H, interconnection J, dissipation R, input g:
# from otwin.systems import PortHamiltonianSystem # conservative/dissipative system
# from otwin.systems import IrreversiblePHS # adds entropy production
#
# Roadmap (named structures, not yet importable):
# rc_circuit electrical RLC heat_exchanger thermal mass
# reactor chemical reactor (CSTR) degradation empirical fade law
# ──────────────────────────────────────────────────────────────────────────
# 1. Choose a model structure: a first-principles (port-Hamiltonian) system
twin = DigitalTwin(model=water_tank())
# 2. Forecast from an initial state x0 over a time grid t with inputs u
x0 = np.array([1.0])
t = np.linspace(0, 10, 100)
u = np.zeros((100, 1))
fc = twin.forecast(x0, t, u)
print(fc["x"].shape) # (100, 1)
# 3. Validate with a leakage-free protocol + mandatory baselines
# data: your recorded measurements (not defined in this snippet)
report = evaluate(twin, data, protocol="rolling_origin")
print(report) # skill score vs naive baseline, first
Calibrated uncertainty (ensembles / GP) and the full empirical-law workflow (empirical prior + estimated residual + conformal bands) are shown end-to-end in examples/battery_soh.
Black-box models (neural ODEs, LSTMs, generic regression) learn unstructured mappings. They interpolate well, but on long horizons or unseen operating conditions they drift, violate conservation laws, and produce unphysical behavior.
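A port-Hamiltonian model fixes the structure. The system is described as components that exchange energy through ports, with input $u$, output $y$, and input map $g(x)$:

$$\dot{x} = \left[J(x) - R(x)\right]\nabla H(x) + g(x)\,u, \qquad y = g(x)^{\top}\nabla H(x)$$

where:
- $J(x) = -J(x)^{\top}$ (skew-symmetric $\rightarrow$ lossless interconnection)
- $R(x) \succeq 0$ (positive semidefinite $\rightarrow$ dissipation)
- $H(x)$ is the energy / storage function

Power balance (provable by construction):

$$\dot{H} = -\nabla H(x)^{\top} R(x)\,\nabla H(x) + y^{\top}u \;\le\; y^{\top}u$$

With $R(x) \succeq 0$ the dissipation term is non-positive, so the stored energy can grow no faster than the power supplied through the ports (passivity).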
When you estimate a PortHamiltonianNN from data, the network architecture enforces J skew and R PSD regardless of weights. The guarantee is structural — this is the white-box end of grey-box.
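A minimal NumPy sketch of why such a guarantee can be structural: any square matrix A yields a skew-symmetric A − Aᵀ, and any matrix L yields a positive semidefinite L Lᵀ. This is an illustration under that assumption, not otwin's PortHamiltonianNN code; structured_matrices and the random weights are hypothetical stand-ins for network outputs. The second half checks the power balance numerically for a quadratic energy.

```python
import numpy as np

def structured_matrices(A: np.ndarray, L: np.ndarray):
    """Map unconstrained weights to a skew-symmetric J and a PSD R (hypothetical helper)."""
    J = A - A.T      # skew-symmetric for any A
    R = L @ L.T      # positive semidefinite for any L
    return J, R

rng = np.random.default_rng(0)
n = 3
A = rng.normal(size=(n, n))   # stand-ins for arbitrary network outputs
L = rng.normal(size=(n, n))
J, R = structured_matrices(A, L)

# Power balance at a random state, with quadratic energy H(x) = 0.5 * x @ x
x = rng.normal(size=n)
g = rng.normal(size=(n, 1))
u = rng.normal(size=1)
grad_H = x                                   # gradient of the quadratic energy
x_dot = (J - R) @ grad_H + g @ u
y = g.T @ grad_H
H_dot = float(grad_H @ x_dot)                # dH/dt along the trajectory
supplied = float(y @ u)                      # power entering through the port
dissipated = float(grad_H @ R @ grad_H)      # non-negative because R is PSD

print("J skew:", np.allclose(J, -J.T))
print("R PSD:", np.linalg.eigvalsh(R).min() >= -1e-12)
print("power balance holds:", np.isclose(H_dot, supplied - dissipated))
```

Whatever values A and L take, the two structural checks pass, which is the sense in which the property does not depend on training.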
When a system only degrades — capacity fade, wear, fatigue — there is no energy function to conserve. otwin uses a transparent trend law as the model structure, estimates a bounded residual on top, and quantifies uncertainty with horizon-aware conformal intervals:
(empirical fade-law prior + estimated residual)
(band that widens with the forecast horizon)
Same workflow as the first-principles end — only the model structure is lighter (grey-box leaning empirical). This is demonstrated end-to-end on the NASA battery fleet in examples/battery_soh (State-of-Health and Remaining-Useful-Life forecasting). A reusable empirical-law primitive (otwin.systems.degradation) is on the roadmap; today the structure lives in the worked example.
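The idea of a band that widens with the horizon can be sketched with per-horizon split-conformal quantiles. This is a generic illustration, not the code in examples/battery_soh: the fade-law form is the one quoted in this README, while fade_law, horizon_halfwidths, the synthetic residuals, and the parameter values c = 0.002, z = 0.5 are all assumptions made for the example.

```python
import numpy as np

def fade_law(n, c, z):
    """Empirical prior SoH(n) = 1 - c * n**z."""
    return 1.0 - c * np.power(n, z)

def horizon_halfwidths(abs_residuals, alpha=0.1):
    """Per-horizon split-conformal half-widths, forced to be non-decreasing.

    abs_residuals[i, h] is |y - y_hat| for calibration forecast i at horizon step h.
    """
    n_cal = abs_residuals.shape[0]
    # Finite-sample conformal level, capped at 1.0
    level = min(1.0, np.ceil((n_cal + 1) * (1 - alpha)) / n_cal)
    q = np.quantile(abs_residuals, level, axis=0, method="higher")
    return np.maximum.accumulate(q)   # band never narrows as the horizon grows

rng = np.random.default_rng(0)
n_cal, horizon = 200, 20
# Synthetic calibration residuals whose spread grows with the horizon (assumption)
scale = 0.005 * np.sqrt(np.arange(1, horizon + 1))
abs_res = np.abs(rng.normal(size=(n_cal, horizon)) * scale)

half = horizon_halfwidths(abs_res, alpha=0.1)

# Forecast 20 steps ahead of cycle 100 with hypothetical law parameters
cycles = np.arange(101, 101 + horizon)
center = fade_law(cycles, c=0.002, z=0.5)
lower, upper = center - half, center + half
print("half-width at first and last step:", half[0], half[-1])
```

The running maximum is a design choice for the sketch: it keeps the band monotone in the horizon at the cost of being slightly conservative.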
First-principles (port-Hamiltonian / structured state-space) models:
- Mechanical systems (mass-spring-damper, robotics, vehicles)
- Electrical circuits (RLC, power systems)
- Thermal systems (heat exchangers, buildings)
- Chemical reactors (CSTR with thermodynamics, via irreversible PHS)
- Fluid systems (tanks, pipelines)
- Coupled multi-physics systems (via port composition)
Empirical-law models: aging and degradation (battery State-of-Health, fatigue, wear, corrosion).
Out of scope (this is the black-box end):
- High-dimensional pixel / video world models
- Systems with no usable state-space or trend structure
- Workflows requiring a GPU-vendor stack (Omniverse, etc.)
- PortHamiltonianSystem: analytic PHS (energy, interconnection, dissipation)
- IrreversiblePHS: entropy production with σ ≥ 0 (second-law guarantee)
- PortHamiltonianNN: learned dynamics with enforced structure (J skew, R PSD)
- Structure-preserving integrators: implicit-midpoint, discrete-gradient (optional)
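To illustrate what a structure-preserving integrator buys, the sketch below applies the implicit-midpoint rule to a mass-spring-damper written as a linear port-Hamiltonian system and compares it with explicit Euler. It uses plain NumPy rather than otwin's integrators, and the parameter values are hypothetical. For a quadratic energy, the midpoint step changes the energy by exactly minus the dissipated amount, so the energy cannot increase.

```python
import numpy as np

# Mass-spring-damper as a linear PHS: state x = [q, p], H(x) = 0.5 * x @ Q @ x
k, m, d = 4.0, 1.0, 0.3                    # hypothetical stiffness, mass, damping
Q = np.diag([k, 1.0 / m])
J = np.array([[0.0, 1.0], [-1.0, 0.0]])    # skew-symmetric interconnection
R = np.diag([0.0, d])                      # PSD dissipation
A = (J - R) @ Q                            # x_dot = A @ x with the input off

def energy(x):
    return 0.5 * x @ Q @ x

def implicit_midpoint_step(x, h):
    # Solve x_next = x + h * A @ (x + x_next) / 2 for x_next (a linear system here)
    I = np.eye(len(x))
    return np.linalg.solve(I - 0.5 * h * A, (I + 0.5 * h * A) @ x)

def explicit_euler_step(x, h):
    return x + h * (A @ x)

h, steps = 0.1, 500
x_mid = x_eul = np.array([1.0, 0.0])
E_mid, E_eul = [energy(x_mid)], [energy(x_eul)]
for _ in range(steps):
    x_mid = implicit_midpoint_step(x_mid, h)
    x_eul = explicit_euler_step(x_eul, h)
    E_mid.append(energy(x_mid))
    E_eul.append(energy(x_eul))

print("midpoint energy never increases:", bool(np.all(np.diff(E_mid) <= 1e-12)))
print("Euler final / initial energy:", E_eul[-1] / E_eul[0])
```

With this step size explicit Euler gains energy over the run even though the physical system is damped, which is the kind of unphysical drift the structure-preserving step avoids.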
- Deep ensembles for PortHamiltonianNN (real variance, not a constant)
- GP-PHS (optional [gp]): Gaussian process with a structure-preserving kernel
- Calibration diagnostics: PIT histograms, coverage curves, recalibration
- Uncertainty is validated for coverage, not assumed
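Validating coverage amounts to comparing the stated level of an interval with the fraction of observations it actually contains. The sketch below computes PICP (prediction interval coverage probability) and MPIW (mean prediction interval width) with NumPy on synthetic standard-normal data; picp and mpiw are hypothetical helper names, not otwin's API.

```python
import numpy as np

def picp(y, lower, upper):
    """Fraction of observations that fall inside their interval."""
    return float(np.mean((y >= lower) & (y <= upper)))

def mpiw(lower, upper):
    """Mean interval width (sharpness)."""
    return float(np.mean(upper - lower))

rng = np.random.default_rng(0)
y = rng.normal(size=5000)          # synthetic observations around a zero forecast

# A calibrated 90% interval for a standard normal is about +/- 1.645
cal_lo, cal_hi = np.full_like(y, -1.645), np.full_like(y, 1.645)
# An overconfident interval that is labelled 90% but is too narrow
over_lo, over_hi = np.full_like(y, -0.8), np.full_like(y, 0.8)

print("calibrated    PICP, MPIW:", picp(y, cal_lo, cal_hi), mpiw(cal_lo, cal_hi))
print("overconfident PICP, MPIW:", picp(y, over_lo, over_hi), mpiw(over_lo, over_hi))
```

The narrower interval looks better on width alone; only the coverage check shows that its stated 90% does not hold.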
- Temporal splits (rolling-origin, holdout) — the default for forecasting
- Mandatory baselines (persistence, drift, seasonal-naive)
- Skill score (model error ÷ baseline error) as the headline metric
- Metrics: RMSE, MAE, nRMSE, MASE, Theil's U, CRPS, PICP, MPIW
- R² is not a headline metric (use MASE / Theil's U for forecasting)
- Random splits are opt-in with a loud warning ("measures interpolation, not forecasting")
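The protocol in the list above can be sketched in a few lines: generate rolling-origin splits in which every test index lies after the training window, score a model and a persistence baseline on each fold, and report the ratio of their errors. This is a NumPy illustration on a synthetic series, not otwin's evaluate; rolling_origin_splits, the linear-trend model, and the data are assumptions.

```python
import numpy as np

def rolling_origin_splits(n, initial, horizon, step):
    """Yield (train_idx, test_idx); test indices always come after training."""
    origin = initial
    while origin + horizon <= n:
        yield np.arange(origin), np.arange(origin, origin + horizon)
        origin += step

def rmse(y, y_hat):
    return float(np.sqrt(np.mean((np.asarray(y) - np.asarray(y_hat)) ** 2)))

rng = np.random.default_rng(0)
t = np.arange(200)
y = 1.0 - 0.002 * t + rng.normal(scale=0.002, size=t.size)   # synthetic drifting series

model_err, base_err, leak_free = [], [], True
for train, test in rolling_origin_splits(len(y), initial=100, horizon=20, step=20):
    leak_free &= bool(train.max() < test.min())      # no future data in training
    slope, intercept = np.polyfit(t[train], y[train], deg=1)
    model_pred = intercept + slope * t[test]         # toy model: linear trend
    persistence = np.full(test.size, y[train][-1])   # baseline: repeat last value
    model_err.append(rmse(y[test], model_pred))
    base_err.append(rmse(y[test], persistence))

# Skill as defined above: model error divided by baseline error
skill = float(np.mean(model_err) / np.mean(base_err))
print("folds:", len(model_err), "leak-free:", leak_free, "skill:", skill)
```

Under this definition a value below 1 means the model beats the baseline on data it has never seen.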
- Port interconnection (connect twins through shared ports)
- Modular: swap analytic ↔ learned ↔ hybrid model structures
- Combine subsystems into multi-physics twins
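As a sketch of what port interconnection means mathematically, the function below couples two port-Hamiltonian subsystems through a power-preserving gyrator (u1 = −K·y2, u2 = K·y1) and returns the J and R of the combined system. It is a NumPy illustration for single-port subsystems; gyrator_interconnect and the numeric values are hypothetical and do not describe otwin's composition API.

```python
import numpy as np

def gyrator_interconnect(J1, R1, g1, J2, R2, g2, K):
    """Combine two single-port PHS via u1 = -K * y2 and u2 = K * y1."""
    C = K * (g1 @ g2.T)                       # coupling block, shape (n1, n2)
    J = np.block([[J1, -C], [C.T, J2]])       # stays skew-symmetric
    R = np.block([[R1, np.zeros_like(C)],
                  [np.zeros_like(C.T), R2]])  # dissipation stays block-diagonal
    return J, R

# Electrical store (one state) and mechanical store (one state), hypothetical values
Re, b, K = 1.0, 0.1, 0.05
J1, R1, g1 = np.zeros((1, 1)), np.array([[Re]]), np.ones((1, 1))
J2, R2, g2 = np.zeros((1, 1)), np.array([[b]]), np.ones((1, 1))

J, R = gyrator_interconnect(J1, R1, g1, J2, R2, g2, K)
print(J)
print(R)
print("J skew:", np.allclose(J, -J.T), "R PSD:", np.linalg.eigvalsh(R).min() >= 0)
```

Because the coupling only adds off-diagonal skew terms, the composite inherits the lossless-interconnection and dissipation properties of its parts.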
- Fully typed (py.typed, mypy --strict clean)
- Gating CI (tests / lint / type / coverage ≥ 85% all enforced, no swallowed failures)
- CPU-first (every example runs in seconds on a laptop)
- Dependency discipline (core = numpy + scipy; optional extras clearly separated)
- Generated benchmarks (every number reproducible, never hand-typed)
See examples/ for full runnable code.
White-box structure preservation (energy, dissipation) with leakage-free validation. See examples/water_tank_phs.
A first-principles (white-box) twin: the structure-preserving forecast keeps energy physical; validated against a persistence baseline.
Result. With the inflow off, the height drains and the stored energy H(x) decays monotonically — the structure-preserving integrator cannot invent energy, so the forecast stays physical at any horizon (skill ≈ 0.94 vs a persistence baseline).
A multi-domain (electrical + mechanical) twin: two energy stores coupled by the gyrator K. The structure-preserving forecast is validated against the closed-form steady state (ω_ss, I_ss), and the stored energy is non-increasing once the voltage is removed (passivity by construction). Structure from van der Schaft & Jeltsema (2014), Example 2.5. See examples/dc_motor.
A first-principles (white-box) twin spanning two physical domains: the numeric steady state matches the analytic ω_ss = VK/(Re·b + K²) to within 0.001%, and energy decays monotonically with the voltage off.
Result. Spin-up under a constant voltage, then coast-down. The numeric angular velocity and current converge to the closed-form steady state (dashed), matching it to within 0.001%: the model is validated against an analytic solution, not fitted to data.
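The steady-state check can be reproduced in a few lines. The sketch below writes a DC motor as a two-state port-Hamiltonian system (flux linkage and angular momentum coupled by the gyrator K), integrates it with the implicit-midpoint rule under a constant voltage, and compares the result with ω_ss = VK/(Re·b + K²). The parameter values are hypothetical, not those of examples/dc_motor, and the expression I_ss = V·b/(Re·b + K²) is derived here from the same steady-state equations rather than quoted from the example.

```python
import numpy as np

# Hypothetical parameters, not those of examples/dc_motor
L_ind, J_rot = 0.5, 0.01        # inductance, rotor inertia
Re, b, K = 1.0, 0.1, 0.05       # resistance, viscous friction, gyrator constant
V = 12.0                        # constant supply voltage

# State x = [flux linkage, angular momentum]; H = 0.5 * x @ Q @ x
Q = np.diag([1.0 / L_ind, 1.0 / J_rot])
Jm = np.array([[0.0, -K], [K, 0.0]])    # gyrator coupling (skew-symmetric)
Rm = np.diag([Re, b])                   # resistive and frictional losses
g = np.array([1.0, 0.0])                # voltage acts on the electrical state
A = (Jm - Rm) @ Q

# Implicit midpoint with a constant input: x_next = x + h * (A @ x_mid + g * V)
h, steps = 0.01, 1000
I2 = np.eye(2)
lhs = I2 - 0.5 * h * A
x = np.zeros(2)
for _ in range(steps):
    x = np.linalg.solve(lhs, (I2 + 0.5 * h * A) @ x + h * g * V)

current, omega = Q @ x                  # co-energy variables: current, angular velocity
omega_ss = V * K / (Re * b + K**2)
current_ss = V * b / (Re * b + K**2)
rel_err_omega = abs(omega - omega_ss) / omega_ss
rel_err_current = abs(current - current_ss) / current_ss
print("relative errors:", rel_err_omega, rel_err_current)
```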
A white-box twin of the dominant grid-scale storage technology (~95% of the world's installed long-duration storage). Two reservoirs store gravitational potential energy; a reversible pump-turbine moves water between them. The store is conservative by construction (J = 0, penstock R PSD), so it is validated against closed-form answers: the simulated round-trip efficiency matches η_pump · η_turbine to within 0.05%, energy is held constant while idle (≈0.006% self-discharge over 3 h), and with the valve open it is passive. The storage medium decides the model class — mechanical/hydraulic storage is white-box, electrochemical aging (below) is not. See examples/pumped_hydro.
A first-principles (white-box) twin: a conservative gravitational-energy store with a reversible pump-turbine power port (J = 0, penstock R PSD, g = [1, −1]ᵀ).
Result. A ≈720 MWh charge → hold → generate cycle. The stored energy is conserved while idle (≈0.006% self-discharge over 3 h), and the numeric round-trip efficiency (≈0.810) matches the closed-form η_pump·η_turbine to within 0.05%: validated against an analytic answer, no fitting.
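The closed-form check is simple energy bookkeeping: pumping stores η_pump of the energy drawn from the grid, and generating returns η_turbine of what was stored, so the round trip is their product. The sketch below steps through an idealised charge, hold, generate cycle; the efficiencies, power rating, and durations are illustrative assumptions, not the values used in examples/pumped_hydro, and the idle phase is taken as perfectly lossless.

```python
eta_pump, eta_turbine = 0.9, 0.9     # hypothetical efficiencies
power_mw, dt_h = 100.0, 0.25         # hypothetical rating and time step

stored = grid_in = grid_out = 0.0

# Charge for 6 h: the grid supplies power, the store receives eta_pump of it
for _ in range(24):
    grid_in += power_mw * dt_h
    stored += eta_pump * power_mw * dt_h

held = stored                        # idle: an ideal conservative store keeps its energy

# Generate until empty: the grid receives eta_turbine of the energy drawn
while stored > 1e-9:
    draw = min(stored, power_mw * dt_h)
    stored -= draw
    grid_out += eta_turbine * draw

round_trip = grid_out / grid_in
print("stored after charge [MWh]:", held)
print("round-trip efficiency:", round_trip, "closed form:", eta_pump * eta_turbine)
```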
NASA battery fleet: SoH / Remaining-Useful-Life forecasting with a mechanistic fade-law structure (the Wang throughput power law SoH = 1 − c·n^z, whose exponent separates diffusion-limited SEI growth at z ≈ 0.5 from linear wear at z ≈ 1), an estimated bounded residual, and conformal intervals. Not port-Hamiltonian — the empirical end of grey-box. See examples/battery_soh.
An empirical (grey-box) twin: a transparent fade law + an estimated residual + a calibrated band; validated against baselines.
Result. From the split point onward, the physics-informed hybrid tracks the true degradation down through the 80% end-of-life line, while a data-only model (GP) extrapolates the wrong way. The 90% interval is calibrated — it actually covers the realised path.
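Because the fade law SoH = 1 − c·n^z is linear in log-log form, log(1 − SoH) = log c + z·log n, its two parameters can be estimated with an ordinary least-squares line. The sketch below does this on synthetic data and extrapolates to the 80% end-of-life line. It illustrates the fade-law prior only (no residual model, no intervals); the ground-truth values and noise level are assumptions, and n is treated as a generic cycle or throughput index.

```python
import numpy as np

rng = np.random.default_rng(0)
c_true, z_true = 0.002, 0.5                       # hypothetical ground truth
n = np.arange(1, 501, dtype=float)                # cycle / throughput index
# Synthetic capacity fade with small multiplicative noise (assumption)
fade = c_true * n**z_true * np.exp(rng.normal(scale=0.02, size=n.size))
soh = 1.0 - fade

# Fit log(1 - SoH) = log(c) + z * log(n) by least squares
z_hat, log_c_hat = np.polyfit(np.log(n), np.log(1.0 - soh), deg=1)
c_hat = float(np.exp(log_c_hat))

# Index at which the fitted law reaches SoH = 0.8 (a fade of 0.2)
n_eol = (0.2 / c_hat) ** (1.0 / z_hat)

print("estimated z:", z_hat)
print("estimated c:", c_hat)
print("projected end-of-life index:", n_eol)
```

An exponent near 0.5 versus near 1 is what distinguishes the two degradation regimes mentioned above, so the fitted z is interpretable on its own.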
The calibrated SoH model feeds a receding-horizon (MPC) dispatch optimizer for peak shaving and energy arbitrage. Shows that calibrated uncertainty is what turns predictive maintenance into trustworthy real-time optimization: the robust plan hits its 90% feasibility target at near-maximal value, while a naive plan over-promises every day. See examples/grid_storage_dispatch.
Predictive maintenance feeds real-time optimization — the calibrated SoH twin makes the dispatch trustworthy (re-planned each step).
Result. Dispatching against the same uncertain capacity, the calibrated-UQ (robust) plan leaves 0.0 MWh of demand unmet over the horizon, versus 3.8 MWh degradation-aware and 55.6 MWh naive — calibrated uncertainty is what makes the schedule deliverable, not just optimal on paper.
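A heavily simplified, single-step illustration of why calibration matters for dispatch: if the commitment is set at the 10% quantile of a calibrated capacity forecast, it should be deliverable about 90% of the time, whereas committing the nameplate or the mean capacity is not. This is not the receding-horizon optimizer in examples/grid_storage_dispatch; the distribution, the capacities, and the plan names are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
nominal = 100.0                                    # MWh, hypothetical nameplate capacity
# Hypothetical calibrated predictive distribution of usable capacity
forecast = rng.normal(loc=90.0, scale=4.0, size=10_000)
# Fresh draws from the same distribution, used only to score the plans
realised = rng.normal(loc=90.0, scale=4.0, size=10_000)

plans = {
    "naive": nominal,                              # ignores degradation
    "degradation-aware": float(np.mean(forecast)), # uses the mean only
    "robust": float(np.quantile(forecast, 0.10)),  # 90% feasibility target
}

results = {}
for name, commit in plans.items():
    feasible = float(np.mean(realised >= commit))                 # share of deliverable days
    unmet = float(np.mean(np.maximum(commit - realised, 0.0)))    # mean shortfall in MWh
    results[name] = (commit, feasible, unmet)
    print(f"{name}: commit={commit:.1f} MWh, feasible={feasible:.2f}, mean unmet={unmet:.2f} MWh")
```

The robust plan gives up a little committed energy in exchange for meeting its feasibility target, which only works if the forecast quantiles are trustworthy.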
Planned: irreversible-PHS reactor (CSTR with entropy production) and multi-physics port composition.
- Get started: GETTING_STARTED.md
- Examples: examples/ (water tank, DC motor and pumped-hydro storage (first-principles), battery SoH (empirical), grid-scale storage dispatch)
- Citations: CITATIONS.md (references with VERIFIED / UNVERIFIED status)
- API docs (Sphinx): source in docs/; build with make -C docs html
Current status: Alpha (Development Status :: 3)
Roadmap to beta:
- API stabilization (deprecation policy)
- Extended test coverage (90%+)
- More reference examples (mechanical, thermal, multi-physics)
- A reusable empirical-law (degradation) model structure
- Documentation polish
Roadmap to stable:
- Significant real-world validation
- Production deployments with calibration monitoring
- Formal benchmarking suite (comparison against other libraries)
We will not claim "Production / Stable" until we have earned it.
Contributions welcome. See CONTRIBUTING.md.
Development setup:
git clone https://github.com/Javihaus/otwin.git
cd otwin
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -e ".[dev]"
pytest
Standards:
- mypy --strict compliance
- ruff + black formatting
- ≥ 85% coverage on core modules
- Property-based tests for structural guarantees
- No swallowed failures in CI
If you use otwin in research, please cite:
@software{otwin,
title = {otwin: Physics-informed digital twins with calibrated uncertainty},
author = {Marin, Javier},
year = {2025},
version = {2.0.0-alpha},
url = {https://github.com/Javihaus/otwin}
}
See CITATIONS.md for all scientific references.
MIT License. See LICENSE.
This v2 rebuild grew from a Towards Data Science tutorial on hybrid digital twins (v1) that gained traction. v2 is a complete rewrite prioritizing scientific rigor and leakage-free validation. See legacy_v1/README.md for the migration notes.
The v1 tutorial code is preserved in legacy_v1/ for continuity and educational value.









