Add non-equilibrium (NEQ²) alchemical free-energy workflow#102
Open
David-Araripe wants to merge 24 commits into
Open
Add non-equilibrium (NEQ²) alchemical free-energy workflow#102David-Araripe wants to merge 24 commits into
David-Araripe wants to merge 24 commits into
Conversation
Non-equilibrium MD variant of md.f90 that drives lambda over the course of a single simulation (sigmoidal/linear schedule via the [lambda_scaling] input section) and accumulates the switching work. Builds a serial qdyn_neq binary alongside the existing qdyn targets. Engine authored by Viktor Prypoten; imported from the nonEQ branch (PR #73) without the accompanying pre-generated tutorial files.
Adds a --neq mode (with --neq-reps/--neq-steps/--neq-eq-steps/--neq-steepness/ --neq-schedule) to the shared CLI, plumbed through qligfep and setupFEP. In NEQ mode QligFEP writes the standard eq1-5 equilibration plus endpoint-equilibration (eq6_0/eq6_1) and lambda-switching (neq_0/neq_1) templates and a serial run_neq.sh that loops forward/reverse switches with qdyn_neq, replacing the ~100 fixed-lambda windows of the equilibrium protocol. The shared eq-file writing is factored into write_eq_files/_set_common_md_replacements and reused by the equilibrium path.
New qligfep_neq_analyze entry point reads the switching work from the qdyn_neq logs, estimates per-leg free energies with the Bennett Acceptance Ratio plus a bootstrap uncertainty and work-distribution overlap, and reports the relative binding free energy ddG = dF_protein - dF_water. The work-unit handling keeps the original beta=1 behaviour by default with a documented note flagging the likely kcal/mol vs kT inconsistency to revisit. Adds pandas as an explicit dependency.
test_bar.py checks BAR recovers a known free energy from Crooks-consistent synthetic work (and the bootstrap CI brackets it); test_neq_inputs.py checks the generated eq6/neq inputs carry the right endpoint lambdas and [lambda_scaling] section, that no windowed md files are emitted, that run_neq.sh calls the serial qdyn_neq, and that the log parser matches the engine output format.
Adds a Non-equilibrium FEP section that reuses the existing tutorial inputs (no duplication) and walks through setupFEP --neq and qligfep_neq_analyze.
Two bugs surfaced running qdyn_neq on Snellius: - the restart/final placeholders RESTART_VAR/FINAL_VAR collided with the run script's 'sed s/T_VAR/<temp>/' (T_VAR is a substring of RESTART_VAR), corrupting the restart filename; renamed to RESTARTFILE/FINALFILE. - [files] had no trajectory entry while [intervals] enabled trajectory output, so Q aborted with 'Invalid data in input file'; added a trajectory line. Adds regression tests covering both (the temperature-sed substring collision and the required [files] trajectory entry).
The NEQ run script inherited --ntasks-per-node from the cluster's MPI config (e.g. 16 on SNELLIUS), but qdyn_neq is serial and uses one core, so the rest were reserved and billed idle. Force NTASKS=1 and give that core enough memory. Adds a regression assertion.
Snellius bills the whole node (~16 cores) regardless of --ntasks, so a serial, one-core NEQ job wastes ~15 cores and ~16x the budget. Use the allocation: run eq1-5 and the eq6 endpoint chain with the MPI engine (qdynp) across all cores, then run the independent lambda switches concurrently with the serial engine (qdyn_neq), one per core via mpirun --bind-to core. Cuts per-task wall-clock and billing by ~the core count.
mem-per-cpu above the node's per-core memory makes memory the billed resource and inflates the charge beyond the requested cores. Cap at 1700 MB so a 16-core job is billed for 16 cores while still giving the concurrent switches ample memory.
… and update analyze_neq module
Collaborator
Author
|
@jesperswillem could you let me know what you think? Feedback on the README.md and the neq tutorial would be appreciated |
There was a problem hiding this comment.
Pull request overview
This PR integrates a first-class non-equilibrium (NEQ²) alchemical free-energy workflow into QligFEP, including generation (setupFEP --neq), execution scripting (SLURM + qdyn_neq), and BAR-based analysis (qligfep_neq_analyze), plus documentation and tests.
Changes:
- Add NEQ input generation and a dedicated NEQ SLURM run script, wired through new CLI flags (
--neq,--neq-*). - Add BAR-based NEQ analysis CLI with bootstrapped uncertainty, overlap diagnostics, and optional experimental comparison/plotting.
- Add NEQ tutorial material and tests; refactor shared plotting utilities; update build/packaging for new workflow.
Reviewed changes
Copilot reviewed 18 out of 19 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| tutorials/Tyk2/README.md | Adds NEQ workflow documentation to the main Tyk2 tutorial and updates qlomap usage for experimental data. |
| tutorials/Tyk2/neq2/README.md | New end-to-end NEQ² tutorial. |
| test/neq/test_neq_inputs.py | Tests NEQ input generation + log parsing without compiled binaries. |
| test/neq/test_bar.py | Tests BAR estimator behavior on Crooks-consistent synthetic data. |
| src/QligFEP/settings/settings.py | Adds QDYN_NEQ path substitution for generated scripts. |
| src/QligFEP/qligfep.py | Implements NEQ file writers and NEQ SLURM runfile generation; refactors shared equilibration writing. |
| src/QligFEP/INPUTS/run_neq.sh | New SLURM template for NEQ execution (equilibration + endpoint spacing + concurrent switches). |
| src/QligFEP/INPUTS/neq_endpoint.inp | New template for endpoint and switching input files (with optional [lambda_scaling]). |
| src/QligFEP/CLI/setupFEP.py | Threads NEQ flags into setupFEP submission call generation. |
| src/QligFEP/CLI/qligfep_cli.py | Adds NEQ parameters/flow control to CLI execution and file generation. |
| src/QligFEP/CLI/parser_base.py | Adds NEQ CLI flags and validation metadata. |
| src/QligFEP/CLI/lomap_wrap_cli.py | Adds experimental key support, standardizing node/edge experimental fields. |
| src/QligFEP/analyze_neq.py | New NEQ BAR analysis CLI and supporting utilities. |
| src/QligFEP/analyze_FEP.py | Refactors plotting/result shaping to shared helper; tweaks CLI option naming. |
| src/QligFEP/analysis_plotting.py | New shared plotting + dataframe-shaping module for analysis CLIs. |
| src/q6/makefile | Adds qdyn_neq target to the Q6 build. |
| README.md | Documents NEQ² support and adds updated citation block. |
| pyproject.toml | Adds pandas, pytest configuration, test extra, and NEQ analyzer entry point. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| temperatures=(TEMP_VAR) | ||
| seeds=(RANDOM_SEEDS) | ||
| runs=${#seeds[@]} | ||
| workdir="$( cd -P "$( dirname "$SOURCE" )" && pwd )" |
Comment on lines
+108
to
+110
| all_values = avg_values + exp_values | ||
| min_val = min(all_values) - margin | ||
| max_val = max(all_values) + margin |
Comment on lines
531
to
534
| parser.add_argument( | ||
| "-norun", | ||
| "--no_run_data", | ||
| "--no-run-data", | ||
| dest="no_run_data", |
Comment on lines
+361
to
+365
| > **Note on work units.** By default the analyzer reproduces the original implementation, | ||
| > which treats the switching work as if it were in units of k_BT (`--work-units kT`). The | ||
| > work written by `qdyn_neq` is in kcal/mol, so the physically consistent BAR factor is | ||
| > `beta = 1/(k_B*T)`; pass `--work-units kcal` for that. This affects the absolute free | ||
| > energies and should be confirmed against the original implementation before reporting |
Comment on lines
+78
to
82
| elif k == "neq_L": | ||
| command_str += f" --neq-steepness {v}" | ||
| elif k in ("neq_reps", "neq_steps", "neq_eq_steps", "neq_relax_steps", "neq_schedule"): | ||
| command_str += f" --{k.replace('_', '-')} {v}" | ||
| else: |
Comment on lines
+200
to
+201
| if self.exp_key: | ||
| edge["ddg_value"] = self.nodes[_to]["dg_value"] - self.nodes[_from]["dg_value"] |
… typo - Tyk2 tutorial: align the forward/reverse lambda arrows with the code convention (forward = neq_1 = lambda 1->0), matching analyze_neq, run_neq.sh and the README's own later description. - Tyk2 tutorial: correct the work-units note so kcal (beta=1/kT, physically consistent) is described as the default and kT (beta=1) as the compatibility mode, matching DEFAULT_WORK_UNITS. - README: 'lastest' -> 'latest' in the citation section.
create_ddG_plot had two real defects: axis bounds were computed from the elementwise sum of the calculated and experimental arrays (avg_values + exp_values) instead of their combined range, so points could be clipped off-plot; and the savefig directory branch called Path.isdir(), which does not exist and raises AttributeError. Fix the first with np.concatenate and the second with Path.is_dir(). The stats path also broke on NumPy 2.x: cinnabar 0.2.3's bootstrap_statistic does arr[i] = np.random.normal(size=1), which NumPy 2.0 turned from a filtered DeprecationWarning into a hard ValueError, and the np.trapz fallback in the overlap diagnostic eagerly evaluated np.trapz (removed in NumPy 2.0) as a getattr default. Vendor a small, faithful nonparametric bootstrap (identical method since no error bars are passed), drop the unused cinnabar dependency and its obsolete warning filter, and make the trapz fallback lazy. Add tests covering the axis-bounds range, the savefig-into-directory path, and the vendored bootstrap point estimates. Use raw strings for the LaTeX axis labels to silence a SyntaxWarning.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This PR adds a first-class non-equilibrium (NEQ) alchemical free-energy workflow to Q / QligFEP, termed NEQ² in the accompanying manuscript. It replaces the earlier "code lives in a separate branch" arrangement with a formal implementation integrated into the main codebase: a dedicated MD engine,
setupFEPintegration behind a--neqflag, a BAR analysis CLI, tests, and an end-to-end tutorial.NEQ² drives λ continuously from one end state to the other over many short, independent switching trajectories (rather than ~100 fixed-λ equilibrium windows), accumulates the switching work, and recovers ΔΔG from the Bennett Acceptance Ratio (BAR) over the forward/reverse work distributions. Because the switches are independent, they are trivially parallelizable, and the protocol reaches comparable ranking power at roughly 20–40% less compute than the equilibrium QligFEP workflow.
What's included
1. NEQ MD engine (
qdyn_neq)src/q6/md_neq.f90, built as theqdyn_neqbinary alongside the existing engines. Three modifications relative to the equilibrium engine: (1) allow λ to change during a run, (2) a flexible λ(t) switching schedule (linear or sigmoidal, with tunable steepnessL), and (3) accumulation of the non-equilibrium workWover the switch.src/q6/makefile:qdyn_neqadded to theall,debug,clean, andnuketargets and the build/move rules, somake allinsrc/q6producesqdyn_neqnext toqfep,qprep,qdyn, etc.2.
setupFEP --neqgenerationNew NEQ-specific flags on
setupFEP/qligfep(inparser_base.py, threaded throughqligfep_cli.py,setupFEP.py, and theQligFEPclass):--neq--windows/--samplingare ignored)--neq-reps--neq-steps--neq-eq-steps--neq-relax-steps-L/--neq-steepness--neq-schedulesigmoidalorlinearIn
--neqmode QligFEP writes, perFEP_<lig1>_<lig2>edge: the sharedeq1–eq5equilibration files, the one-time endpoint relaxation (relax_0,relax_1), the endpoint-spacing files (eq6_0,eq6_1), and the λ-switching inputs (neq_0,neq_1) carrying the[lambda_scaling]section — instead of the ~100 window files of the equilibrium run. New templatessrc/QligFEP/INPUTS/neq_endpoint.inpandsrc/QligFEP/INPUTS/run_neq.shdrive this.The generated SLURM array script runs
eq1–eq5and the endpoint relaxation on MPI cores, then fires the forward (λ 1→0, logged asneq_1) and reverse (λ 0→1, logged asneq_0) switches concurrently, one switch per core withqdyn_neq.3. BAR analysis CLI (
qligfep_neq_analyze)New module
src/QligFEP/analyze_neq.py(console entry pointqligfep_neq_analyze). It reads the accumulated work from the switching logs, estimates each leg's free energy with BAR (Brent root-finding via SciPy) and a bootstrap standard error, reports the forward/reverse work-distribution overlap, and combines the legs into ΔΔG = ΔF_protein − ΔF_water. Given the mapping JSON and an experimental key it also attaches experimental ΔΔG, logs correlation metrics (R², Pearson, Spearman, Kendall, RMSE, MAE), and saves the experimental-vs-calculated plot. It additionally emits per-replicate run diagnostics (runtime, seed, SUCCESS/TIMEOUT/OOM/CANCELLED/CRASHED) and switch-completion counts from the SLURM logs.4. Shared refactors (no behavior change to the equilibrium path)
write_MD_05into_set_common_md_replacements+write_eq_files, reused by both the equilibrium and NEQ generators.src/QligFEP/analysis_plotting.py, shared byanalyze_FEPandanalyze_neq.lomap_wrap_cligains anexp_keyoption that renames the user's experimental property to the standardizeddg_valueon the network nodes (validating it is present), so the same experimental key flows through setup and analysis.5. Tutorial
New
tutorials/Tyk2/neq2/README.md: an end-to-end NEQ walkthrough on the Tyk2 benchmark (setup → submission → analysis), reusing the equilibrium tutorial's inputs. The equilibriumtutorials/Tyk2/README.mdis updated for the shared steps.6. Tests & packaging
test/neq/: BAR estimator against the Crooks relation, and NEQ input generation + log parsing (runs without the compiled Q binaries).pyproject.toml: addspandas, atestextra (pytest), pytest config with third-party deprecation-warning filters, and theqligfep_neq_analyzeentry point.Building & using
Reproducibility
Supports the NEQ² manuscript (JACS FEP+ benchmark; NEQ² vs equilibrium QligFEP).