Skip to content

Add non-equilibrium (NEQ²) alchemical free-energy workflow#102

Open
David-Araripe wants to merge 24 commits into
mainfrom
feat/neq-fep
Open

Add non-equilibrium (NEQ²) alchemical free-energy workflow#102
David-Araripe wants to merge 24 commits into
mainfrom
feat/neq-fep

Conversation

@David-Araripe

Copy link
Copy Markdown
Collaborator

Summary

This PR adds a first-class non-equilibrium (NEQ) alchemical free-energy workflow to Q / QligFEP, termed NEQ² in the accompanying manuscript. It replaces the earlier "code lives in a separate branch" arrangement with a formal implementation integrated into the main codebase: a dedicated MD engine, setupFEP integration behind a --neq flag, a BAR analysis CLI, tests, and an end-to-end tutorial.

NEQ² drives λ continuously from one end state to the other over many short, independent switching trajectories (rather than ~100 fixed-λ equilibrium windows), accumulates the switching work, and recovers ΔΔG from the Bennett Acceptance Ratio (BAR) over the forward/reverse work distributions. Because the switches are independent, they are trivially parallelizable, and the protocol reaches comparable ranking power at roughly 20–40% less compute than the equilibrium QligFEP workflow.

What's included

1. NEQ MD engine (qdyn_neq)

  • New Fortran MD variant src/q6/md_neq.f90, built as the qdyn_neq binary alongside the existing engines. Three modifications relative to the equilibrium engine: (1) allow λ to change during a run, (2) a flexible λ(t) switching schedule (linear or sigmoidal, with tunable steepness L), and (3) accumulation of the non-equilibrium work W over the switch.
  • src/q6/makefile: qdyn_neq added to the all, debug, clean, and nuke targets and the build/move rules, so make all in src/q6 produces qdyn_neq next to qfep, qprep, qdyn, etc.

2. setupFEP --neq generation

New NEQ-specific flags on setupFEP / qligfep (in parser_base.py, threaded through qligfep_cli.py, setupFEP.py, and the QligFEP class):

Flag Meaning Default
--neq Switch QligFEP into non-equilibrium mode (windowed --windows/--sampling are ignored) off
--neq-reps Forward/reverse switching pairs per replicate 5
--neq-steps Length of each λ-switch in MD steps (tNEQ) 50000
--neq-eq-steps Endpoint equilibration spacing between switches (tEQ) 1000
--neq-relax-steps One-time endpoint relaxation before the first switch 5000
-L / --neq-steepness Steepness of the sigmoidal λ schedule 8.0
--neq-schedule sigmoidal or linear sigmoidal

In --neq mode QligFEP writes, per FEP_<lig1>_<lig2> edge: the shared eq1eq5 equilibration files, the one-time endpoint relaxation (relax_0, relax_1), the endpoint-spacing files (eq6_0, eq6_1), and the λ-switching inputs (neq_0, neq_1) carrying the [lambda_scaling] section — instead of the ~100 window files of the equilibrium run. New templates src/QligFEP/INPUTS/neq_endpoint.inp and src/QligFEP/INPUTS/run_neq.sh drive this.

The generated SLURM array script runs eq1eq5 and the endpoint relaxation on MPI cores, then fires the forward (λ 1→0, logged as neq_1) and reverse (λ 0→1, logged as neq_0) switches concurrently, one switch per core with qdyn_neq.

3. BAR analysis CLI (qligfep_neq_analyze)

New module src/QligFEP/analyze_neq.py (console entry point qligfep_neq_analyze). It reads the accumulated work from the switching logs, estimates each leg's free energy with BAR (Brent root-finding via SciPy) and a bootstrap standard error, reports the forward/reverse work-distribution overlap, and combines the legs into ΔΔG = ΔF_protein − ΔF_water. Given the mapping JSON and an experimental key it also attaches experimental ΔΔG, logs correlation metrics (R², Pearson, Spearman, Kendall, RMSE, MAE), and saves the experimental-vs-calculated plot. It additionally emits per-replicate run diagnostics (runtime, seed, SUCCESS/TIMEOUT/OOM/CANCELLED/CRASHED) and switch-completion counts from the SLURM logs.

4. Shared refactors (no behavior change to the equilibrium path)

  • Equilibration-file generation factored out of write_MD_05 into _set_common_md_replacements + write_eq_files, reused by both the equilibrium and NEQ generators.
  • The ΔΔG plotting/result-shaping helper moved to src/QligFEP/analysis_plotting.py, shared by analyze_FEP and analyze_neq.
  • lomap_wrap_cli gains an exp_key option that renames the user's experimental property to the standardized dg_value on the network nodes (validating it is present), so the same experimental key flows through setup and analysis.

5. Tutorial

New tutorials/Tyk2/neq2/README.md: an end-to-end NEQ walkthrough on the Tyk2 benchmark (setup → submission → analysis), reusing the equilibrium tutorial's inputs. The equilibrium tutorials/Tyk2/README.md is updated for the shared steps.

6. Tests & packaging

  • test/neq/: BAR estimator against the Crooks relation, and NEQ input generation + log parsing (runs without the compiled Q binaries).
  • pyproject.toml: adds pandas, a test extra (pytest), pytest config with third-party deprecation-warning filters, and the qligfep_neq_analyze entry point.

Building & using

# build the engine (produces bin/q6/qdyn_neq)
cd src/q6 && make all

# set up an NEQ FEP (see tutorials/Tyk2/neq2 for the full walkthrough)
setupFEP -FF AMBER14sb -r 25 -ts 2fs -j lomap.json -rs 42 -c SNELLIUS \
    --neq --neq-reps 5 --neq-steps 50000 --neq-eq-steps 1000 --neq-relax-steps 5000 \
    -L 8 --neq-schedule sigmoidal

# after the switches finish, analyze
qligfep_neq_analyze -p 2.protein -w 1.water -T 300 -u kcal \
    -j lomap.json -exp ddg_value -t Tyk2 -o neq_results.csv

Reproducibility

Supports the NEQ² manuscript (JACS FEP+ benchmark; NEQ² vs equilibrium QligFEP).

vikmonster and others added 20 commits June 1, 2026 16:18
Non-equilibrium MD variant of md.f90 that drives lambda over the course
of a single simulation (sigmoidal/linear schedule via the [lambda_scaling]
input section) and accumulates the switching work. Builds a serial qdyn_neq
binary alongside the existing qdyn targets.

Engine authored by Viktor Prypoten; imported from the nonEQ branch (PR #73)
without the accompanying pre-generated tutorial files.
Adds a --neq mode (with --neq-reps/--neq-steps/--neq-eq-steps/--neq-steepness/
--neq-schedule) to the shared CLI, plumbed through qligfep and setupFEP. In NEQ
mode QligFEP writes the standard eq1-5 equilibration plus endpoint-equilibration
(eq6_0/eq6_1) and lambda-switching (neq_0/neq_1) templates and a serial run_neq.sh
that loops forward/reverse switches with qdyn_neq, replacing the ~100 fixed-lambda
windows of the equilibrium protocol. The shared eq-file writing is factored into
write_eq_files/_set_common_md_replacements and reused by the equilibrium path.
New qligfep_neq_analyze entry point reads the switching work from the qdyn_neq
logs, estimates per-leg free energies with the Bennett Acceptance Ratio plus a
bootstrap uncertainty and work-distribution overlap, and reports the relative
binding free energy ddG = dF_protein - dF_water. The work-unit handling keeps the
original beta=1 behaviour by default with a documented note flagging the likely
kcal/mol vs kT inconsistency to revisit. Adds pandas as an explicit dependency.
test_bar.py checks BAR recovers a known free energy from Crooks-consistent
synthetic work (and the bootstrap CI brackets it); test_neq_inputs.py checks the
generated eq6/neq inputs carry the right endpoint lambdas and [lambda_scaling]
section, that no windowed md files are emitted, that run_neq.sh calls the serial
qdyn_neq, and that the log parser matches the engine output format.
Adds a Non-equilibrium FEP section that reuses the existing tutorial inputs (no
duplication) and walks through setupFEP --neq and qligfep_neq_analyze.
Two bugs surfaced running qdyn_neq on Snellius:
- the restart/final placeholders RESTART_VAR/FINAL_VAR collided with the run
  script's 'sed s/T_VAR/<temp>/' (T_VAR is a substring of RESTART_VAR), corrupting
  the restart filename; renamed to RESTARTFILE/FINALFILE.
- [files] had no trajectory entry while [intervals] enabled trajectory output, so
  Q aborted with 'Invalid data in input file'; added a trajectory line.

Adds regression tests covering both (the temperature-sed substring collision and
the required [files] trajectory entry).
The NEQ run script inherited --ntasks-per-node from the cluster's MPI config (e.g.
16 on SNELLIUS), but qdyn_neq is serial and uses one core, so the rest were
reserved and billed idle. Force NTASKS=1 and give that core enough memory. Adds a
regression assertion.
Snellius bills the whole node (~16 cores) regardless of --ntasks, so a serial,
one-core NEQ job wastes ~15 cores and ~16x the budget. Use the allocation: run
eq1-5 and the eq6 endpoint chain with the MPI engine (qdynp) across all cores,
then run the independent lambda switches concurrently with the serial engine
(qdyn_neq), one per core via mpirun --bind-to core. Cuts per-task wall-clock and
billing by ~the core count.
mem-per-cpu above the node's per-core memory makes memory the billed resource and
inflates the charge beyond the requested cores. Cap at 1700 MB so a 16-core job is
billed for 16 cores while still giving the concurrent switches ample memory.
@David-Araripe David-Araripe changed the title Add Add non-equilibrium (NEQ²) alchemical free-energy workflow Jul 2, 2026
@David-Araripe

Copy link
Copy Markdown
Collaborator Author

@jesperswillem could you let me know what you think? Feedback on the README.md and the neq tutorial would be appreciated

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR integrates a first-class non-equilibrium (NEQ²) alchemical free-energy workflow into QligFEP, including generation (setupFEP --neq), execution scripting (SLURM + qdyn_neq), and BAR-based analysis (qligfep_neq_analyze), plus documentation and tests.

Changes:

  • Add NEQ input generation and a dedicated NEQ SLURM run script, wired through new CLI flags (--neq, --neq-*).
  • Add BAR-based NEQ analysis CLI with bootstrapped uncertainty, overlap diagnostics, and optional experimental comparison/plotting.
  • Add NEQ tutorial material and tests; refactor shared plotting utilities; update build/packaging for new workflow.

Reviewed changes

Copilot reviewed 18 out of 19 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
tutorials/Tyk2/README.md Adds NEQ workflow documentation to the main Tyk2 tutorial and updates qlomap usage for experimental data.
tutorials/Tyk2/neq2/README.md New end-to-end NEQ² tutorial.
test/neq/test_neq_inputs.py Tests NEQ input generation + log parsing without compiled binaries.
test/neq/test_bar.py Tests BAR estimator behavior on Crooks-consistent synthetic data.
src/QligFEP/settings/settings.py Adds QDYN_NEQ path substitution for generated scripts.
src/QligFEP/qligfep.py Implements NEQ file writers and NEQ SLURM runfile generation; refactors shared equilibration writing.
src/QligFEP/INPUTS/run_neq.sh New SLURM template for NEQ execution (equilibration + endpoint spacing + concurrent switches).
src/QligFEP/INPUTS/neq_endpoint.inp New template for endpoint and switching input files (with optional [lambda_scaling]).
src/QligFEP/CLI/setupFEP.py Threads NEQ flags into setupFEP submission call generation.
src/QligFEP/CLI/qligfep_cli.py Adds NEQ parameters/flow control to CLI execution and file generation.
src/QligFEP/CLI/parser_base.py Adds NEQ CLI flags and validation metadata.
src/QligFEP/CLI/lomap_wrap_cli.py Adds experimental key support, standardizing node/edge experimental fields.
src/QligFEP/analyze_neq.py New NEQ BAR analysis CLI and supporting utilities.
src/QligFEP/analyze_FEP.py Refactors plotting/result shaping to shared helper; tweaks CLI option naming.
src/QligFEP/analysis_plotting.py New shared plotting + dataframe-shaping module for analysis CLIs.
src/q6/makefile Adds qdyn_neq target to the Q6 build.
README.md Documents NEQ² support and adds updated citation block.
pyproject.toml Adds pandas, pytest configuration, test extra, and NEQ analyzer entry point.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

temperatures=(TEMP_VAR)
seeds=(RANDOM_SEEDS)
runs=${#seeds[@]}
workdir="$( cd -P "$( dirname "$SOURCE" )" && pwd )"
Comment thread src/QligFEP/analysis_plotting.py Outdated
Comment on lines +108 to +110
all_values = avg_values + exp_values
min_val = min(all_values) - margin
max_val = max(all_values) + margin
Comment thread src/QligFEP/analysis_plotting.py Outdated
Comment on lines 531 to 534
parser.add_argument(
"-norun",
"--no_run_data",
"--no-run-data",
dest="no_run_data",
Comment thread tutorials/Tyk2/README.md
Comment thread tutorials/Tyk2/README.md Outdated
Comment on lines +361 to +365
> **Note on work units.** By default the analyzer reproduces the original implementation,
> which treats the switching work as if it were in units of k_BT (`--work-units kT`). The
> work written by `qdyn_neq` is in kcal/mol, so the physically consistent BAR factor is
> `beta = 1/(k_B*T)`; pass `--work-units kcal` for that. This affects the absolute free
> energies and should be confirmed against the original implementation before reporting
Comment thread README.md Outdated
Comment on lines +78 to 82
elif k == "neq_L":
command_str += f" --neq-steepness {v}"
elif k in ("neq_reps", "neq_steps", "neq_eq_steps", "neq_relax_steps", "neq_schedule"):
command_str += f" --{k.replace('_', '-')} {v}"
else:
Comment on lines +200 to +201
if self.exp_key:
edge["ddg_value"] = self.nodes[_to]["dg_value"] - self.nodes[_from]["dg_value"]
… typo

- Tyk2 tutorial: align the forward/reverse lambda arrows with the code convention (forward = neq_1 = lambda 1->0), matching analyze_neq, run_neq.sh and the README's own later description.

- Tyk2 tutorial: correct the work-units note so kcal (beta=1/kT, physically consistent) is described as the default and kT (beta=1) as the compatibility mode, matching DEFAULT_WORK_UNITS.

- README: 'lastest' -> 'latest' in the citation section.
create_ddG_plot had two real defects: axis bounds were computed from the elementwise sum of the calculated and experimental arrays (avg_values + exp_values) instead of their combined range, so points could be clipped off-plot; and the savefig directory branch called Path.isdir(), which does not exist and raises AttributeError. Fix the first with np.concatenate and the second with Path.is_dir().

The stats path also broke on NumPy 2.x: cinnabar 0.2.3's bootstrap_statistic does arr[i] = np.random.normal(size=1), which NumPy 2.0 turned from a filtered DeprecationWarning into a hard ValueError, and the np.trapz fallback in the overlap diagnostic eagerly evaluated np.trapz (removed in NumPy 2.0) as a getattr default. Vendor a small, faithful nonparametric bootstrap (identical method since no error bars are passed), drop the unused cinnabar dependency and its obsolete warning filter, and make the trapz fallback lazy.

Add tests covering the axis-bounds range, the savefig-into-directory path, and the vendored bootstrap point estimates. Use raw strings for the LaTeX axis labels to silence a SyntaxWarning.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants