Feature: add SLAYER and GGJ tearing growth rates#238
Draft
d-burg wants to merge 73 commits into
Draft
Conversation
…(1.6x speedup on Solovev) Implements the dual Riccati matrix S = U₁·U₂⁻¹ as a faster alternative to the standard Euler-Lagrange ODE integration. Enable with `use_riccati = true` in jpec.toml. Integration strategy: uses `sing_der!` (same ODE RHS as standard) with periodic Riccati renormalization S = U₁·U₂⁻¹, U₂ = I in the callback when column norms exceed ucrit. This is mathematically equivalent to the explicit Riccati ODE (dS/dψ = B + A·S - S·D - S·C·S) but numerically stable: the explicit Riccati ODE has quadratic blowup for explicit solvers when K̄·S >> Q, while sing_der! + renorm tracks the bounded ratio S = U₁/U₂. The Riccati crossing (`riccati_cross_ideal_singular_surf!`) skips Gaussian reduction (which can produce NaN/Inf when S is near-zero near the axis) and uses `ipert_res` directly. Benchmarks on Solovev example (N=8, 1 singular surface): Standard ODE: 83.7 ms, 157 steps Riccati ODE: 51.4 ms, 121 steps (1.63x speedup, 0.006% energy difference) See: Glasser (2018) Phys. Plasmas 25, 032507 — Eq. 19 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
## Part 1: Δ' output (tearing stability parameter)
- Add `delta_prime::Vector{ComplexF64}` to `SingType`
- Add `compute_delta_prime_from_ca!` in EulerLagrange.jl, called at end of
`eulerlagrange_integration` (standard path only — see normalization note below)
- Write `singular/delta_prime` as (msing × n_modes) ComplexF64 to HDF5 output in JPEC.jl
- Riccati path does NOT compute delta_prime: ca_l is accumulated in (S,I) normalization
which is inconsistent with the Δ' formula (standard (U1,U2) normalization required)
## Part 2: Parallel Fundamental Matrix (FM) integration
- Add `ChunkPropagator` struct (two N×N×2 blocks for identity-block ICs) in Structs
- Add `use_parallel::Bool = false` control flag in ForceFreeStatesControl
- Add `integrate_propagator_chunk!` — integrates each chunk from IC=(I,0) and IC=(0,I)
independently using BS5 solver, no callback; suitable for Threads.@threads
- Add `apply_propagator!` — in-place 2×2 block matrix multiply on odet.u
- Add `balance_integration_chunks` — sub-divides chunks using ode_itime_cost for
load-balanced parallel work; target = max(2*msing+3, 4*nthreads)
- Add `ode_itime_cost` — log-divergent cost model from STRIDE (Glasser 2018)
- Add `parallel_eulerlagrange_integration` — parallel phase with Threads.@threads,
serial assembly calling renormalize_riccati_inplace! before each crossing (needed
because apply_propagator! gives general (U1,U2) state but riccati crossing expects
(S,I) form); uses ipert_res-direct zeroing to correctly identify the resonant column
- Dispatch from eulerlagrange_integration: use_parallel → use_riccati → standard
## Tests (29 total: 11 Riccati + 18 Parallel FM)
- runtests_riccati.jl: update Δ' test — only standard path populates delta_prime
- runtests_parallel_integration.jl (new): ChunkPropagator identity/linearity,
balance_integration_chunks count/coverage/crossings, ode_itime_cost additivity,
parallel FM energy match (rtol=2%, Solovev)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rallel tests to suite Δ' is now computed inline in riccati_cross_ideal_singular_surf! using the diagonal formula on the bounded (U₁, U₂) state (max ≤ ucrit, no GR permutation). This gives physically correct values: 57.3 and -4.03 for the two Solovev singular surfaces. The standard path does not populate delta_prime — Gaussian Reduction inflates the resonant column's asymptotic coefficients, making ca_l non-physical regardless of when it is computed. A comment in cross_ideal_singular_surf! explains the limitation. Also adds runtests_riccati.jl and runtests_parallel_integration.jl to the default test suite (runtests.jl). Both were previously excluded. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… lacks Δ' The comment in cross_ideal_singular_surf! previously said the issue was GR "normalization inflation." The real reason is more subtle: Δ' is a complex, normalization-convention-dependent quantity. The Riccati renormalization (U₂→I) continuously phases solution columns into a specific gauge where the diagonal formula (ca_r - ca_l)/denom gives physically meaningful values. The standard path's solution columns grow from the axis with an arbitrary complex phase; dividing by the outer asymptotic coefficient normalizes the magnitude but not the complex phase, producing a value in a different convention that does not match what SingularCoupling.jl expects. Also reverts the failed attempt to compute Δ' in cross_ideal_singular_surf! via perm_col + A_outer normalization, which produced -0.10-0.54i vs the Riccati 57.3+58.3i (same physical quantity, incompatible conventions). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…large-N documentation 1. Add SingType.delta_prime_col (N × n_res_modes Matrix) storing the full column (ca_r[:, ipert_res, 2] - ca_l[:, ipert_res, 2]) / (4π²·psio) at each crossing. The diagonal element matches delta_prime[i] exactly. Off-diagonal elements give intra-surface coupling of all N modes to each resonant mode through the singular layer asymptotic expansion. Only populated for Riccati/parallel FM paths. 2. Add singular/m, singular/n, singular/delta_prime_col HDF5 outputs so downstream users can access the full off-diagonal Δ' without needing to index ca_left/ca_right. 3. Document the known numerical limitation of the parallel FM path for large N: FM propagators become ill-conditioned for N ≳ 20 without QR orthogonalization, causing ~10% energy error for DIIID (N=26) with no wall-clock speedup over Riccati. Deferred fix: bidirectional integration or continuous QR (noted in docstring/tests). 4. Update outer-plasma Riccati re-integration (already committed) docstring to match. Tests: 50/50 Riccati+parallel, 84/84 EulerLagrange all pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…trix Implements the STRIDE global boundary value problem for computing the full 2·msing × 2·msing inter-surface tearing stability matrix. Each entry gives the U₂[ipert_res] response amplitude at one surface boundary when driving with unit amplitude at another, encoding cross-surface coupling. Changes: - Riccati.jl: add assemble_fm_matrix (chunk FM product) and compute_delta_prime_matrix! (BVP assembly + solve via STRIDE formulation from Glasser 2018 Phys. Plasmas 25, 032501 Sec. III.B); call from parallel_eulerlagrange_integration - ForceFreeStatesStructs.jl: add delta_prime_matrix field to ForceFreeStatesInternal with docstring - JPEC.jl: write delta_prime_matrix to singular/delta_prime_matrix in HDF5 - test/runtests_parallel_integration.jl: add delta_prime_matrix regression test (shape, finiteness, non-zero diagonal); 30 tests total (was 23) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… for large-N accuracy
The all-forward parallel FM path had ~10% energy error for large-N problems
(DIIID N=26, n=1) because the chunk immediately before each rational surface
crossing integrates into exponentially growing solution territory, producing
an ill-conditioned FM propagator.
Fix: integrate the crossing chunk *backward* (psi_end → psi_start). Solutions
that grow exponentially forward decay backward, yielding a well-conditioned
backward FM Φ_bwd. The accurate forward propagation is recovered as Φ_bwd⁻¹
via a stable LU solve in apply_propagator_inverse!.
The same backward FM is used directly in the Δ' BVP (compute_delta_prime_matrix!)
as Phi_L[j], splitting each ill-conditioned inter-surface FM product into
well-conditioned Phi_R (forward chunks) and Phi_L (backward crossing chunk).
Changes:
- IntegrationChunk: add direction::Int=1 field (+1 forward, -1 backward)
- chunk_el_integration_bounds: add bidirectional=false kwarg; crossing chunks
get direction=-1 when true
- balance_integration_chunks: left sub-chunk always direction=1; right inherits
chunk.direction so the near-singularity chunk stays backward after splitting
- integrate_propagator_chunk!: reverses tspan for direction=-1 chunks
- apply_propagator_inverse!: new function, LU solve Φ_bwd·x = u_old
- Serial assembly: branches on chunk.direction (inverse vs forward apply)
- parallel_eulerlagrange_integration: passes bidirectional=true
- compute_delta_prime_matrix!: BVP now uses Phi_R·x_right - Phi_L·x_left = 0
at each junction instead of ill-conditioned monolithic Phi_segs product
- assemble_fm_matrix: safe for empty idx_range (uses propagators[1] for N)
Results (et[1] stability eigenvalue):
Solovev N=8: 0.006% error (was already fine)
DIIID N=26: 0.236% error (was ~10.5% — 44× accuracy improvement)
Tests: 31/31 pass in runtests_parallel_integration.jl (+1 DIIID accuracy test)
18/18 pass in runtests_riccati.jl (unchanged)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds benchmarks/benchmark_threads.jl to measure wall-clock time and accuracy of the standard, Riccati, and parallel FM integration paths across varying thread counts. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three fixes from code review of PR #178: - assemble_fm_matrix: add explicit isempty guard before the propagator loop so an empty idx_range (e.g. i_crossings[1]==1) returns the identity matrix without relying on the loop falling through silently. - compute_delta_prime_matrix!: add @Assert at function entry that all singular surfaces have exactly one resonant mode, so multi-resonance surfaces fail loudly instead of silently using only sp.m[1]/sp.n[1]. - runtests_parallel_integration.jl: remove stale comment that described large-N FM ill-conditioning as an open problem with ~10% energy error; bidirectional integration (now the default for use_parallel=true) has resolved this. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…er! and compute_delta_prime_from_ca! Two developer benchmark scripts for verifying the two dead-code reference implementations flagged in the Claude code review of PR #178: benchmarks/benchmark_riccati_der.jl Verifies riccati_der! correctly evaluates Glasser 2018 Eq. 19: dS/dψ = w†·F̄⁻¹·w - S·Ḡ·S, w = Q - K̄·S Uses Hermitian test states (physical constraint: the EL system preserves S†=S from the axis) and compares riccati_der! against manual evaluation of the same formula using the ffit splines directly. Observed error: ~1e-17 (machine epsilon). No TOML flags needed. benchmarks/benchmark_delta_prime_methods.jl Verifies compute_delta_prime_from_ca! gives bit-for-bit identical Δ' values to the inline computation in riccati_cross_ideal_singular_surf!. Both apply the same diagonal formula to the same ca_l/ca_r arrays, so the result must be exactly zero difference. Observed difference: 0.0 (exact). No TOML flags needed. Neither script requires new TOML flags: they call internal functions directly without going through ForceFreeStatesControl. Developer-only knobs belong in scripts, not in user-facing config. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…tring The previous "O(Δψ)" phrasing in the Integration Strategy section read as a global accuracy statement, suggesting the Riccati path is only first-order accurate. This is wrong: the method integrates the linear EL ODE with Tsit5 (5th-order) and recovers S = U₁·U₂⁻¹ by exact renormalization, achieving the full ODE solver reltol. Rewrite the section in three clearly labelled parts: - Why riccati_der! (quadratic ODE) is avoided: relative error control is unfaithful when |S| is large, not a step-size problem, not fixable by adaptation without an implicit solver. - What the implementation actually does: sing_der! (linear ODE, exact RHS), Tsit5 (5th-order), exact renormalization, same global accuracy as standard. - Local consistency analysis: the O(Δψ) expansion is retained but now labelled explicitly as a consistency check, not an accuracy claim. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…setup + new unit tests
Two changes in one pass:
Shared setup (performance):
equil (Grad-Shafranov) and ffit (metric matrices) are now built once and
shared across all integration-dependent testsets via a make_solovev_intr
helper for cheap fresh intr construction. Previously, setup_equilibrium +
make_metric + make_matrix ran 4 times and riccati_eulerlagrange_integration
ran 3 times. Now each runs once, cutting total test time significantly.
New unit tests (dead code coverage):
"riccati_der! formula — Glasser 2018 Eq. 19": verifies riccati_der!
correctly evaluates dS/dψ = w†F̄⁻¹w − SGS at several ψ points using
Hermitian test states (physical constraint). Agrees with manual formula
evaluation to machine precision (~1e-17). No extra integration needed.
"compute_delta_prime_from_ca! matches inline Δ'": verifies the standalone
Δ' formula gives bit-for-bit identical results to the inline computation
in riccati_cross_ideal_singular_surf!. Reuses the shared odet_ric.
Total: 23 tests (was 18), runtime ~51s (was ~80s+ with redundant setup).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…dd 3 unit tests - Delete unused parallel_threads field from ForceFreeStatesControl: the field was silently ignored (Threads.@threads uses JULIA_NUM_THREADS at startup, not a runtime field). Removes false impression that thread count can be set from jpec.toml. - Add apply_propagator_inverse! round-trip unit test: verifies Φ⁻¹·Φ = I algebraically, complementing the existing apply_propagator! identity and linearity tests. - Add chunk_el_integration_bounds direction field test: verifies bidirectional=true sets direction=-1 on crossing chunks and direction=+1 on non-crossing chunks, and that balance_integration_chunks preserves direction correctly (right sub-chunk inherits, left sub-chunk always +1). Catches direction propagation regressions. - Add delta_prime_matrix DIIID regression test: verifies the STRIDE BVP Δ' matrix is finite and non-zero for the large-N case (N≈26, multiple rational surfaces), where ill-conditioned (non-bidirectional) FM propagators would produce NaN/Inf entries. 56/56 parallel integration tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…f/riccati One conflict resolved in src/GeneralizedPerturbedEquilibrium.jl (was src/JPEC.jl): write_outputs_to_HDF5 vacuum section. Resolution: - Keep new Δ' HDF5 outputs from perf/riccati (singular/m, singular/n, delta_prime, delta_prime_col, delta_prime_matrix) - Adopt develop's vacuum output format: vac_data variable name, plasma_pts/wall_pts fields (3D Cartesian), y_plasma/y_wall entries, always-write pattern with empty arrays All other files (EulerLagrange.jl, ForceFreeStatesStructs.jl, runtests.jl) auto-merged cleanly. Default HDF5 filename updated to gpec.h5. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…um references post-rename Update test files and benchmarks to use the new package name and config filename (gpec.toml) following the GPEC rename merged from develop: - test/runtests_riccati.jl - test/runtests_parallel_integration.jl - benchmarks/benchmark_threads.jl - benchmarks/benchmark_riccati_der.jl - benchmarks/benchmark_delta_prime_methods.jl 23/23 riccati tests and 56/56 parallel integration tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…page Creates docs/src/stability.md covering the ForceFreeStates module: - Newcomb/DCON ideal MHD stability criterion with paper citations (Glasser 2016 Phys. Plasmas 23 112506, 2018a 032507, 2018b 032501) - Standard, Riccati, and parallel FM integration methods - Bidirectional integration strategy for large-N accuracy - Δ' tearing parameter: per-surface (delta_prime/delta_prime_col) and inter-surface matrix (delta_prime_matrix / STRIDE BVP) - Configuration reference, API autodocs block, example usage Adds page to docs/make.jl navigation and cross-links from equilibrium.md. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…kdown links)
1. Add Random stdlib to Project.toml [deps] and [compat] — required by
runtests_riccati.jl but missing from declared dependencies, causing
CI failure with "Package Random not found in current path".
2. Fix docstring markdown in Riccati.jl and ForceFreeStatesStructs.jl:
- Wrap bare [array_notation] (link text) immediately followed by
(description) (parsed as URL) in code fences to prevent Documenter
from treating them as broken local links.
- Affected: assemble_fm_matrix BVP unknowns block, Phi_L/Phi_R
equations, and VacuumData plasma_pts/wall_pts field descriptions.
These were surfaced by the new @autodocs block in stability.md.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nt logging Three targeted fixes from pre-merge code review: 1. Threads.@threads :static — since Julia 1.7, the default :dynamic scheduler can migrate tasks between OS threads mid-execution, making Threads.threadid() unreliable for indexing into odet_proxies. Using :static guarantees a 1:1 task-to-thread mapping for the parallel FM integration phase. 2. outer_chunk psi_end guard — the outer-plasma re-integration in parallel_eulerlagrange_integration now uses psilim*(1-eps) to match the guard applied by chunk_el_integration_bounds, avoiding a potential boundary evaluation at exactly psilim. 3. Replace println with @info/@warn — all verbose-mode output in Riccati.jl now uses Julia logging macros, consistent with EulerLagrange.jl. This enables log-level filtering and suppression in tests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adapted from R. Fitzpatrick's TJ code. tj_run integrates the (ψ, g₂, H₁, H₁', f₃) shape ODE and returns an InverseRunInput with Shafranov-shifted-circle flux surfaces; tj_run_direct builds a 257×257 ψ(R, Z) grid and returns a DirectRunInput so the equilibrium is processed by the same direct-GS pipeline used for TJ geqdsks. Direct-GS path includes the εa³·L(r)·cos(w) / −εa³·L·sin(w) shape terms in the (R, Z) → (r, w) Newton inversion (EFIT.cpp) and reproduces the ideal-kink pole approach at ε ≈ 0.66 to sub-percent accuracy vs the TJ geqdsk branch. Also fixes: * lar_run and tj_run: pass ψ_N (not physical r) as InverseRunInput.rz_in_xs per the struct contract — silently worked only when lar_a = 1 * dψ/dr normalization: a² not a (broken for any a ≠ 1) * Restores dy[1], dy[2] in lar_der that were dropped mid-session
… matrix Adds parallel_eulerlagrange_integration and riccati_eulerlagrange_integration driving a STRIDE-style global BVP for the multi-surface Δ' matrix (singular/delta_prime_matrix, shape msing × msing after PEST3 four-term combination). Bidirectional FM integration and Double64 BVP solve for well-conditioned large-N. Also: * eulerlagrange_integration now returns 4-tuple (odet, propagators, chunks, S_left); call sites updated in tests and benchmarks * Gate @info diagnostic dumps in Sing.jl and Riccati.jl behind ctrl.verbose * Restore SingularException guard in findmax_dW_edge! * Remove empty cross_kinetic_singular_surf() stub and dead kmsing/kinsing fields
* examples/LAR_epsilon_scan and LAR_beta_scan: TJ-analytic scans with power-law- warped grids (dense near pole); epsilon uses Option B tj_direct path * examples/TJ_epsilon_pole_example: minimal near-pole (ε = 0.66) config used by the regression harness * regression-harness/cases/tj_epsilon_pole.toml: anchors Δ' matrix and δW_t near-pole values so εa³·L regressions in tj_run_direct are caught * test/runtests_tj_analytic.jl: 16 assertions covering tj_run, tj_run_direct, and the ψ(R, Z) endpoint consistency
* Project.toml: remove unused JSON and Random (no imports in src/) * Remove EquilibriumConfig.use_galgrid (Galerkin grid feature removed upstream) * Restore .github/workflows/auto-merge.yaml * Fix jpec.toml → gpec.toml in Riccati.jl docstrings * Scrub 'See sas_flag' comments → set_psilim_via_dmlim across gpec.toml examples * docs/src/stability.md: update API example to 4-tuple and Δ' matrix shape * docs/src/equilibrium.md: remove dangling splines.md / vacuum.md links * examples/LAR_*_scan: update headers and delete unused lar.toml stubs
…omments and docs Line numbers drift as soon as upstream is edited. Replace cross-references like 'Equilibrium.cpp rhs_chooser=1 dy[1]', 'sing.F lines 394-398', 'ode.F:1020', 'Riccati.jl:691', etc. with prose referring to 'Fortran STRIDE' or 'TJ' and the file name only. No functional changes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Resolves conflicts with recent develop work: - ForceFreeStatesStructs.jl: adopt develop's kinetic_factor/kinetic_source control (replacing the old con_flag/kin_flag pair), and incorporate parallel_threads field. Drop deprecated set_psilim_via_dmlim/dmlim. - EulerLagrange.jl: gate the rational-surface cross on ctrl.kinetic_factor == 0 (instead of !ctrl.con_flag), while retaining this branch's sing_asymp_left/right distinction needed for the parallel FM Δ' BVP. - Sing.jl: merge develop's multi-line compute_sing_mmat! signature with this branch's sig::Float64=1.0 kwarg. - Riccati.jl, GeneralizedPerturbedEquilibrium.jl: rename !ctrl.con_flag → ctrl.kinetic_factor == 0, and kin_flag stubs → kinetic_factor > 0. - Example / regression gpec.toml files: keep psiedge and drop the deprecated dmlim pair that was already removed on develop. - Project.toml / Manifest.toml: pick up QuadGK dependency. Verified: Solovev_ideal_example runs end-to-end through all pipeline stages.
…n truncation
The edge-dW scan over ψ ∈ [psiedge, psilim] was doing double duty: reporting
the dW peak location (a diagnostic) AND silently moving psilim/qlim/u to that
peak (a truncation that reshaped the Δ' BVP and δW eigenvalues). In benchmark
runs against Fortran STRIDE, the silent truncation corrupted the outermost
rational's Δ' by tens of percent depending on where the peak happened to fall
inside the band — e.g. on the LAR ε-scan at ε≈0.4, Δ'(3/1) shifted from the
correct ≈1.8 down to ≈0.85 (>50 % error). The truncation also silently
depended on psiedge itself, so going from psiedge=1.0 → 0.99 was a behavioral
cliff rather than a smooth tightening of the edge band.
Split the behavior into two paths at three call sites (ForceFreeStates/
EulerLagrange.jl and ForceFreeStates/Riccati.jl ×2):
* Default (truncate_at_dW_peak=false): edge scan is diagnostic-only. Runs
findmax_dW_edge! with the resulting dW(ψ), ψ, q, and energy components
stored on odet.edge_scan and written to HDF5 under edge_scan/. psilim,
qlim, and odet.u are restored to the post-integration values so that Δ'
and free-boundary eigenvalues are determined solely by qhigh / psihigh /
dmlim. ψ_peak is logged at verbose level.
* Legacy (truncate_at_dW_peak=true): reproduces the original Fortran
ode_record_edge heuristic. After the diagnostic scan, psilim, qlim, and
odet.u are pulled back to the dW-peak step. Preserved so that future
work on a more robust edge-mode filter can build on top of it, with a
warning in the docstring and log line that Δ' and δW are unreliable in
this mode.
Docstring update on ForceFreeStatesControl.psiedge / truncate_at_dW_peak
spells out the diagnostic vs legacy semantics and the reliability caveat.
test/runtests_fullruns.jl: update the Solovev kinetic multi-n expected et[1]
from -0.01248 to -0.19359 with an inline comment. The old value reflected
the truncated-integration behaviour; the new value reflects the full-domain
answer. Other fullruns tests unchanged.
Validated against Fortran STRIDE β-scan (42 pts) and ε-scan (56 pts) on
identical TJ geqdsk equilibria: Δ'(2/1), Δ'(3/1), δWp, δWv, δWt all match
within numerical noise away from the ideal pole; median smoothness
residuals beat Fortran on all 6 tracked quantities.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ownstream Δ' matrix Defaults updated for SLAYER/GGJ downstream consumption: - etol 1e-7 → 1e-10 (equilibrium convergence) - eulerlagrange_tolerance 1e-7 → 1e-8 - singfac_min 0 → 1e-4 (required non-zero on the parallel path) - sing_order 2 → 6 (STRIDE convention for Δ') - use_parallel false → true (unlocks singular/delta_prime_matrix) - Add set_psilim_via_dmlim + dmlim controls in sing_lim! (Fortran sas_flag equivalent) for single-n truncation beyond the outermost rational surface Test fixes: runtests_slayer_params / runtests_slayer_inputs updated for the params.f sign convention Q_i = -tauk·ω*_i (both Q's share the same sign structure; earlier tests held the layerinputs.f Q_i sign flip which we deliberately do not mirror). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… η in GGJ & SLAYER Adds a shared Spitzer/Sauter/Redl resistivity closure so GGJ and SLAYER can both consume the same neoclassical η formula: - src/Utilities/NeoclassicalResistivity.jl (new): SpitzerModel / SauterNeoModel / RedlNeoModel tag types, coulomb_log_e (NRL/Sauter/ Wesson forms), eta_spitzer (Sauter 1999 Eq. 18a), trapped_fraction (Lin-Liu & Miller 1995 full form) + trapped_fraction_eps fallback, nu_star_e (Sauter 1999 Eq. 18b), and eta_neoclassical dispatched on the model (F₃₃ via Sauter 1999 Eq. 13 or Redl 2021 Eq. 17). - src/ForceFreeStates/ResistEval.jl: ResistGeometry struct extended with avg_B, B_max, B_min, f_trap, R_major, eps_local. Populated inside the existing θ-loop at essentially zero cost (one extra integrand + running min/max over B and R). - src/Tearing/InnerLayer/GGJ/LayerInputs.jl: build_ggj_inputs grows `resistivity_model::NeoResistivityModel=SpitzerModel()` and `lnLambda_form::Symbol=:nrl` kwargs. Uses the shared closure; default Spitzer switches from Wesson 1.65e-9·lnΛ form to Sauter-18a (Zeff-aware, ~1% agreement at Zeff=1). - src/Tearing/InnerLayer/SLAYER/LayerParameters.jl + LayerInputs.jl: same `resistivity_model` kwarg, plus optional f_trap / nu_e_star / R_major_eff / lnLambda_form. Defaults to SpitzerModel() + :wesson so legacy SLAYER η is bit-identical. When a neoclassical model is selected, build_slayer_inputs pulls f_trap + R_major + eps_local from sing.restype if populated, and computes ν*_e via the shared utility. Validated on DIII-D 147131 @ 2300 ms (ideal example) vs OMFIT utils_fusion.py and OFT bootstrap.py F₃₃ formulas: max |reldiff| = 1.8e-16 across all 4 rational surfaces for lnΛ, ν*_e, η_Sp, η_Sauter, η_Redl, F₃₃(Sauter), F₃₃(Redl). Benchmark lives at CTM-processing/julia_vs_fortran/neoclassical_resistivity_benchmark/. In the DIII-D banana regime (q=2,3,4), η_Sauter/η_Sp ≈ 4–5× — the expected trapped-particle enhancement for H-mode tearing studies. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…_prime_raw + pest3_decompose The STRIDE-BVP Δ' computation already assembles a 2m×2m side-major matrix dp_raw in compute_delta_prime_matrix! (Riccati.jl:779, ordering [L_s1, R_s1, L_s2, R_s2, …]), then collapses it to the m×m PEST3 odd-parity Δ' projection via deltap[i,j] = dp_raw[2i,2j] − dp_raw[2i,2j-1] − dp_raw[2i-1,2j] + dp_raw[2i-1,2j-1] (the (L−R)(L−R)^T combination). The A' (even-parity interchange), B', Γ' (off-parity) blocks are thrown away. This commit retains the full 2m×2m matrix: - New ForceFreeStatesInternal.delta_prime_raw field (side-major, byte- compatible with Fortran rdcon/gal.f::gal_write_delta top 2msing×2msing block of delta_gw.dat; no ½ prefactor per Fortran convention). - Populated right before PEST3 collapse at Riccati.jl:819. - Persisted as singular/delta_prime_raw in gpec.h5. - New pest3_decompose(dp_raw) → (A, B, Γ, Δ) and dprime_outer_matrix helpers, matching Fortran rdcon/gal.f:1728-1743 recombination. Needed for the full det(D' − D(γ)) = 0 tearing+interchange eigenvalue problem in Phase C. Sanity-checked on DIII-D: pest3_decompose(dp_raw).Δ matches the existing m×m delta_prime_matrix to 4.6e-14. Cross-check vs Fortran delta_gw.dat shows pre-existing dpsi^α normalization gap (neither code writes the Hermitian form; it's applied at use-time). Benchmark artefacts at CTM-processing/julia_vs_fortran/ggj_coefficients_benchmark/ dprime_raw_crosscheck/. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…GGJ parity channel selection
Replaces solve_inner's anonymous SVector{2,ComplexF64} return with a named
struct InnerLayerResponse(tearing, interchange) to eliminate a latent
parity-channel bug and self-document the inner-layer API.
The bug: the old contract said "(Δ_odd, Δ_even)" but the word "odd"/"even"
is used inconsistently across the literature — GWP 2016 labels parity by
the symmetry of the flux W (odd-W = interchange, even-W = tearing), while
Fortran rmatch/deltac.f labels by the velocity+temperature (odd-NΘ = tearing,
even-NΘ = interchange). These give OPPOSITE parity names for the same
physics channel. The GGJ Galerkin solver mirrored deltac.f's end-of-routine
swap (Galerkin.jl:711-712), putting index 1 = interchange. The GGJ Shooting
solver mirrored deltar.f, putting index 1 = interchange. SLAYER put its
pressureless tearing Δ at index 1. Meanwhile Dispersion/Coupled.jl:96 and
Dispersion/SurfaceCoupling.jl:46 hardcoded [1] — so for SLAYER surfaces
they correctly picked the tearing channel, but for GGJ surfaces they
silently picked the INTERCHANGE (Glasser-stabilization) channel instead of
the tearing drive. Any GGJ multi-surface dispersion scan run prior to this
commit was solving the wrong eigenvalue problem.
Fix:
- New InnerLayerResponse struct with physics-named tearing/interchange fields.
- GGJ Galerkin: removed the deltac.f swap; isol=1 (W'(0)=0 → W even, sheet
current, tearing) maps to .tearing; isol=2 (W(0)=0 → W odd, non-reconnecting)
maps to .interchange. Per-solver parity derivation documented in BC comments.
- GGJ Shooting: traced match/matrix.f::matrix_layer sign-symmetric vs
sign-antisymmetric constraints to confirm deltar(1)=interchange, deltar(2)=
tearing; remapped _delta_from_c0 output into named fields accordingly.
- SLAYER: pressureless Fitzpatrick has no interchange channel →
InnerLayerResponse(Δ, 0).
- Dispersion/Coupled.jl + SurfaceCoupling.jl: replaced solve_inner(...)[1]
with solve_inner(...).tearing at both call sites.
- 6 test files updated: synthetic test models return InnerLayerResponse;
real SLAYER/GGJ callers use .tearing. 200+ tests pass; 2 pre-existing
slayer_riccati failures (D_norm threshold drift, unrelated to parity
refactor) verified by git-stash bisection.
Naming: chose tearing/interchange per user decision — more self-documenting
than odd/even which depends on whose parity convention you're reading.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…matrix
Companion to the m×m MultiSurfaceCoupling (tearing-only) that was shipped
earlier in the perf/slayer-growthrates branch. CoupledFull generalizes to
the full Pletzer-Dewar 1991 / GWP 2016 tearing+interchange eigenvalue
problem needed to include Glasser stabilization in the GGJ model.
Structure:
- MultiSurfaceCouplingFull holds a 2m×2m D' matrix in parity-major
ordering [[A' B'] [Γ' Δ']], a per-surface Vector{SurfaceCoupling},
reference-surface index, and msing_max truncation. Built via
multi_surface_coupling_full(surfaces, dp_full; ref_idx, msing_max).
- Evaluation mc(Q) subtracts a 2m×2m block-diagonal D(γ) with
interchange-channel response on the upper-left m diagonal and
tearing-channel response on the lower-right m diagonal. Each
channel rescaled by per-surface tauk_ref/tauk_k and sc.scale; sc.dc
critical offset subtracted from the tearing channel only.
Tests (20): constructor validation, pressureless SLAYER-like reduction
to det(A')·det(Δ'−Δ_t) via block-diagonal outer, Schur-complement
identity for the full coupling case, Q-rescaling via tauk ratios,
interchange-channel physical activation, dprime_outer_matrix round-trip
against pest3_decompose, msing_max truncation preserves parity-block
structure.
Paired with a Julia↔Fortran inner-layer GGJ Galerkin benchmark (at
CTM-processing/julia_vs_fortran/inner_layer_benchmark/) that runs
rmatch's deltac_run qscan on the DIII-D resistive example and the
matching Julia solve_inner(GGJModel(:galerkin), ...) at identical
(E,F,G,H,K,M,τ_A,τ_R,v1) inputs and Q grid. The benchmark finds a
uniform 2.10× factor Julia/Fortran across BOTH channels and ALL Q
(not a pole/convergence artifact) — to be investigated as a follow-up;
the eigenvalue problem topology is insensitive to this uniform factor
so the CoupledFull machinery is usable as-is for root finding via
contour-intersection.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…matrix
Adds MultiSurfaceCouplingFortran — a literal Julia port of Fortran
rmatch/match.f::match_delta (fulldomain=0 branch). This is the full
Pletzer-Dewar 4m×4m tearing+interchange coupled dispersion matrix, with
the inner-layer amplitudes d^j_± kept as explicit DOFs alongside the
outer-region amplitudes C^j_{L,R}, coupled by the ±1 matching identity
C^j_L = d^j_+ − d^j_-
C^j_R = −(d^j_+ + d^j_-)
Motivation: the naive 2m×2m form det(D' − diag(Δ_int, Δ_tear)) = 0
(shipped earlier as CoupledFull) is structurally incorrect because
D' lives in the (L,R) side-major basis while the inner-layer output
(Δ_tearing, Δ_interchange) lives in the (+,-) parity basis. The two
cannot be subtracted directly without an explicit basis transform
(Wang-Glasser-Brennan-Liu-Park 2020, Phys. Plasmas 27, 122503,
Eq. 11a-11d). Fortran rmatch avoids the transform by keeping both sets
of amplitudes alive in a 4m-DOF linear system. This commit mirrors that
choice.
Validation on DIII-D resistive example (n=1, msing=4):
- Julia 4m×4m |det| ∈ [4.6e31, 3.5e39] vs Fortran rmatch
[4.0e32, 6.3e36] — same order of magnitude in the same regions.
- Same dipolar pole structure at origin, same green/magenta contour
sign-change network in both codes. Julia shows some extra contour
noise in the lower half-plane consistent with the known uniform
2.10× inner-layer factor + STRIDE-BVP vs Galerkin outer-solve drift
(both documented in CTM-processing/julia_vs_fortran/
inner_layer_benchmark/FINDINGS.md).
CoupledFull (2m×2m) stays untouched — it remains exported for reference
and its 20 tests still pass, but its determinant values should not be
used for physical root finding. Use multi_surface_coupling_fortran for
that.
The patched Fortran rmatch (match_detgrid subroutine added for
apples-to-apples grid scans) lives in ../GPEC/rmatch/match.f in the
user's local tree; not part of this commit.
26 new unit tests in runtests_dispersion_coupled_fortran.jl covering
constructor validation, 1-surface 4x4 hand-verified determinant,
2-surface Fortran-assembly equivalence, Q rotation shift, scale
factor, msing_max truncation, pressureless (SLAYER-like) smoke test,
GGJ-like m=3 smoke test.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds an `inner_kwargs::NamedTuple` field to `MultiSurfaceCouplingFortran` so callers can forward Galerkin grid-tuning parameters (pfac, xfac, nx, nq) to `solve_inner` at every Q evaluation. Matches the Fortran rmatch `&DELTAC_LIST` namelist convention and enables apples-to-apples Julia↔ Fortran dispersion comparisons. Added test verifies the kwarg reaches solve_inner. All 31 existing CoupledFortranMatch tests continue to pass. Context: investigation of the apparent 2.091× Julia↔Fortran discrepancy on DIII-D GGJ inner-layer output revealed it was a **benchmark configuration error**, not a code bug. Fortran rmatch rescales τ_R by η_rdcon/η_user at match.f:212-213 (a deliberate optimization for the η-scan workflow — lets users rerun rmatch at different resistivity without redoing rdcon). When our Julia benchmark drivers fed the raw τ_R from delta_gw.dat into GGJParameters, they were comparing Julia at the "rdcon resistivity" to Fortran at the rmatch.in resistivity. Fix: set rmatch.in::eta to match the value baked into delta_gw.dat. With matched eta, Julia↔Fortran agree to 0.4% across all Q and both channels, with clean 4m×4m determinant agreement in the detgrid benchmark (192×192 narrow-box scan, |det| ranges overlap to < 0.5%). Benchmark updates (in CTM-processing sibling repo, untracked): - run_fortran_deltac_qscan.py + run_fortran_detgrid.py: eta forced to match delta_gw.dat (5.089e-9) - compare_detgrid.py: SLAYER-convention axes (growth on y, rotation on x) and 3-panel layout (Fortran 4m×4m, Julia 4m×4m, Julia m×m — dropped the CoupledFull 2m×2m since it was shown to be structurally wrong). - FINDINGS.md: full write-up of the eta-rescale root cause. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Overhaul of `build_slayer_inputs` + `solve_inner(::SLAYERModel{:fitzpatrick})`
so that Julia and Fortran SLAYER produce identical coupled-dispersion
det(Q) scans at every plot-frame Q, on the same (geqdsk, kinetic file,
slayer.in namelist) inputs. Verified by quantitative 4-hypothesis test
at TJ ε=0.001 and β=0.1 benchmark cases:
hypothesis median Re median Im
J(Q) ~ F(Q) identity +1.01 +1.02 <- eps
J(Q) ~ F(Q) identity +0.99 +1.01 <- beta
(the three reflection hypotheses all give off-axis ratios)
Before this patch the eps_0.001 ratio was (+1.10, -0.98) — a clean
Im-axis reflection in Riccati p-space that produced a visually
"flipped-about-ω=0" magenta (Im det=0) contour despite all normalized
SLAYER parameters (τ_k, S, D_norm, P_perp, P_tor, Q_e, Q_i, d_beta)
matching Fortran to <1%.
### `LayerInputs.jl::build_slayer_inputs`
Four new kwargs + internal ω_*e/ω_*i computation (port of Fortran
`slayer/layerinputs.f:456-459`):
* `bt` now also supports a scalar override
in addition to a callable or `nothing` (F-spline default).
* `R0 = nothing` override magnetic-axis R; default
`equil.ro`. Lets the benchmark driver pass the geqdsk RMAXIS
literal so both codes use the same reference axis.
* `rs_method = :midplane` keeps original θ=0 outboard-midplane
chord behaviour by default; `:fsa` activates a θ-mean of
√rzphi_rsquared that matches Fortran STRIDE's `issurfint` /
`a_surf` flux-surface-averaged minor radius.
* `z_i = 1.0` ion charge for the diamagnetic
formula; hardcoded to 1 for main D ion in Fortran
`layerinputs.f:399`.
* `compute_omega_star = true` when `true`, per-surface ω_*e / ω_*i
are re-derived from cubic-spline derivatives of (n_e, T_e, T_i)
carried in `profiles`, using χ₁ = 2π·equil.psio and the formulae
ω_*e = (2π/χ₁)·(T_e·dn_e/dψ / n_e + dT_e/dψ)
ω_*i = -(2π/(z_i·χ₁))·(T_i·dn_e/dψ / n_e + dT_i/dψ)
(the main-ion density is taken equal to n_e by quasi-neutrality,
matching the gpeckf staging convention and Fortran's kin%f(1)
after read_kin). Fortran's elementary-charge `e` cancels when
T_e, T_i are in eV and dT/dψ is scaled by e, giving the
equivalent form above. Setting `compute_omega_star=false`
preserves the legacy behaviour where `profiles.omega_e` and
`profiles.omega_i` are used as-is (for backward compatibility).
### `Riccati.jl::solve_inner(::SLAYERModel{:fitzpatrick})`
Replaced `Q_c = ComplexF64(Q)` (raw pass-through) with the Wick-
rotation+conjugate:
Q_c = im * conj(ComplexF64(Q))
Fortran `slayer/growthrates.f:337,340` applies `g_tmp = q_in * ifac`
with `ifac = (0, +1)` (from `sglobal.f:105`). The algebraically
natural Julia port would be `Q_c = Q * im`, but empirically that
gives `Julia_det(Q) = Fortran_det(-Q)` (180° rotation), and
`Q_c = Q * (-im)` gives `Julia_det(Q) = Fortran_det(-conj(Q))`
(Im-axis reflection). The form `im * conj(Q)` substitutes into
Julia's Riccati so that `-conj(Q_c) = im·Q` — matching Fortran's
internal `g_tmp` — and yields identity. Root cause of the residual
Im-axis reflection in Julia's Riccati (suspected: branch selector
in `_riccati_f_initial` large-D vs small-D regime, or in the
asymptotic `W_bound` sign convention) is not yet identified and
is tracked in `~/Desktop/plasma/CTM-processing/CONVENTIONS.md`
§4 TODO. Once found, `Q_c = Q * im` should be restored to match
Fortran's `ifac` literally.
### Upstream fixes that unblocked this
Prior attempts to resolve Julia↔Fortran SLAYER disagreement stalled
on three issues that this patch exposes and resolves cleanly:
1. `equil.config.b0exp` (which the benchmark driver was passing
as `bt`) is a TOML normalization constant (default 1.0, user-
set 2.0), **not** the geqdsk BCENTR. With `bt` now acceptable
as a scalar kwarg, the benchmark driver feeds the geqdsk
BCENTR literal directly; τ_k J/F ratio went from 5.12×
(ε=0.001) / 21.5× (β=0.1) to 1.0009 / 1.0070.
2. `equil.ro` is the GS solver-found axis R, not the geqdsk
RMAXIS header value. The new `R0` kwarg lets the driver
pass the literal so both codes use the same axis reference.
3. Julia's `surface_minor_radius(..., theta=0)` is outboard-
midplane only, not flux-surface-averaged. Fortran STRIDE's
`a_surf` IS flux-surface-averaged. The new `rs_method=:fsa`
aligns the conventions.
After these three plus the Wick-rotation+conjugate, all SLAYER
normalized params agree sub-percent across both test cases and
the coupled-dispersion panels are pixel-level identical between
Julia and Fortran.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…erkin scratch buffers
Two performance-motivated changes that came out of the
julia_vs_fortran benchmark work. Both preserve numerical output
exactly (no behaviour change beyond thread-scheduling nondeterminism
in the residual evaluations, and even that is serialised before
cache insertion so the final result set is deterministic).
### `ContourSearchAMR.jl::amr_scan`
Added `parallel = Threads.nthreads() > 1` kwarg and a bulk-eval
helper `_bulk_eval_into_cache!` that:
* partitions the set of Q-values needed this phase into
already-cached vs new (keeps uniqueness),
* evaluates all new points via `Threads.@threads` when
`parallel=true` and more than one Julia thread is available,
* pushes the results into the shared `Dict{ComplexF64,ComplexF64}`
cache serially afterwards so no Dict data races occur.
Used in both the initial nre0 × nim0 coarse-grid phase and in each
refinement pass. The per-call evaluation of `f` (typically a
`MultiSurfaceCoupling` or `MultiSurfaceCouplingFortran` closure) is
thread-safe because each invocation constructs its own per-surface
solver state — the only shared mutable state is the cache, which
the helper handles serially. Deterministic output regardless of
thread count.
On the 100×100 + 4-pass benchmark scan this cut Julia SLAYER AMR
from ~60s to ~15s on an Apple M2 Max (8 threads).
### `GGJ/Galerkin.jl::GalerkinWorkspace` + `_assemble_and_solve!`
Added five preallocated scratch buffers to `GalerkinWorkspace`
(`cell_mat_buf`, `cell_mat_ext_buf`, `cell_rhs_ext_buf`, `ab_buf`,
`rhs_buf`) sized to the max case (`np+1=4`) used at any cell type,
and re-use them via `fill!(buf, 0)` inside the per-cell loop.
Previously each cell called `zeros(ComplexF64, ...)` which
accumulated thousands of MiB of allocations over a full dispersion
scan.
Same numerical output; the cell-matrix sub-slices are explicitly
zeroed before use and smaller cells (e.g. `CT_EXT` with
`cell.np=1`) rely on the remaining buffer elements staying zero
from the previous `fill!` call.
Measured on the TJ ε=0.001 benchmark (nx=256, cutoff=20, tol_res=1e-7,
msing=2): Galerkin det evaluation dropped from ~4.2 MiB allocs / call
to ~30 kiB / call, with a corresponding 20-25% wall-time reduction
in the GGJ AMR scan.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ar coupled residual
`MultiSurfaceCouplingFortran` (aka the 4m×4m Pletzer-Dewar tearing+
interchange dispersion matrix, port of Fortran `rmatch/match.f::match_delta`
fulldomain=0 branch) was adding `+ sc.dc` to BOTH the inner-layer
interchange and tearing Δ channels before assembling the coupled matching
block:
# CoupledFortranMatch.jl, before:
delta1 = resp.interchange * sc.scale + sc.dc # WRONG
delta2 = resp.tearing * sc.scale + sc.dc # WRONG
The code comment claimed this was "per the Fortran convention (χ_parallel
shift that acts on the outer diagonal before matching)." That is NOT in
Fortran — `match.f:508-519` assembles the fulldomain=0 block directly from
the raw `delta1 = deltar(ising, 1)` / `delta2 = deltar(ising, 2)` with no
Δ_crit offset anywhere:
! Fortran match.f (fulldomain=0):
delta1 = deltar(ising, 1)
delta2 = deltar(ising, 2)
mat(idx3, idx3) = -delta1
mat(idx3, idx4) = delta2
mat(idx4, idx3) = -delta1
mat(idx4, idx4) = -delta2
The Δ_crit proxy represents a slab-layer χ_parallel-matching correction
and is meaningful only for tearing-only models like SLAYER (which drops
the interchange channel and needs a proxy for the missing Glasser/
Mercier stabilization). GGJ's 4m×4m Pletzer-Dewar matching already
includes the interchange channel explicitly (`resp.interchange`), so
adding `sc.dc` double-counts that physics.
### Fix
1. `CoupledFortranMatch.jl:179-180`: drop `+ sc.dc` on both channels.
delta1 / delta2 are now the raw inner-layer outputs, matching
match.f:508-519 bit-for-bit.
2. `SurfaceCoupling.jl`: remove the `dc::Real=0.0` kwarg from
`surface_coupling(model::GGJModel, ...)`. The SLAYER and generic
overloads still accept it — SLAYER genuinely needs it for its
slab-layer Δ_crit subtraction. The `SurfaceCoupling.dc` struct field
is hard-wired to 0 for GGJ callers, making the API reflect the
physics.
### Tests
- `test/runtests_dispersion_coupled.jl`: 42 / 42 pass
- `test/runtests_dispersion_residual.jl`: 20 / 20 pass
(Both test files construct `surface_coupling(GGJModel, ...)` with
positional args only — no call sites broken.)
### Impact
For the julia_vs_fortran benchmark, this is a no-op when the driver was
already passing `dc=0.0` for GGJ (the safe default we settled on earlier
in the session). The fix prevents the footgun of anyone else accidentally
passing a nonzero `dc` to a GGJ coupling and getting physically wrong
results.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… and v1
GGJ:
- LayerInputs.jl: changed `v1 = 1.0` placeholder to
`v1 = rg.v1_local / equil.params.volume`. This is the dV/dψ
normalization that `rescale_delta` consumes as `v1^(2*p1)` to
convert raw Galerkin Δ to outer-region matching units. Matches
Fortran resist.f:144 (`sing%restype%v1 = v1/volume`) and match.f:1078
(`deltar = deltar * sfac**(2*p1/3) * v1**(2*p1)`). Previously, on
realistic shaped equilibria where v1_local/volume != 1, Julia's GGJ
Δ disagreed with Fortran by `(v1_local/volume)^(2*p1)`. Analytical
TJ/Solovev cases hid the bug because v1_local/volume happens to
hover near unity there.
SLAYER:
- LayerInputs.jl: changed `dr_val = 0.0` default to `dr_val = nothing`.
When `nothing` is passed, build_slayer_inputs auto-derives the
per-surface resistive interchange index `D_R = E + F + H²` from
`sing.restype` (already populated by `resist_eval_all!`). Without
this, the slayer_panels benchmark driver was reading a scalar
dr_val=-0.1 from a Fortran namelist and applying it uniformly to
every surface, producing dc_tmp values that didn't match Fortran's
per-surface STRIDE-derived values. With `nothing` default, dc_type
in {:lar, :rfitzp, :toroidal} now produces a non-zero per-surface
dc_tmp without manual configuration. dgeo_val behaves analogously
but errors clearly if dc_type=:toroidal is requested without an
explicit value (auto-derive needs ⟨|∇ψ|²⟩ FSA which isn't yet
exposed in ResistGeometry — TODO).
NOTE on Fortran/STRIDE divergence: Julia uses D_R correctly per
Connor-Hastie-Helander 2015 (PPCF 57 065001) Eq. 59. Fortran STRIDE
has a one-character bug in stride_netcdf.f:100 — `dr_rationals(i) =
locstab%f(1)/respsi` uses index 1 (= D_I, the Mercier criterion)
instead of index 2 (= D_R, the resistive interchange). Julia and
Fortran will therefore disagree on dc_tmp magnitude by ~D_I/D_R per
surface (~3-4× on DIII-D) until that upstream Fortran bug is fixed.
The disagreement is documented at the build_slayer_inputs docstring.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rowth_rates Adds `pole_threshold_adaptive::Bool = false` to SLAYERControl. When true, `run_slayer_from_inputs` overrides `control.pole_threshold` per scan with `|mean(Δ)|` over the dispersion-residual array before calling `find_growth_rates`. Backward-compatible (default false uses the literal `pole_threshold`). Justification: the hardcoded default `pole_threshold=10.0` is too restrictive when |Δ| spans 8+ orders of magnitude (typical for SLAYER coupled-dispersion scans). All intersections then get classified as poles and zero roots are returned. The adaptive recipe — empirically matching the Python `10·median(|Δ|)` heuristic and the omfit `|mean(Deltas_AMR)|` recipe — yields the correct root identification on the DIIID benchmark and TJ βₚ scan cases (verified at βₚ=0.1 coupled_rfitzp: 6 roots / 8 poles vs 0 roots with the static threshold). Plumbing changes: - Control.jl: new field + docstring - HDF5Output.jl: written to /slayer/settings/pole_threshold_adaptive - run_slayer.jl: `_pole_threshold_for(scan)` closure dispatches per-scan - Runner.jl: import Statistics.mean
… default 1 (serial) eliminates DIII-D 147131 thread-race The parallel BVP path in `parallel_eulerlagrange_integration` was always invoking `Threads.@threads :static` over the FM chunks, ignoring the `parallel_threads` field on `ForceFreeStatesControl`. On numerically delicate equilibria (e.g. DIII-D 147131 at βₚ ≈ 0.07) this exposed a sub-tolerance nondeterminism: chunk crossings whose post-jump matrices depend on the order of independent FP operations across threads, producing intermittently divergent FM matrices and intermittent BVP failures. The algorithm is correct; the wall-time interleaving of parallel chunks was perturbing it within tolerance. Fix: * `Riccati.jl`: branch on `bvp_threads = clamp(parallel_threads, 1, julia_nthreads)`. `bvp_threads == 1` runs the chunks serially on the calling thread (race-free, bit-deterministic). Otherwise, the existing `:static` parallel path is used. * `ForceFreeStatesStructs.jl`: document `parallel_threads` semantics, default `1`, and the cost (~14% slower than 2-thread on DIII-D 147131 reference). Verified: with `parallel_threads = 1` (default) and `JULIA_NUM_THREADS = 2`, the DIII-D 147131 βₚ=0.07 reference Δ' diagonal matches CONVENTIONS.md §6 exactly: q=2: +7.92 - 0.03i q=3: -5.24 - 0.30i q=4: -40.20 + 209.91i q=5: +126.6 - 169.24i in 54.5 s wall (single 4-singular-surface coupled BVP). No regressions on TJ. Production scans should keep the default; users with robust equilibria and strict wall-time budgets can opt in to `parallel_threads > 1` knowing the trade-off. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…BVP speedup; bit-identical Δ' in 15-trial DIII-D 147131 sweep)
Empirical reliability sweep on DIII-D 147131 βₚ≈0.07 (5 trials at each of
parallel_threads ∈ {1, 2, 4}, JULIA_NUM_THREADS=4, post-JIT, single Julia
session) showed:
parallel_threads | wall (avg, single 4-singular-surface coupled BVP)
-----------------|-------------------------------------------------
1 (serial) | 9.25 s — bit-deterministic by construction
2 | 7.37 s — bit-identical Δ' in all 5 trials (+20.3%)
4 | 7.51 s — bit-identical Δ' in all 5 trials (+18.9%)
Δ′ diagonals were bit-identical across all 15 trials and matched the §6
reference values exactly. Speedup saturates at 2 threads — the BVP has
~10 FM chunks, so 2 threads is enough to amortize them; 4 adds scheduling
overhead with no benefit on this BVP.
Bumping default to 2 captures the ~20% wall-time win on production scans.
The serial path remains available (`parallel_threads = 1`) as a deterministic
fallback if the historical intermittent race re-manifests on a delicate
equilibrium. Documentation in `ForceFreeStatesControl` docstring updated to
record the trade-off and the empirical reliability data.
Use `parallel_threads = 1` (NOT `use_parallel = false`) if a parallel run
ever diverges — `use_parallel = false` produces silently wrong Δ' values
(see CONVENTIONS.md §7).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…aster)
The Fitzpatrick `riccati_f` ODE is a 1-equation system. The prior code
modeled `W` as a 1-element `Vector{ComplexF64}` with an in-place RHS
(`_riccati_f_rhs!(dW, W, params, x)`); every Rosenbrock stage allocated
fresh `dW` intermediates. Converting `W` to a `ComplexF64` scalar with an
out-of-place RHS removes those per-stage heap allocations and lets stage
updates stay on the stack.
Per-call benchmark (1000 calls, Rodas5P, identical inputs):
vector form: 1.62 ms / call
scalar form: 0.96 ms / call (41% faster)
Signature changes:
_riccati_f_rhs!(dW, W, params, x) -> nothing
--> _riccati_f_rhs(W::Number, params, x) -> ComplexF64
_riccati_f_jac!(J, W, params, x) -> nothing
--> _riccati_f_jac(W::Number, params, x) -> ComplexF64
solve_inner ODE state:
u0 = ComplexF64[W_bound]; ODEFunction{true}(...)
--> u0 = ComplexF64(W_bound); ODEFunction{false}(...)
Solver-agnostic. Rodas5P stays the default. The change works equally well
under any OrdinaryDiffEq stiff solver (Rosenbrock / SDIRK / BDF) since
they all support scalar `u0` via the out-of-place form.
Validation (against the temporary baseline at SLAYER_coupling_paper/
regression_temporary/, 88 TJ records frozen pre-change):
TJ uncoupled_2over1_rfitzp at βₚ=0.001
γ baseline = +4.0552247503e+00 kHz
γ scalar = +4.0551819762e+00 kHz
relative drift = 1.05e-5 (within solver-replacement noise)
TJ coupled_rfitzp at βₚ=0.07 (exercises full BVP path)
γ baseline = -8.1071602485e-03 kHz
γ scalar = -8.1071881463e-03 kHz
relative drift = 3.44e-6
n_valid_roots = 26, n_poles = 27 (exact match to baseline topology)
check_regression.py --dry --scope tj : 88/88 pass (5e-4 abs/rel
tolerance on integrator outputs, exact match on topology fields).
Production wall-time on the coupled-BVP case:
baseline (vector form): ~14 min (slowest of 4 parallel cases per βₚ)
scalar form: ~10 min (~29% reduction)
In contrast to the prior KenCarp4 solver-swap attempt (commit 5a9026a8,
reverted as 2b1e1b0f), which looked like a 38% per-call win in synthetic
tests but came out 17% SLOWER in production, this change shows consistent
gains from per-call benchmark through to full production scan. The reason
the wins translate cleanly: the scalar form makes the existing solver
faster without changing its convergence path or step-control behaviour,
so production characteristics scale linearly from the micro-benchmark.
The companion KenCarp4 swap stays deferred (tracked in todos) until we
have direct production-side per-Q timing instrumentation to understand
the bench/production discrepancy.
Test infrastructure also committed:
profiling/profile_slayer_amr.jl CPU + alloc profile harness
profiling/test_riccati_solver_convergence.jl 7-solver convergence sweep
…tring Empirical finding from Phase 2.5 of the AMR speedup work: sub-percent floating-point differences between ODE solvers cascade through the AMR's zero-crossing flagging and produce structurally different cell trees, not just numerically-noisy Δ values. Concrete observation on TJ coupled_rfitzp at βₚ=0.07 under the scalar ODE form (commit b17e0b43): Solver SLAYER wall γ valid_roots poles Rodas5P ~10 min -8.107e-3 kHz 26 27 KenCarp4 ~9 min -8.107e-3 kHz 43 34 KenCarp4 is per-call faster (consistent with the convergence-test results), but its slightly different Δ at AMR cell corners flips many "refine" / "no-refine" decisions and lands on a substantially different final cell list. The most-unstable root (γ) agrees to 2.1e-5 relative, but the inventory of secondary roots and poles differs by ~17 / ~7. Implication: solver swaps are NOT pure per-call optimizations. Future attempts need to be validated against the topology fields (`n_valid_roots`, `n_poles`), not just γ. The temporary regression harness at SLAYER_coupling_paper/regression_temporary/check_regression.py already treats these as exact-match fields, which correctly gates solver swaps. The 92-record baseline serves as a topology fingerprint.
…30% additional per-call speedup)
The Fitzpatrick `riccati_f` ODE coefficients fA, fA', fB, fC use parameters
(Q, Q_e, Q_i, P_perp, P_tor, D_norm, iota_e) that are CONSTANT across the
integration. The prior code recomputed `Q*(Q+iQi)`, `Q+iQe`, `D²·iota_e⁻¹`
etc. at every RHS evaluation — tens of thousands of redundant multiplications
per `solve_inner` call.
This commit lifts the x-independent quantities into a `_RiccatiConsts`
struct built once per `solve_inner` call:
Q_plus_iQe constant part of denom = (Q + iQe + x²)
A = Q · (Q + iQi) fB constant term
B = (Q + iQi)·(P_perp + P_tor) fB · x² coefficient
C = P_perp · P_tor fB · x⁴ coefficient
E = P_perp + (Q + iQi)·D² fC · x² coefficient
G = P_tor · D² / iota_e fC · x⁴ coefficient
The hot RHS (`_riccati_f_rhs`) and Jacobian (`_riccati_f_jac`) now access
only the bundled constants and `x`, doing ~3 muls + 1 division per call
instead of ~10 muls + 2 divisions.
Per-call benchmark (1000 calls, Rodas5P, identical inputs):
prior (scalar form, post b17e0b43): 0.96 ms / call
precompute (this commit): 0.67 ms / call (-30% per call)
cumulative vs vector-form baseline: 1.62 → 0.67 ms (-59%, 2.42× faster)
Validation against the temporary baseline at SLAYER_coupling_paper/
regression_temporary/:
TJ coupled_rfitzp at βₚ=0.07 (full BVP path)
γ baseline = -8.1071602485e-03 kHz
γ precompute = -8.1071881463e-03 kHz
relative drift = 3.44e-6 (same as scalar-only Phase 2.3 baseline)
n_valid_roots = 26, n_poles = 27 (exact match to baseline topology)
check_regression.py --dry --scope tj : 88/88 pass
Production wall on TJ coupled_rfitzp at βₚ=0.07:
vector-form baseline: ~14 min
scalar form (Phase 2.3): ~10 min
scalar + precompute: ~9 min (~36% cumulative reduction)
The active SLAYER step alone is now ~41% faster than baseline. Production
wall scales sub-linearly because main() / find_growth_rates / file-write
overheads remain unchanged.
Implementation note — algebraic simplification rejected:
A natural further optimization is `fA' = 1 − 2·fA` (algebraic identity:
(denom − 2p²)/denom = 1 − 2·(p²/denom) = 1 − 2·fA). It saves one complex
division per call. However, when tested, the integrator's adaptive
stepping near marginal stability compounded ULP-level differences in fA'
across thousands of steps, producing ~3e-3 relative γ drift versus this
form's 3e-6. The drift was within the regression's abs-tolerance gate but
still a real precision regression. Reverted — kept the explicit
`(denom − 2·p²)/denom` form, which preserves bit-identical Δ at warm
benchmark points vs the scalar-form baseline.
…tion kwargs
Two additive kwargs to support convergence-vs-resolution studies and
graceful behaviour when the cell-count safety rail is hit:
snapshot_callback::Union{Nothing,Function} = nothing
If provided, called at the end of each AMR pass (and once for the
initial grid, pass=0) with arguments
(pass::Int, cells::Vector{AMRCell}, cache::Dict{ComplexF64,ComplexF64}).
The callback receives live references; copy if persistence is needed.
Used by convergence studies to extract intermediate γ at each pass
count from a SINGLE AMR run (avoids re-running for every target pass).
max_cells_action::Symbol = :error
:error (default, prior behaviour) raises when length(cells) > max_cells.
:warn_truncate logs a @warn, stops further refinement in the current
pass, and exits the outer pass loop — leaving a usable AMRResult with
the partial cell tree. Useful for resolution-sweep studies that
deliberately push max_cells to bound runtime.
Backward compatibility: defaults preserve the exact prior behaviour.
Validated via regression rerun of TJ coupled_rfitzp at βₚ=0.07
(88/88 pass, γ + topology bit-identical to pre-change baseline).
… passes) study Driver for the Phase 2.8 convergence study, sweeping AMR initial-grid resolution and refinement-pass counts to identify the cheapest (nre0, passes) tuple that hits a γ-convergence target. Uses the new `snapshot_callback` kwarg (commit f59dcaee) so a SINGLE AMR run captures γ at every intermediate pass count — avoiding 4× the runs that re-running per pass would require. Sweep on TJ coupled_rfitzp at βₚ=0.07, three SLAYER configurations on the same equilibrium (q=2 uncoupled, q=3 uncoupled, full coupled), Q_HW=±25 kHz, max_cells=1M with `:warn_truncate` graceful early-stop: case γ_ref(200,5) min (nre0, pass) AMR wall uncoupled_2over1 -0.03793 kHz (25, 4) 40 s uncoupled_3over1 -0.13069 kHz (25, 3) 46 s coupled -0.00816 kHz (25, 5) 187 s Convergence target: |γ - γ_ref| < max(5e-5, 0.005·|γ_ref|). Key finding: AMR wall scales primarily with INITIAL grid size (nre0²), not pass count. The (25, 8) config is FASTER than (200, 5) — starting from a coarse grid and refining further is cheaper than starting fine and stopping sooner, because per-pass work scales with the current cell count which grows from a smaller base. Recommendation for production defaults: uncoupled (any): nre0 = 25, max_passes = 4 coupled: nre0 = 25, max_passes = 5 Compared to current production defaults (nre0=100, passes=4-5), this gives an additional ~10-20% wall reduction on top of the per-call optimizations from Phase 2.3 / Phase 2.7. Plots committed externally: /tmp/convergence_curves.png γ vs pass per case (4 nre0 lines) /tmp/convergence_resolution.png γ at max_pass vs nre0 (3 case lines)
…creen for active boxes
Adds `multi_box_amr_scan` to ContourSearchAMR.jl: run `amr_scan` over multiple
Q-plane boxes with a coarse pre-screen step that skips inactive boxes
entirely. Motivated by the three-stripe ω-axis scan for SLAYER coupled
dispersion: rather than refining one wide ±75 kHz × ±25 kHz box, we split
into three 50 kHz × 50 kHz stripes (centred on the γ=0 axis) and only run
the AMR on stripes that show activity.
A box is flagged ACTIVE if any pre-screen cell satisfies AT LEAST ONE of:
- sign change in Re(Δ) across its 4 corners (zero-isoline of Re(Δ) crosses
the cell — root candidate);
- sign change in Im(Δ) across its 4 corners (root candidate);
- any corner with |Δ| ≥ pole_magnitude_threshold (likely pole — sign-only
criteria miss tight poles whose fringe doesn't straddle a corner).
The pole-magnitude criterion is essential for catching poles tucked inside a
pre-screen cell that happens to sample the same sign-lobe at all four corners.
Default pre-screen resolution is 25×25, matching the typical AMR initial
grid — coarser misses small features; finer wastes evaluations on inactive
boxes.
Adds:
- `BoxActivity` enum (`NoActivity`, `ReZeroCrossing`, `ImZeroCrossing`,
`PoleMagnitude`)
- `_check_box_activity` helper (returns first satisfied criterion)
- `MultiBoxAMRResult` struct (per-box `AMRResult` + aggregated
cells/Q/Δ + per-box activity reasons + pre-screen eval count)
- `multi_box_amr_scan(f, boxes; pole_magnitude_threshold, ...)`
- `as_amr_result(::MultiBoxAMRResult) -> AMRResult` for direct
consumption by `find_growth_rates`
Tests added in test/runtests_dispersion_amr.jl (3 new testsets, 19 @test
calls covering: 3-box stripe with zero/pole/empty boxes, sharp-pole
synthetic exercising the magnitude criterion, argument validation).
49/49 dispersion-AMR tests pass.
TODO follow-ups:
- Thread a shared cache through `amr_scan` so pre-screen evals aren't
re-evaluated by the per-box AMR initial pass on active boxes (saves
~676 redundant evals per active box).
- Wire into the SLAYER driver (`Tearing.Runner`) so the user-facing
betascan/diiid/etc. drivers can opt into multi-box layouts without
manual pole_magnitude_threshold tuning.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… via concavity + γ-gap, with secondary-root fallback
The existing `filter_outside_re` gate only triggered when the Re(Δ)=0 contour
was approximately closed at the candidate intersection (closure_gap < 10% of
contour extent). On scans where the spurious upper-branch root sits at the
edge of the Q box (so the Re=0 contour exits the box and is not closed at the
candidate), the gate fell open and the spurious high-γ root was selected as
"least-stable" — producing γ values that visibly exceed the physical eigenmode
cluster (observed on coupled DIII-D 147131 where the algorithm selected
γ=+18.6 kHz instead of the physical γ≈+0.4 kHz).
Adds two new geometric/algorithmic checks that do NOT depend on the Re=0
contour being closed:
- `:geom`: Re(Δ)=0 is locally downward-concave at the candidate AND the
Im(Δ)=0 tangent at the candidate exits at angle > `angle_threshold_deg`
from horizontal (default 45°). The concavity test uses a turn-direction
cross product that's invariant under polyline traversal direction.
- `:gap`: the candidate is unstable (γ > 0) AND its γ exceeds the next
candidate's γ by more than `gap_kHz_threshold` kHz (default 1.0). Flags
"isolated peak" outliers.
Combined into a recursive selection rule (per the user's spec):
- 0 flags → accept candidate as primary, no warning
- 1 flag → accept candidate as primary, raise warning, expose next-down
root as `Q_root_secondary` for downstream review
- 2 flags → reject candidate, recurse into next-most-unstable root
Extends `GrowthRateResult` with `Q_root_secondary` (`ComplexF64`),
`omega_Hz_secondary`, `gamma_Hz_secondary`, and `warning_flags::Vector{Symbol}`.
The legacy `valid_roots`/`poles`/`filtered_roots` fields are unchanged.
New kwargs on the public `find_growth_rates(::ScanResult|::AMRResult)`:
`gap_kHz_threshold=1.0`, `angle_threshold_deg=45.0`. Defaults preserve
behaviour on cases where neither flag fires (verified against existing test
suite — 49/49 dispersion-AMR tests still pass, 33/33 dispersion-scan,
20/20 dispersion-residual).
Empirical validation (rendered side-by-side contour plots saved separately):
DIII-D 147131 uncoupled q=4:
primary γ=-4.540 kHz no warnings ✓ (clean case unchanged)
DIII-D 147131 coupled (msing=4):
primary γ=+18.630 kHz ⚠ [:gap] → secondary γ=+0.418 kHz exposed
The +18.6 root is a spurious high-γ outlier (Re=0 contour exits the
γ=+25 kHz box edge, so the legacy outside_re gate falls open). The
new `:gap` check catches it (Δγ from next root = 18.2 kHz >> 1 kHz)
and surfaces the physical +0.42 root as the secondary — matching
visual inspection of the contour plot.
The geom check did not fire on the coupled DIII-D case (Re=0 geometry near
the +18.6 candidate is more vertical than concave-down on this triangulated
AMR mesh). That's the by-design behaviour: a single flag still leaves the
primary as primary, with the secondary surfaced for the operator to
review. A test case that exercises the concavity path is a TODO.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… + density flag (3-of-N spurious-root recursion)
Refines the spurious-root detection in `_run_analysis` based on validation
against the DIII-D 147131 coupled case. Two algorithmic improvements:
1. **Polyline-walk concavity (replaces 3-vertex stencil)**
The previous geom check used only the 3 vertices immediately adjacent
to the candidate's closest Re=0 vertex. On AMR-triangulated meshes the
Re=0 contour is fragmented into ~10⁴ short polylines, so 3 consecutive
vertices span a single segment — local turn-direction noise dominates
the macroscopic shape and the test failed to fire on cases the user
could clearly identify as "downward-concave hills" by eye.
New `_is_geom_spurious` walks outward from the closest Re=0 vertex
along the actual polyline, collecting consecutive vertices within
`max_walk` Q-distance of the candidate. It then fits a local quadratic
γ = a + b·Δω + c·Δω² and reports `c < 0` (concave-down hill).
Crucially, the test gates on FIT QUALITY: only flags when the RMS
residual / γ_spread is below `quality_threshold` (default 0.15),
so noisy / multi-feature regions correctly produce no flag.
Verified on the DIII-D 147131 coupled HDF5: at the spurious +18.6
candidate, the polyline walk at max_walk=0.5 Q gives c=-4.96 with
RMS/γ_sp=0.10 → CLEANLY flags spurious; at the legitimate +0.41
candidate the fit is noisy (RMS/γ_sp=0.33) so no flag is raised.
2. **Density flag (`:density`) — clustering as a green-flag for validity**
New `_is_density_isolated` counts other valid roots within
`density_radius_Q` of each candidate. Spurious high-γ outliers tend
to be isolated in Q-space; legitimate coupled-tearing roots cluster
densely in the resonant region. Disabled when `n_total < 5` (the
user's clustering heuristic only carries signal when there's a
cluster baseline to be missing from — uncoupled cases with 1-3
total roots would otherwise spuriously fire on every candidate).
3. **Recursion rule extended to 3-flag voting**
`:geom` + `:gap` + `:density`: discard candidate if 2+ flags fire,
else accept as primary with single-flag warning recorded.
Empirical outcome on existing HDF5s (re-extracted via /tmp/reextract_all.jl):
DIII-D 147131 uncoupled q=4 (n_roots=3, density auto-disabled):
primary γ=-4.540 kHz warn=[:geom] γ_2nd=-5.557 kHz
Same physical primary as before, with a single geom warning surfacing
a nearby root for review. (The geom flag firing here is borderline —
the local Re=0 fit happens to land concave-down on the AMR mesh
even though the global structure is well-like; the recursion
correctly keeps it as primary because it's the only flag.)
DIII-D 147131 coupled (n_roots=37):
primary γ=+0.411 kHz warn=[:density] γ_2nd=-0.481 kHz
The spurious +18.6 root is now correctly DISCARDED by the recursion
(it accumulates 2+ flags from {geom, gap, density}). The +0.41
root that was previously surfaced only as `secondary` is now the
primary. This brings `filter_outside_re=true` (default) and
`filter_outside_re=false` to the same answer on coupled DIII-D —
the new geom + density logic obviates the need to manually toggle
the legacy gate.
New kwargs on the public `find_growth_rates(::ScanResult|::AMRResult)`:
`density_radius_Q=0.5`, `min_neighbors=2`. Defaults are conservative —
density only fires when truly isolated.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…geom + :gap (back to 2-flag recursion)
The user flagged that :gap and :density could both falsely fire on a
legitimate isolated mode (e.g. an uncoupled case with one dominant unstable
root and one stable root separated by > 1 kHz), causing the recursion to
incorrectly discard the right answer. Removed:
- `_is_density_isolated` helper
- `density_radius_Q`, `min_neighbors` kwargs (from public + private API)
- the per-candidate density check in `_run_analysis`
Recursion rule reverts to the simpler "discard if BOTH :geom and :gap fire"
(which on the validation cases is sufficient to catch the +18.6 kHz
spurious in DIII-D 147131 coupled — the polyline-walk concavity fix from
3dd65e83 cleanly fires :geom on that candidate, and the >1 kHz γ-gap
fires :gap, so both flags accumulate and the recursion discards it).
Empirical re-extraction (without density):
DIII-D 147131 uncoupled q=4 (n_roots=3):
primary γ=-4.540 kHz warn=[:geom] γ_2nd=-5.557 kHz
Same as before — the lone :geom warning is informational; the
primary is correctly the legitimate root.
DIII-D 147131 coupled (n_roots=37-38):
primary γ=+0.411 kHz warn=[] γ_2nd=NaN (no warnings — clean!)
The +18.6 spurious is still correctly DISCARDED by [geom + gap]
both firing. The legitimate +0.41 root is now reported with NO
warnings — cleaner than the [:density] warning we previously
surfaced. Better signal-to-noise: a warning now means
"geometrically suspicious AND isolated peak", which is a strong
signal worth alerting on.
Tests still 102/102 passing across runtests_dispersion_{amr,scan,residual}.jl.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…d_deg parameter + cleanup The `angle_threshold_deg` kwarg was a leftover from the earlier `_is_geom_spurious` formulation that combined "Re=0 concave-down + Im=0 exit angle > 45°" into a single test. After the polyline-walk refactor (e97225c) the concavity check became standalone (with its own RMS-residual quality gate), and the angle term was no longer consulted — but the parameter was still plumbed through every API layer. Removes the parameter + its docstring + every plumb-through site: - Public `find_growth_rates(::ScanResult, ::Real; …)` and `(::AMRResult, …)` - Private `_extract_growth_rates`, `_extract_growth_rates_amr`, `_run_analysis` - `_is_geom_spurious(pt, re_paths)` now takes only what it actually uses (no more `im_paths` or `angle_threshold_deg` placeholders) Also drops the dead-code-removal comment about `_is_density_isolated` — the explanation lives in the commit message of 4c6fbe3 (which removed it). The file is now clean of historical references to features that no longer exist. Tests still 102/102 across runtests_dispersion_{amr,scan,residual}.jl. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ole_threshold + gap_kHz_threshold plumbing
Three production-default improvements informed by the DIII-D 147131 + TJ
betascan validation work:
1. **Pole threshold default → 10 × median(|Δ|)** (was `|mean(Δ)|`)
The mean-of-complex-residuals collapses on oscillating dispersions
whose phases cancel in the complex sum (saw 226 vs 439 on DIII-D
coupled), and is also inflated by single near-pole pre-screen samples.
`10 × median(|Δ|)` reflects "10× the typical residual magnitude" and
is robust to both pathologies. Applied in `_pole_threshold_for` inside
`run_slayer.jl`. Old behaviour was the only code path; new default is
strictly an improvement on the validation cases.
2. **`SLAYERControl.boxes`** — multi-box stripe scan field (default empty).
When non-empty, `_run_scan` dispatches to `multi_box_amr_scan` instead
of single-box `amr_scan`. Each entry is `(omega_lo, omega_hi, gamma_lo,
gamma_hi)` in dimensionless Q-units. Activity criteria use
`pole_magnitude_threshold = 10 × median(|Δ|)` derived from a coarse
16×6 sample of the union of all boxes (matches the
validate_multi_box.jl driver). `multi_box_prescreen_n=25` controls the
per-box pre-screen grid resolution. Backward-compatible — production
scans that don't set `boxes` see the existing single-box behaviour.
3. **`SLAYERControl.gap_kHz_threshold`** — exposed (default 1.0 kHz) and
forwarded to the new `find_growth_rates` `:gap` flag. Lets per-case
TOML configs tune the spurious-isolated-peak threshold without code
changes.
Tests: 49+33+20+61 = 163 pass across runtests_dispersion_{amr,scan,residual}.jl
+ runtests_slayer_runner.jl.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…dge artifacts
direct_position! used Roots.Brent() on the full (axis, rmin) and (axis, rmax)
brackets to locate the inboard/outboard LCFS positions. Brent requires
opposite-sign endpoints — fine for diverted equilibria where renormalized
ψ stays negative from the LCFS out to the (R, Z) box edges.
Fixed-boundary equilibria (e.g. TokaMaker free/fixed-boundary geqdsk output)
violate this assumption: ψ outside the LCFS can have a thin spurious-
extrapolation ring near the box edge where it re-crosses zero, leaving the
(axis, rmin) and (axis, rmax) brackets with same-sign endpoints. Brent then
raises "ArgumentError: The interval [a,b] is not a bracketing interval"
even though the physical LCFS DOES exist inside the bracket.
Fix: pre-scan ψ outward from the magnetic axis with n_scan=200 uniform steps
and locate the FIRST sign change, then run Brent on that sub-bracket. The
first crossing from the axis is the physical LCFS, so behavior is identical
to before on diverted equilibria but robust to fixed-boundary edge artifacts.
Errors with a clear message if no sign change is found in the scan window.
Verified:
- 79/79 q95 TokaMaker fixed-boundary geqdsks load (previously all failed
on the inboard bracket)
- DIII-D 147131 diverted X-point still loads unchanged
- shaped_beta_scan synthetic geqdsks still load unchanged
- SLAYER_coupling_paper/coupled_deltacrit_q95scan full-pipeline smoke test
(coupled_n=1 with rfitzp Δ_crit, pc=1.001) passes end-to-end through
GPEC main + Force-Free States BVP + SLAYER multi-stripe AMR
Empirical finding from the SPARC β-scan kink-approach diagnostics:
the geom + gap "spurious upper-branch" detector was too aggressive in
the kink-approach regime where valid roots become sparse (only 4-5
candidates per scan, 2-3 kHz γ separation between primary unstable and
next-stable roots). Concrete failure case:
shaped_beta_scan / coupled_n2_rfitzp / β_N=2.7502
valid root at (ω=−22.67, γ=+0.088) — flagged BOTH :geom and :gap
pre-2026-05-08: discarded → fell back to (+9.34, −2.596)
post-2026-05-08: warned but kept; chosen as primary (γ=+0.088)
Change in GrowthRateExtraction.jl: drop the discard branch when
both :geom and :gap fire. Always accept candidate, push warning(s)
to warning_flags, and let downstream tools (post-hoc smoothness
override in plot_betascan.py:apply_chooser_overrides) handle the
trajectory continuity check.
Empirical impact on the shaped_beta_scan / pubrun_050526:
- 7 of 8 affected (case, β_N) pairs now choose correctly without any
post-hoc override (chooser_overrides count: 9 → 2).
- 1 regression: 3/2 rfitzp at β_N=2.8501 — the previously-available
smooth-trend candidate (-21.4, -0.241) is no longer in valid_roots
on the new run (suspected pole reclassification at the unchanged
pole_threshold check that runs BEFORE the geom/gap check). Net
effect on the publication 4-panel γ figure: minimal (1 trace point
out of ~340 plotted).
Control.jl: minor parameter plumbing for the new policy.
Status: WIP — not yet validated on q95scan, IBS_AT_scan, or DIIID
benchmarks. Filtered_roots subgroup in HDF5 output now records
LEGACY-rejected roots only (the old above-pole + outside-Re branch);
geom/gap-warned roots appear in valid_roots with their flags.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds tearing-mode growth-rate analysis as a new
Tearingumbrella module undersrc/Tearing/, with three layers:SurfaceCoupling,MultiSurfaceCoupling,CoupledFull(2m×2m det(D′−D(γ))), andCoupledFortranMatchresiduals; AMR Q-plane scan + triangulation-based growth-rate extraction; spurious-root detection (geom + γ-gap + polyline concavity)Tearing.Runner.run_slayer), kinetic-profile loading, HDF5 output.Also includes the supporting work that landed alongside:
ForceFreeStatesviadelta_prime_raw/pest3_decomposeRiccati.jlsolver inForceFreeStatesfor the ideal plasma response (with ~30–40% perf work)Utilities:NeoclassicalResistivity,KineticProfiles,PhysicalConstantssolovev_slayer_n1,tj_epsilon_pole73 commits, ~+13.9k / −0.75k LOC across 92 files.
Relationship to
perf/riccatiThe inner-layer growth-rate solvers here consume the outer-region Δ′ produced by the Riccati-based ideal-MHD solver on
perf/riccati. This branch is designed to subsumeperf/riccationce that work merges: it currently shares a common ancestor (c6c845ff) and is 39 commits ahead but 11 commits behindperf/riccati. Plan is to merge the final state ofperf/riccatiin here once it's ready to merge and before this PR moves out of draft, so the final history includes the outer-region Δ′ improvements that the dispersion residuals depend on.Current status — WIP / draft
528062f8) changeschooser_overridesfrom discard-on-(geom ∧ gap) to warn-and-keep, which fixed 7/8 mis-chosen roots in the SPARC β-scan; not yet validated against q95-scan, IBS_AT-scan, or DIII-D 147131 benchmarks.valid_rootswith flags.perf/riccati.