Adds initial support for the Kamino solver#5214
Adds initial support for the Kamino solver#5214AntoineRichard merged 26 commits intoisaac-sim:developfrom
Conversation
Greptile SummaryThis PR adds initial support for the Kamino solver (a Proximal-ADMM based constrained rigid-body dynamics solver) to IsaacLab-Newton, including a Confidence Score: 5/5Safe to merge — no runtime defects found; the one previous blocking concern (dynamics_preconditioning) is fully resolved. All remaining findings are P2 style suggestions (dead config fields in two Ant env files). The core Kamino integration — config factory, deferred FK, CUDA graph warmup, and collision detector branching — is correct. direct/ant/ant_env_cfg.py and manager_based/classic/ant/ant_env_cfg.py have cosmetic dead fields worth cleaning up, but they do not affect runtime behaviour. Important Files Changed
Sequence DiagramsequenceDiagram
participant Env as RL Environment
participant NM as NewtonManager
participant SK as SolverKamino
participant Graph as CUDA Graph
Note over NM: initialize_solver()
NM->>SK: SolverKamino(model, cfg.to_solver_config())
NM->>Graph: capture _simulate_physics_only()
NM->>Graph: warmup wp.capture_launch() [pins lazy allocations]
Note over Env,NM: Episode reset
Env->>NM: write_joint_state / write_root_state
NM->>NM: invalidate_fk() → _kamino_needs_fk = True
Note over NM: step()
NM->>SK: _forward_kamino() [solver.reset() for FK]
NM->>NM: _kamino_needs_fk = False
NM->>Graph: wp.capture_launch(_graph)
Graph->>SK: solver.step(state_0, state_1, control, contacts, dt)
NM->>NM: _mark_transforms_dirty()
NM->>NM: sync_transforms_to_usd() [at render cadence]
Reviews (2): Last reviewed commit: "Move Kamino config mapping to KaminoSolv..." | Re-trigger Greptile |
| cls._needs_collision_pipeline = not use_kamino_collision_detector | ||
| else: | ||
| cls._needs_collision_pipeline = True |
There was a problem hiding this comment.
dynamics_preconditioning is not exposed in KaminoSolverCfg
cfg_dict.get("dynamics_preconditioning", True) silently falls back to True because KaminoSolverCfg has no dynamics_preconditioning field. Users cannot override this parameter. Either add the field to KaminoSolverCfg:
dynamics_preconditioning: bool = True
"""Whether to use preconditioning in the constrained dynamics solver."""or document that it is intentionally fixed at True.
There was a problem hiding this comment.
Review: Kamino Solver Integration
Solid integration of the Kamino P-ADMM solver. The architecture decisions are sound — the FK/reset split between forward() (generic eval_fk for immediate kinematics) and _forward_kamino() (full solver state reinit deferred to step()) is a clean design for maximal-coordinate solvers. The CUDA graph warm-up for Kamino's pool-allocated buffers is also well-handled in both the standard and deferred capture paths.
The Greptile bot flagged some issues from a previous revision (debug_mode crash, changelog version) that appear to already be fixed in the latest commit — nice.
Observations
1. Default tolerances vs preset tolerances (newton_manager_cfg.py) — KaminoSolverCfg defaults are 1e-6 for PADMM tolerances, but every shipped preset uses 1e-4. Users who create a bare KaminoSolverCfg() will get a solver that's 100× tighter than any tested configuration, potentially hitting the 200-iteration ceiling without converging. Consider either aligning defaults to 1e-4 or adding a note in the docstring.
2. padmm_use_graph_conditionals=False in all presets — This unrolls all padmm_max_iterations (100) iterations in every CUDA graph replay, even when convergence happens much earlier. Is Kamino's wp.capture_while path tested with CUDA graph capture? If so, enabling graph conditionals for at least some presets could meaningfully improve throughput.
3. Near-identical preset configs — All 7 Kamino presets share the same parameters (only sparse_jacobian and num_substeps differ). Consider extracting shared defaults into a factory function or constant dict to reduce duplication and ensure consistent updates when tuning:
def _kamino_preset(num_substeps=2, sparse_jacobian=False, **kw):
return NewtonCfg(
solver_cfg=KaminoSolverCfg(integrator="moreau", use_collision_detector=True,
sparse_jacobian=sparse_jacobian, constraints_alpha=0.1,
padmm_max_iterations=100, padmm_primal_tolerance=1e-4,
padmm_dual_tolerance=1e-4, padmm_compl_tolerance=1e-4,
padmm_rho_0=0.05, padmm_eta=1e-5, padmm_use_acceleration=True,
padmm_warmstart_mode="containers",
padmm_contact_warmstart_method="geom_pair_net_force",
padmm_use_graph_conditionals=False,
collision_detector_pipeline="unified",
collision_detector_max_contacts_per_pair=8, **kw),
num_substeps=num_substeps, debug_mode=False, use_cuda_graph=True)4. _forward_kamino() runs eagerly outside CUDA graph — Correct by design (resets aren't graphable), but frequent mid-episode resets will bypass the graph path. Consider adding a logger.debug in the _kamino_needs_fk branch so operators can diagnose unexpected performance drops from frequent resets.
5. get_solver_convergence_steps() remains MuJoCo-only — The call-site guard is correct, but if Kamino adds its own metrics API (referenced in comment), consider making this method dispatch by solver type rather than only guarding the caller.
Minor Nits
collision_detector_pipelinedocstring says"primitive"or"unified"— is"primitive"actually supported by Kamino'sCollisionDetectorConfig, or is that Newton-pipeline-only?KaminoSolverCfg.solver_typedocstring saysCan be "kamino"— aLiteral["kamino"]type annotation would enforce this at config validation time.
Overall this is clean, well-documented work. The config class docstrings are thorough, the CUDA graph handling covers both standard and deferred-RTX paths, and the reset logic correctly handles the maximal-coordinate solver's needs. 👍
fed32ea to
8abebd2
Compare
|
|
||
| # Kamino allocates internal state arrays (joint_q_prev, body_f_total, | ||
| # joint_lambdas, FK solver arrays) during graph capture via wp.clone/wp.zeros. | ||
| # These memory-pool allocations must be "pinned" by replaying the graph once | ||
| # before any non-graphed Kamino operations (e.g. solver.reset()) can safely | ||
| # access them. Without this warm-up, pool-allocated addresses may be stale. | ||
| if isinstance(cls._solver, SolverKamino): | ||
| wp.capture_launch(cls._graph) |
There was a problem hiding this comment.
@camevor can you check if this is correct? It looks like we call a solver reset before the first sim step which requires us to do a warm-up step. If we don't do the warm-up step it looks like not everything is initialized.
|
Tip: Greploop — Automatically fix all review issues by running Use the Greptile plugin for Claude Code to query reviews, search comments, and manage custom context directly from your terminal. |
| Kamino's ``joint_q`` / ``joint_u`` include coordinates for **all** joints | ||
| (including free joints), so we pass Newton's full state arrays directly. | ||
| """ | ||
| cls._solver.reset( |
There was a problem hiding this comment.
The full reset without world_mask will have a performance penalty. Kamino does a solver cold-start after a reset. We can think about that later though. Perhaps _kamino_needs_fk needs to be a per world thing?
There was a problem hiding this comment.
Thanks @rubengrandia! Agreed. I already had a chat about this with @camevor we'd have to rework the API a bit to make that happen though.
There was a problem hiding this comment.
There is a pending task/issue to ensure resets will invoke a world-masked cold/warm start so only reset worlds get cold-started. Also, I think we could add a graph-capturable mechanism to ensure that an all-zeros world mask could be used to skip unnecessary resets. I've created an issue for the second aspect (vastsoun/newton#343)
Add _world_reset_mask and _fk_reset_mask to NewtonManager. invalidate_fk() now accepts env_mask/env_ids + articulation_ids to accumulate per-environment dirty state. step() passes masks to eval_fk and solver.reset, then zeros them. Addresses review feedback on PR isaac-sim#5214 (Kamino full-reset performance). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| Kamino's ``joint_q`` / ``joint_u`` include coordinates for **all** joints | ||
| (including free joints), so we pass Newton's full state arrays directly. | ||
| """ | ||
| cls._solver.reset( |
| # Kamino: run solver.reset() with the accumulated world mask to reinitialise | ||
| # internal state (warm-start containers, constraint multipliers) for reset worlds. |
There was a problem hiding this comment.
@rubengrandia do you know if the call here is a no-op if _world_reset_mask is a mask full of zero?
There was a problem hiding this comment.
I think all of the kernels have an early return based on the mask, but there are still a fair bit of kernels launched. Either way I think this would be an optimization to apply inside the Kamino reset.
Add missing Kamino wp.capture_launch warm-up in the deferred RTX-compatible graph capture path to prevent CUDA error 700. Remove unused floating-base stripping infrastructure (_extract_strided_float_array kernel, _setup_kamino_reset_state, and 8 _kamino_* class attributes) since _forward_kamino passes full state arrays directly. Remove invalidate_fk calls from velocity-only write methods since FK depends only on positions. Fix changelog: deduplicate 0.5.7, add 0.5.11 entry, fix stale notify_state_written reference, bump extension.toml to 0.5.11.
Add dynamics_preconditioning field to KaminoSolverCfg so the parameter is user-configurable instead of silently hardcoded. Remove misleading changelog "Changed" entry for use_fk_solver default since KaminoSolverCfg is entirely new in this PR. Add isaaclab_tasks changelog entry for Kamino physics presets, bump extension.toml to 1.5.20. Guard debug convergence block with isinstance(SolverMuJoCo) check to prevent AttributeError when debug_mode=True with Kamino solver.
Update FK flag comment to say "consumed by step" (not step/forward). Replace "MuJoCo-Warp scratch buffers" with "solver scratch buffers" since _capture_relaxed_graph is now solver-agnostic.
Extract the 55-line config-dict-to-nested-dataclass mapping from NewtonManager.initialize_solver into a to_solver_config method on KaminoSolverCfg. The manager now constructs the solver in one line. Restore Kamino CUDA graph warm-up with corrected comment explaining the root cause: StateKamino.from_newton lazily allocates arrays via wp.clone/wp.zeros during the first graphed step. Switch ant env configs to use Newton's collision pipeline (use_collision_detector=False) for Kamino preset.
Remove from humanoid (D6/GIMBAL joints not supported by Kamino) and anymal_c (direct needs RayCaster/Kit, manager-based needs contact sensor preset swapping for kitless mode). Kamino presets remain on cartpole, ant, and reach envs.
This reverts commit 1f3a1ea.
Add kamino entries to VelocityEnvContactSensorCfg and AnymalCEventsCfg preset classes so contact sensors and events resolve to Newton variants in kitless mode. Add kamino armature preset for anymal_c legs actuator. Add Kamino physics preset to manager-based anymal_c flat env. Remove Kamino presets from humanoid (D6/GIMBAL joints unsupported) and direct anymal_c (RayCaster requires Kit).
Revert contact sensor, events, and armature kamino entries. Anymal_c locomotion envs need more plumbing work to support Kamino in kitless mode.
Map kamino to newton table variant so kitless mode uses the cuboid collision table instead of the USD table that requires Kit.
Add _world_reset_mask and _fk_reset_mask to NewtonManager. invalidate_fk() now accepts env_mask/env_ids + articulation_ids to accumulate per-environment dirty state. step() passes masks to eval_fk and solver.reset, then zeros them. Addresses review feedback on PR isaac-sim#5214 (Kamino full-reset performance). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…thods All 6 leaf write methods now forward their environment mask or index array plus articulation_ids to invalidate_fk() for per-world reset accumulation.
Rigid object pose and velocity write methods now call invalidate_fk() with per-environment masks, ensuring FK and Kamino solver reset are triggered for the affected worlds.
Collection body pose and velocity write methods now call invalidate_fk() with per-environment masks, ensuring FK and Kamino solver reset are triggered for the affected worlds.
The _mask methods delegate to _index methods which already call invalidate_fk. Removed the duplicate calls to avoid double accumulation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
27697ef to
afbddc5
Compare
# Conflicts: # source/isaaclab_newton/config/extension.toml # source/isaaclab_newton/docs/CHANGELOG.rst # source/isaaclab_tasks/config/extension.toml # source/isaaclab_tasks/docs/CHANGELOG.rst # source/isaaclab_tasks/isaaclab_tasks/direct/ant/ant_env_cfg.py # source/isaaclab_tasks/isaaclab_tasks/direct/cartpole/cartpole_env_cfg.py
Resolve changelog conflicts by moving the PR entries into new version headings after the current develop releases.
Description
Adds initial support for the Kamino solver.
Cartpole is training.
Type of change
Checklist
pre-commitchecks with./isaaclab.sh --formatconfig/extension.tomlfileCONTRIBUTORS.mdor my name already exists there