refactor: canonicalize launcher aliases and extract MAD_MULTI_NODE_RUNNER resolver#132
Open
coketaste wants to merge 1 commit into
Open
refactor: canonicalize launcher aliases and extract MAD_MULTI_NODE_RUNNER resolver#132coketaste wants to merge 1 commit into
coketaste wants to merge 1 commit into
Conversation
…NNER resolver - Add canonicalize_distributed_launcher() to common.py with sglang_disagg → sglang-disagg alias - Replace inline duplicate checks in k8s_template_context.py and slurm.py with the helper - Extract _resolve_local_multi_node_runner_env() method in container_runner.py; self-managed launchers (vllm/sglang/sglang-disagg/primus) now always set MAD_MULTI_NODE_RUNNER="" so downstream scripts under set -u don't fail - Add test coverage for new helper and extracted method, including sglang_disagg alias path Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR refactors launcher alias handling and Docker-local MAD_MULTI_NODE_RUNNER resolution so distributed launcher names are canonicalized consistently and local runs get a defined runner environment variable.
Changes:
- Adds
canonicalize_distributed_launcher()with an alias map forsglang_disagg→sglang-disagg. - Extracts Docker-local multi-node runner env resolution into
_resolve_local_multi_node_runner_env(). - Adds unit coverage for launcher canonicalization and local runner command/env generation.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
src/madengine/deployment/common.py |
Adds shared launcher alias canonicalization helper. |
src/madengine/deployment/slurm.py |
Uses canonicalization in the SGLang disaggregated launcher dispatch. |
src/madengine/deployment/k8s_template_context.py |
Uses canonicalization in Kubernetes SGLang disaggregated launcher dispatch. |
src/madengine/execution/container_runner.py |
Extracts and updates Docker-local MAD_MULTI_NODE_RUNNER resolution. |
tests/unit/test_deployment.py |
Adds tests for launcher canonicalization. |
tests/unit/test_container_runner.py |
Adds tests for local launcher command and env resolution behavior. |
CLAUDE.md |
Adds behavioral guidance documentation. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| elif launcher_type == "sglang": | ||
| return self._generate_sglang_command(nnodes, nproc_per_node, master_port) | ||
| elif launcher_type == "sglang-disagg" or launcher_type == "sglang_disagg": | ||
| elif canonicalize_distributed_launcher(launcher_type) == "sglang-disagg": |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This PR cleans up launcher name handling and the MAD_MULTI_NODE_RUNNER generation logic for Docker local deployment, building on #126.
Launcher alias canonicalization
MAD_MULTI_NODE_RUNNER resolver extraction (Docker local)
don't fail on an undefined variable.
Tests
self-managed → "", sglang_disagg alias path
97 unit tests pass.