Skip to content

Fix pipelined transpose empty rank skips#144

Merged
romerojosh merged 1 commit into
NVIDIA:mainfrom
fallintoplace:fix/pipelined-empty-ranks
Jun 24, 2026
Merged

Fix pipelined transpose empty rank skips#144
romerojosh merged 1 commit into
NVIDIA:mainfrom
fallintoplace:fix/pipelined-empty-ranks

Conversation

@fallintoplace

@fallintoplace fallintoplace commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Return early from pipelined all-to-all when there are no source ranks, avoiding performance-report access to an empty rank list after paired pipelined transfers clear the vectors.

Validation

  • git diff --check

@fallintoplace fallintoplace force-pushed the fix/pipelined-empty-ranks branch from 980e9ef to ad5239b Compare June 23, 2026 22:07

@romerojosh romerojosh left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @fallintoplace, thanks for yet another contribution! I had to use codex to get a grasp on what exactly this PR was attempting to fix as the PR comment is a bit vague and doesn't get to the meat of what these changes are solving. Just some feedback since I know that these PRs are being driven by codex or similar tool.

For future reference, this PR seems to be specifically addressing an out of bounds access that can happen with performance reporting in the pipelined backends due to this line

if (handle->performance_report_enable && src_ranks[0] != self_rank) {

because the src_ranks and dst_ranks arrays are resized to 0 when paired like in the code here:
// Skip alltoall, this transfer was paired with previous one.
dst_ranks.resize(0);
src_ranks.resize(0);

The quick return in cudecompAlltoallPipelined is sufficient to address this issue. The other changes should be removed as per the inline comments.

Comment thread include/internal/transpose.h Outdated
Comment thread include/internal/transpose.h Outdated
Comment thread include/internal/transpose.h Outdated
Comment thread include/internal/comm_routines.h Outdated
Comment thread tests/ctest/transpose_tests.cc Outdated
Signed-off-by: Minh Vu <vuhoangminh97@gmail.com>
@fallintoplace fallintoplace force-pushed the fix/pipelined-empty-ranks branch from ad5239b to e2b51a6 Compare June 24, 2026 17:24
@romerojosh

Copy link
Copy Markdown
Collaborator

/build

@github-actions

Copy link
Copy Markdown

🚀 Build workflow triggered! View run

@github-actions

Copy link
Copy Markdown

✅ Build workflow passed! View run

@romerojosh romerojosh left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@romerojosh romerojosh merged commit 7879c14 into NVIDIA:main Jun 24, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants