[Question]: send in reduce-scatter VS directsend in all-reduce #2086

@GG-yuki

Question

I noticed something interesting. I had assumed that all-reduce and reduce-scatter go through the same (k - 1) steps.
However, reduce-scatter uses only send and recv, while all-reduce uses directsend and directrecv.
https://github.com/NVIDIA/nccl/blob/master/src/device/all_reduce.h#L48
https://github.com/NVIDIA/nccl/blob/master/src/device/reduce_scatter.h#L42
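
For context, here is a toy Python model of the textbook ring schedule. It is only an illustration under stated assumptions, not NCCL's device code, and the function names are hypothetical. In this model reduce-scatter finishes in k - 1 steps, and ring all-reduce is the same reduce-scatter followed by a k - 1 step all-gather.

```python
def ring_reduce_scatter(data):
    """Toy ring reduce-scatter (hypothetical helper, not an NCCL API).

    data[r][c] is rank r's value for chunk c, with k ranks and k chunks.
    Returns (buffers, steps). Afterwards rank r holds the full sum of
    chunk (r + 1) % k.
    """
    k = len(data)
    buf = [list(row) for row in data]
    steps = 0
    for s in range(k - 1):
        # Snapshot what each rank sends this step: chunk (r - s) % k.
        sent = [buf[r][(r - s) % k] for r in range(k)]
        for r in range(k):
            src = (r - 1) % k      # left neighbour in the ring
            c = (src - s) % k      # chunk index the neighbour sent
            buf[r][c] += sent[src] # receive and reduce
        steps += 1
    return buf, steps


def ring_all_gather(buf):
    """Toy ring all-gather; assumes rank r already owns chunk (r + 1) % k."""
    k = len(buf)
    steps = 0
    for s in range(k - 1):
        # Each rank forwards the most recently completed chunk it holds.
        sent = [buf[r][(r + 1 - s) % k] for r in range(k)]
        for r in range(k):
            src = (r - 1) % k
            c = (src + 1 - s) % k
            buf[r][c] = sent[src]  # receive and copy, no reduction
        steps += 1
    return buf, steps


def ring_all_reduce(data):
    """Reduce-scatter followed by all-gather: 2 * (k - 1) steps in this model."""
    buf, rs_steps = ring_reduce_scatter(data)
    buf, ag_steps = ring_all_gather(buf)
    return buf, rs_steps + ag_steps
```

Calling `ring_reduce_scatter` and `ring_all_reduce` on the same k x k input lets you compare the step counts directly. The model says nothing about why one kernel would use direct (peer-buffer) primitives and the other would not.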

What is the difference between the two, and what is the reason for it? Is it related to performance?
PS: gpt-5 told me that reduce-scatter has fewer steps, so it can save the cost of pointer conversion. Is that right?
