
[Question]: Memory ordering in GDRCopy flush #2092

@gswxp2

I have a question about the GDRCopy flush path and its memory ordering safety in NCCL (NCCL_GDRCOPY_FLUSH_ENABLE=1).

The scenario is: the IB NIC writes data to GPU memory via RDMA (a PCIe posted write). The CPU then reads from the same GPU memory via GDRCopy (a mov from a BAR1 mapping) as a flush. After that read completes, the proxy updates the tail pointer so the GPU kernel sees the new data. Reference:

nccl/src/transport/net.cc

Lines 1552 to 1559 in 49839df

if (resources->gdcFlush) {
#if defined (__x86_64__)
  // Force a PCI-E read from GPU memory
  asm volatile ("mov (%0), %%eax" :: "l"(resources->gdcFlush) : "%eax", "memory");
#else
  WARN("NET: GDR Flush only supported on x86_64");
  return ncclInternalError;
#endif
}

I understand that if the NIC itself issues the flush read, ordering is guaranteed since the NIC is the same requester for both the write and the read. But with GDRCopy, the write comes from the NIC through the switch to the GPU, while the flush read comes from the CPU, a completely different requester and path. The PCIe spec doesn't require ordering between these two. Can GDRFlush be safe in this scenario?

Thanks a lot for any suggestions.
