From-scratch RDMA-based PyTorch backend in C++; trained a char-level GPT on TinyShakespeare via DDP through code I wrote.
Updated May 1, 2026 - C++
Implement GPU collective communication using a custom PyTorch c10d backend built on libibverbs and Soft-RoCE for distributed model training.