How is this issue impacting you?
Lower performance than expected
Share Your Debug Logs
Logs from the NCCL 2.29 launch (2.29.7 per the version line) are below; I can also attach logs from NCCL 2.28 if needed.
40983aab902f:53830:53830 [0] NCCL INFO ENV/Plugin: Could not find: libnccl-env.so
40983aab902f:53830:53830 [0] NCCL INFO Bootstrap: Using eth0:192.168.9.2<0>
40983aab902f:53830:53830 [0] NCCL INFO cudaDriverVersion 12040
40983aab902f:53830:53830 [0] NCCL INFO NCCL version 2.29.7+cuda12.3
40983aab902f:53830:53830 [0] NCCL INFO NCCL git version unknown unknown
40983aab902f:53830:53846 [5] NCCL INFO NET/Plugin: Could not find: libnccl-net.so
40983aab902f:53830:53846 [5] NCCL INFO Failed to open libibverbs.so[.1]
40983aab902f:53830:53846 [5] NCCL INFO transport/net_ib/init.cc:396 -> 3
40983aab902f:53830:53846 [5] NCCL INFO Failed to initialize NET plugin IB
40983aab902f:53830:53846 [5] NCCL INFO NET/Socket : Using [0]eth0:192.168.9.2<0>
40983aab902f:53830:53846 [5] NCCL INFO Initialized NET plugin Socket
40983aab902f:53830:53846 [5] NCCL INFO Assigned NET plugin Socket to comm
40983aab902f:53830:53841 [0] NCCL INFO Initialized NET plugin Socket
40983aab902f:53830:53846 [5] NCCL INFO GIN/Plugin: Could not find: libnccl-gin.so
[2026-04-07 09:29:42] 40983aab902f:53830:53846 [5] misc/ibvwrap.cc:173 NCCL WARN lib wrapper not initialized.
40983aab902f:53830:53846 [5] NCCL INFO transport/net_ib/gdr.cc:56 -> 3
40983aab902f:53830:53846 [5] NCCL INFO Failed to initialize any GIN plugin
40983aab902f:53830:53846 [5] NCCL INFO Using network Socket
40983aab902f:53830:53841 [0] NCCL INFO Assigned NET plugin Socket to comm
40983aab902f:53830:53841 [0] NCCL INFO Failed to initialize any GIN plugin
40983aab902f:53830:53841 [0] NCCL INFO Using network Socket
40983aab902f:53830:53847 [6] NCCL INFO Initialized NET plugin Socket
40983aab902f:53830:53847 [6] NCCL INFO Assigned NET plugin Socket to comm
40983aab902f:53830:53847 [6] NCCL INFO Failed to initialize any GIN plugin
40983aab902f:53830:53847 [6] NCCL INFO Using network Socket
40983aab902f:53830:53844 [3] NCCL INFO Initialized NET plugin Socket
40983aab902f:53830:53844 [3] NCCL INFO Assigned NET plugin Socket to comm
40983aab902f:53830:53844 [3] NCCL INFO Failed to initialize any GIN plugin
40983aab902f:53830:53844 [3] NCCL INFO Using network Socket
40983aab902f:53830:53842 [1] NCCL INFO Initialized NET plugin Socket
40983aab902f:53830:53842 [1] NCCL INFO Assigned NET plugin Socket to comm
40983aab902f:53830:53842 [1] NCCL INFO Failed to initialize any GIN plugin
40983aab902f:53830:53842 [1] NCCL INFO Using network Socket
40983aab902f:53830:53848 [7] NCCL INFO Initialized NET plugin Socket
40983aab902f:53830:53848 [7] NCCL INFO Assigned NET plugin Socket to comm
40983aab902f:53830:53848 [7] NCCL INFO Failed to initialize any GIN plugin
40983aab902f:53830:53848 [7] NCCL INFO Using network Socket
40983aab902f:53830:53843 [2] NCCL INFO Initialized NET plugin Socket
40983aab902f:53830:53843 [2] NCCL INFO Assigned NET plugin Socket to comm
40983aab902f:53830:53843 [2] NCCL INFO Failed to initialize any GIN plugin
40983aab902f:53830:53843 [2] NCCL INFO Using network Socket
40983aab902f:53830:53845 [4] NCCL INFO Initialized NET plugin Socket
40983aab902f:53830:53845 [4] NCCL INFO Assigned NET plugin Socket to comm
40983aab902f:53830:53845 [4] NCCL INFO Failed to initialize any GIN plugin
40983aab902f:53830:53845 [4] NCCL INFO Using network Socket
40983aab902f:53830:53846 [5] NCCL INFO [Rank 5] ncclCommInitRankConfig comm 0x563cb8b747a0 rank 5 nranks 8 cudaDev 5 nvmlDev 5 busId 85000 commId 0xfa356df9cba89de0 - Init START
40983aab902f:53830:53841 [0] NCCL INFO [Rank 0] ncclCommInitRankConfig comm 0x563cb8577020 rank 0 nranks 8 cudaDev 0 nvmlDev 0 busId 4000 commId 0xfa356df9cba89de0 - Init START
40983aab902f:53830:53847 [6] NCCL INFO [Rank 6] ncclCommInitRankConfig comm 0x563cb8ca7160 rank 6 nranks 8 cudaDev 6 nvmlDev 6 busId 8a000 commId 0xfa356df9cba89de0 - Init START
40983aab902f:53830:53844 [3] NCCL INFO [Rank 3] ncclCommInitRankConfig comm 0x563cb890f420 rank 3 nranks 8 cudaDev 3 nvmlDev 3 busId b000 commId 0xfa356df9cba89de0 - Init START
40983aab902f:53830:53842 [1] NCCL INFO [Rank 1] ncclCommInitRankConfig comm 0x563cb86a9c40 rank 1 nranks 8 cudaDev 1 nvmlDev 1 busId 5000 commId 0xfa356df9cba89de0 - Init START
40983aab902f:53830:53848 [7] NCCL INFO [Rank 7] ncclCommInitRankConfig comm 0x563cb8dd9b20 rank 7 nranks 8 cudaDev 7 nvmlDev 7 busId 8b000 commId 0xfa356df9cba89de0 - Init START
40983aab902f:53830:53843 [2] NCCL INFO [Rank 2] ncclCommInitRankConfig comm 0x563cb87dc830 rank 2 nranks 8 cudaDev 2 nvmlDev 2 busId a000 commId 0xfa356df9cba89de0 - Init START
40983aab902f:53830:53845 [4] NCCL INFO [Rank 4] ncclCommInitRankConfig comm 0x563cb8a41de0 rank 4 nranks 8 cudaDev 4 nvmlDev 4 busId 84000 commId 0xfa356df9cba89de0 - Init START
40983aab902f:53830:53841 [0] NCCL INFO RAS client listening socket at ::1<28028>
40983aab902f:53830:53841 [0] NCCL INFO Bootstrap timings total 0.077588 (create 0.000046, send 0.000092, recv 0.000297, ring 0.076332, delay 0.000000)
40983aab902f:53830:53848 [7] NCCL INFO Bootstrap timings total 0.077247 (create 0.000044, send 0.000120, recv 0.000069, ring 0.076303, delay 0.000000)
40983aab902f:53830:53844 [3] NCCL INFO Bootstrap timings total 0.077463 (create 0.000038, send 0.000109, recv 0.000885, ring 0.076213, delay 0.000000)
40983aab902f:53830:53842 [1] NCCL INFO Bootstrap timings total 0.077617 (create 0.000036, send 0.000085, recv 0.000723, ring 0.076264, delay 0.000000)
40983aab902f:53830:53847 [6] NCCL INFO Bootstrap timings total 0.077877 (create 0.000050, send 0.000125, recv 0.000384, ring 0.076336, delay 0.000000)
40983aab902f:53830:53846 [5] NCCL INFO Bootstrap timings total 0.078028 (create 0.000058, send 0.000146, recv 0.000252, ring 0.000220, delay 0.000001)
40983aab902f:53830:53845 [4] NCCL INFO Bootstrap timings total 0.076721 (create 0.000049, send 0.000098, recv 0.000310, ring 0.000205, delay 0.000000)
40983aab902f:53830:53843 [2] NCCL INFO Bootstrap timings total 0.077003 (create 0.000050, send 0.000110, recv 0.000395, ring 0.076281, delay 0.000000)
40983aab902f:53830:53848 [7] NCCL INFO MNNVL busId 0x8b000 fabric UUID 0.0 cliqueId 0x0 state 3 healthMask 0x0
40983aab902f:53830:53841 [0] NCCL INFO MNNVL busId 0x4000 fabric UUID 0.0 cliqueId 0x0 state 3 healthMask 0x0
40983aab902f:53830:53847 [6] NCCL INFO MNNVL busId 0x8a000 fabric UUID 0.0 cliqueId 0x0 state 3 healthMask 0x0
40983aab902f:53830:53844 [3] NCCL INFO MNNVL busId 0xb000 fabric UUID 0.0 cliqueId 0x0 state 3 healthMask 0x0
40983aab902f:53830:53843 [2] NCCL INFO MNNVL busId 0xa000 fabric UUID 0.0 cliqueId 0x0 state 3 healthMask 0x0
40983aab902f:53830:53845 [4] NCCL INFO MNNVL busId 0x84000 fabric UUID 0.0 cliqueId 0x0 state 3 healthMask 0x0
40983aab902f:53830:53846 [5] NCCL INFO MNNVL busId 0x85000 fabric UUID 0.0 cliqueId 0x0 state 3 healthMask 0x0
40983aab902f:53830:53842 [1] NCCL INFO MNNVL busId 0x5000 fabric UUID 0.0 cliqueId 0x0 state 3 healthMask 0x0
40983aab902f:53830:53841 [0] NCCL INFO NCCL_TOPO_DUMP_FILE set by environment to ncclSystem.txt
40983aab902f:53830:53844 [3] NCCL INFO ncclTopoGetCpuAffinity: Affinity for GPU 3 is 0-51,104-155. (GPU affinity = 0-51,104-155 ; CPU affinity = 0-207).
40983aab902f:53830:53844 [3] NCCL INFO NVLS multicast support is available on dev 3 (NVLS_NCHANNELS 16)
40983aab902f:53830:53841 [0] NCCL INFO ncclTopoGetCpuAffinity: Affinity for GPU 0 is 0-51,104-155. (GPU affinity = 0-51,104-155 ; CPU affinity = 0-207).
40983aab902f:53830:53842 [1] NCCL INFO ncclTopoGetCpuAffinity: Affinity for GPU 1 is 0-51,104-155. (GPU affinity = 0-51,104-155 ; CPU affinity = 0-207).
40983aab902f:53830:53841 [0] NCCL INFO NVLS multicast support is available on dev 0 (NVLS_NCHANNELS 16)
40983aab902f:53830:53847 [6] NCCL INFO ncclTopoGetCpuAffinity: Affinity for GPU 6 is 52-103,156-207. (GPU affinity = 52-103,156-207 ; CPU affinity = 0-207).
40983aab902f:53830:53845 [4] NCCL INFO ncclTopoGetCpuAffinity: Affinity for GPU 4 is 52-103,156-207. (GPU affinity = 52-103,156-207 ; CPU affinity = 0-207).
40983aab902f:53830:53847 [6] NCCL INFO NVLS multicast support is available on dev 6 (NVLS_NCHANNELS 16)
40983aab902f:53830:53848 [7] NCCL INFO ncclTopoGetCpuAffinity: Affinity for GPU 7 is 52-103,156-207. (GPU affinity = 52-103,156-207 ; CPU affinity = 0-207).
40983aab902f:53830:53848 [7] NCCL INFO NVLS multicast support is available on dev 7 (NVLS_NCHANNELS 16)
40983aab902f:53830:53842 [1] NCCL INFO NVLS multicast support is available on dev 1 (NVLS_NCHANNELS 16)
40983aab902f:53830:53843 [2] NCCL INFO ncclTopoGetCpuAffinity: Affinity for GPU 2 is 0-51,104-155. (GPU affinity = 0-51,104-155 ; CPU affinity = 0-207).
40983aab902f:53830:53843 [2] NCCL INFO NVLS multicast support is available on dev 2 (NVLS_NCHANNELS 16)
40983aab902f:53830:53845 [4] NCCL INFO NVLS multicast support is available on dev 4 (NVLS_NCHANNELS 16)
40983aab902f:53830:53846 [5] NCCL INFO ncclTopoGetCpuAffinity: Affinity for GPU 5 is 52-103,156-207. (GPU affinity = 52-103,156-207 ; CPU affinity = 0-207).
40983aab902f:53830:53846 [5] NCCL INFO NVLS multicast support is available on dev 5 (NVLS_NCHANNELS 16)
40983aab902f:53830:53847 [6] NCCL INFO comm 0x563cb8ca7160 rank 6 nRanks 8 nNodes 1 localRanks 8 localRank 6 MNNVL 0
40983aab902f:53830:53847 [6] NCCL INFO Trees [0] 7/-1/-1->6->5 [1] 7/-1/-1->6->5 [2] 7/-1/-1->6->5 [3] 7/-1/-1->6->5 [4] 7/-1/-1->6->5 [5] 7/-1/-1->6->5 [6] 7/-1/-1->6->5 [7] 7/-1/-1->6->5 [8] 7/-1/-1->6->5 [9] 7/-1/-1->6->5 [10] 7/-1/-1->6->5 [11] 7/-1/-1->6->5 [12] 7/-1/-1->6->5 [13] 7/-1/-1->6->5 [14] 7/-1/-1->6->5 [15] 7/-1/-1->6->5 [16] 7/-1/-1->6->5 [17] 7/-1/-1->6->5 [18] 7/-1/-1->6->5 [19] 7/-1/-1->6->5 [20] 7/-1/-1->6->5 [21] 7/-1/-1->6->5 [22] 7/-1/-1->6->5 [23] 7/-1/-1->6->5
40983aab902f:53830:53847 [6] NCCL INFO P2P Chunksize set to 524288
40983aab902f:53830:53847 [6] NCCL INFO PROFILER/Plugin: Could not find: libnccl-profiler.so
40983aab902f:53830:53847 [6] NCCL INFO Check P2P Type isAllDirectP2p 1 directMode 1 isAllCudaP2p 1
40983aab902f:53830:53846 [5] NCCL INFO comm 0x563cb8b747a0 rank 5 nRanks 8 nNodes 1 localRanks 8 localRank 5 MNNVL 0
40983aab902f:53830:53845 [4] NCCL INFO comm 0x563cb8a41de0 rank 4 nRanks 8 nNodes 1 localRanks 8 localRank 4 MNNVL 0
40983aab902f:53830:53846 [5] NCCL INFO Trees [0] 6/-1/-1->5->4 [1] 6/-1/-1->5->4 [2] 6/-1/-1->5->4 [3] 6/-1/-1->5->4 [4] 6/-1/-1->5->4 [5] 6/-1/-1->5->4 [6] 6/-1/-1->5->4 [7] 6/-1/-1->5->4 [8] 6/-1/-1->5->4 [9] 6/-1/-1->5->4 [10] 6/-1/-1->5->4 [11] 6/-1/-1->5->4 [12] 6/-1/-1->5->4 [13] 6/-1/-1->5->4 [14] 6/-1/-1->5->4 [15] 6/-1/-1->5->4 [16] 6/-1/-1->5->4 [17] 6/-1/-1->5->4 [18] 6/-1/-1->5->4 [19] 6/-1/-1->5->4 [20] 6/-1/-1->5->4 [21] 6/-1/-1->5->4 [22] 6/-1/-1->5->4 [23] 6/-1/-1->5->4
40983aab902f:53830:53845 [4] NCCL INFO Trees [0] 5/-1/-1->4->3 [1] 5/-1/-1->4->3 [2] 5/-1/-1->4->3 [3] 5/-1/-1->4->3 [4] 5/-1/-1->4->3 [5] 5/-1/-1->4->3 [6] 5/-1/-1->4->3 [7] 5/-1/-1->4->3 [8] 5/-1/-1->4->3 [9] 5/-1/-1->4->3 [10] 5/-1/-1->4->3 [11] 5/-1/-1->4->3 [12] 5/-1/-1->4->3 [13] 5/-1/-1->4->3 [14] 5/-1/-1->4->3 [15] 5/-1/-1->4->3 [16] 5/-1/-1->4->3 [17] 5/-1/-1->4->3 [18] 5/-1/-1->4->3 [19] 5/-1/-1->4->3 [20] 5/-1/-1->4->3 [21] 5/-1/-1->4->3 [22] 5/-1/-1->4->3 [23] 5/-1/-1->4->3
40983aab902f:53830:53851 [0] NCCL INFO [Proxy Service UDS] Device 6 CPU core 67
40983aab902f:53830:53845 [4] NCCL INFO P2P Chunksize set to 524288
40983aab902f:53830:53846 [5] NCCL INFO P2P Chunksize set to 524288
40983aab902f:53830:53850 [0] NCCL INFO [Proxy Service] Device 6 CPU core 61
40983aab902f:53830:53848 [7] NCCL INFO comm 0x563cb8dd9b20 rank 7 nRanks 8 nNodes 1 localRanks 8 localRank 7 MNNVL 0
40983aab902f:53830:53848 [7] NCCL INFO Trees [0] -1/-1/-1->7->6 [1] -1/-1/-1->7->6 [2] -1/-1/-1->7->6 [3] -1/-1/-1->7->6 [4] -1/-1/-1->7->6 [5] -1/-1/-1->7->6 [6] -1/-1/-1->7->6 [7] -1/-1/-1->7->6 [8] -1/-1/-1->7->6 [9] -1/-1/-1->7->6 [10] -1/-1/-1->7->6 [11] -1/-1/-1->7->6 [12] -1/-1/-1->7->6 [13] -1/-1/-1->7->6 [14] -1/-1/-1->7->6 [15] -1/-1/-1->7->6 [16] -1/-1/-1->7->6 [17] -1/-1/-1->7->6 [18] -1/-1/-1->7->6 [19] -1/-1/-1->7->6 [20] -1/-1/-1->7->6 [21] -1/-1/-1->7->6 [22] -1/-1/-1->7->6 [23] -1/-1/-1->7->6
40983aab902f:53830:53848 [7] NCCL INFO P2P Chunksize set to 524288
40983aab902f:53830:53841 [0] NCCL INFO comm 0x563cb8577020 rank 0 nRanks 8 nNodes 1 localRanks 8 localRank 0 MNNVL 0
40983aab902f:53830:53841 [0] NCCL INFO Channel 00/24 : 0 1 2 3 4 5 6 7
40983aab902f:53830:53841 [0] NCCL INFO Channel 01/24 : 0 1 2 3 4 5 6 7
40983aab902f:53830:53841 [0] NCCL INFO Channel 02/24 : 0 1 2 3 4 5 6 7
40983aab902f:53830:53841 [0] NCCL INFO Channel 03/24 : 0 1 2 3 4 5 6 7
40983aab902f:53830:53841 [0] NCCL INFO Channel 04/24 : 0 1 2 3 4 5 6 7
40983aab902f:53830:53841 [0] NCCL INFO Channel 05/24 : 0 1 2 3 4 5 6 7
40983aab902f:53830:53841 [0] NCCL INFO Channel 06/24 : 0 1 2 3 4 5 6 7
40983aab902f:53830:53841 [0] NCCL INFO Channel 07/24 : 0 1 2 3 4 5 6 7
40983aab902f:53830:53841 [0] NCCL INFO Channel 08/24 : 0 1 2 3 4 5 6 7
40983aab902f:53830:53841 [0] NCCL INFO Channel 09/24 : 0 1 2 3 4 5 6 7
40983aab902f:53830:53841 [0] NCCL INFO Channel 10/24 : 0 1 2 3 4 5 6 7
40983aab902f:53830:53841 [0] NCCL INFO Channel 11/24 : 0 1 2 3 4 5 6 7
40983aab902f:53830:53841 [0] NCCL INFO Channel 12/24 : 0 1 2 3 4 5 6 7
40983aab902f:53830:53841 [0] NCCL INFO Channel 13/24 : 0 1 2 3 4 5 6 7
40983aab902f:53830:53841 [0] NCCL INFO Channel 14/24 : 0 1 2 3 4 5 6 7
40983aab902f:53830:53841 [0] NCCL INFO Channel 15/24 : 0 1 2 3 4 5 6 7
40983aab902f:53830:53841 [0] NCCL INFO Channel 16/24 : 0 1 2 3 4 5 6 7
40983aab902f:53830:53841 [0] NCCL INFO Channel 17/24 : 0 1 2 3 4 5 6 7
40983aab902f:53830:53841 [0] NCCL INFO Channel 18/24 : 0 1 2 3 4 5 6 7
40983aab902f:53830:53841 [0] NCCL INFO Channel 19/24 : 0 1 2 3 4 5 6 7
40983aab902f:53830:53841 [0] NCCL INFO Channel 20/24 : 0 1 2 3 4 5 6 7
40983aab902f:53830:53841 [0] NCCL INFO Channel 21/24 : 0 1 2 3 4 5 6 7
40983aab902f:53830:53841 [0] NCCL INFO Channel 22/24 : 0 1 2 3 4 5 6 7
40983aab902f:53830:53841 [0] NCCL INFO Channel 23/24 : 0 1 2 3 4 5 6 7
40983aab902f:53830:53841 [0] NCCL INFO Trees [0] 1/-1/-1->0->-1 [1] 1/-1/-1->0->-1 [2] 1/-1/-1->0->-1 [3] 1/-1/-1->0->-1 [4] 1/-1/-1->0->-1 [5] 1/-1/-1->0->-1 [6] 1/-1/-1->0->-1 [7] 1/-1/-1->0->-1 [8] 1/-1/-1->0->-1 [9] 1/-1/-1->0->-1 [10] 1/-1/-1->0->-1 [11] 1/-1/-1->0->-1 [12] 1/-1/-1->0->-1 [13] 1/-1/-1->0->-1 [14] 1/-1/-1->0->-1 [15] 1/-1/-1->0->-1 [16] 1/-1/-1->0->-1 [17] 1/-1/-1->0->-1 [18] 1/-1/-1->0->-1 [19] 1/-1/-1->0->-1 [20] 1/-1/-1->0->-1 [21] 1/-1/-1->0->-1 [22] 1/-1/-1->0->-1 [23] 1/-1/-1->0->-1
40983aab902f:53830:53841 [0] NCCL INFO P2P Chunksize set to 524288
40983aab902f:53830:53843 [2] NCCL INFO comm 0x563cb87dc830 rank 2 nRanks 8 nNodes 1 localRanks 8 localRank 2 MNNVL 0
40983aab902f:53830:53843 [2] NCCL INFO Trees [0] 3/-1/-1->2->1 [1] 3/-1/-1->2->1 [2] 3/-1/-1->2->1 [3] 3/-1/-1->2->1 [4] 3/-1/-1->2->1 [5] 3/-1/-1->2->1 [6] 3/-1/-1->2->1 [7] 3/-1/-1->2->1 [8] 3/-1/-1->2->1 [9] 3/-1/-1->2->1 [10] 3/-1/-1->2->1 [11] 3/-1/-1->2->1 [12] 3/-1/-1->2->1 [13] 3/-1/-1->2->1 [14] 3/-1/-1->2->1 [15] 3/-1/-1->2->1 [16] 3/-1/-1->2->1 [17] 3/-1/-1->2->1 [18] 3/-1/-1->2->1 [19] 3/-1/-1->2->1 [20] 3/-1/-1->2->1 [21] 3/-1/-1->2->1 [22] 3/-1/-1->2->1 [23] 3/-1/-1->2->1
40983aab902f:53830:53843 [2] NCCL INFO P2P Chunksize set to 524288
40983aab902f:53830:53844 [3] NCCL INFO comm 0x563cb890f420 rank 3 nRanks 8 nNodes 1 localRanks 8 localRank 3 MNNVL 0
40983aab902f:53830:53842 [1] NCCL INFO comm 0x563cb86a9c40 rank 1 nRanks 8 nNodes 1 localRanks 8 localRank 1 MNNVL 0
40983aab902f:53830:53844 [3] NCCL INFO Trees [0] 4/-1/-1->3->2 [1] 4/-1/-1->3->2 [2] 4/-1/-1->3->2 [3] 4/-1/-1->3->2 [4] 4/-1/-1->3->2 [5] 4/-1/-1->3->2 [6] 4/-1/-1->3->2 [7] 4/-1/-1->3->2 [8] 4/-1/-1->3->2 [9] 4/-1/-1->3->2 [10] 4/-1/-1->3->2 [11] 4/-1/-1->3->2 [12] 4/-1/-1->3->2 [13] 4/-1/-1->3->2 [14] 4/-1/-1->3->2 [15] 4/-1/-1->3->2 [16] 4/-1/-1->3->2 [17] 4/-1/-1->3->2 [18] 4/-1/-1->3->2 [19] 4/-1/-1->3->2 [20] 4/-1/-1->3->2 [21] 4/-1/-1->3->2 [22] 4/-1/-1->3->2 [23] 4/-1/-1->3->2
40983aab902f:53830:53844 [3] NCCL INFO P2P Chunksize set to 524288
40983aab902f:53830:53842 [1] NCCL INFO Trees [0] 2/-1/-1->1->0 [1] 2/-1/-1->1->0 [2] 2/-1/-1->1->0 [3] 2/-1/-1->1->0 [4] 2/-1/-1->1->0 [5] 2/-1/-1->1->0 [6] 2/-1/-1->1->0 [7] 2/-1/-1->1->0 [8] 2/-1/-1->1->0 [9] 2/-1/-1->1->0 [10] 2/-1/-1->1->0 [11] 2/-1/-1->1->0 [12] 2/-1/-1->1->0 [13] 2/-1/-1->1->0 [14] 2/-1/-1->1->0 [15] 2/-1/-1->1->0 [16] 2/-1/-1->1->0 [17] 2/-1/-1->1->0 [18] 2/-1/-1->1->0 [19] 2/-1/-1->1->0 [20] 2/-1/-1->1->0 [21] 2/-1/-1->1->0 [22] 2/-1/-1->1->0 [23] 2/-1/-1->1->0
40983aab902f:53830:53842 [1] NCCL INFO P2P Chunksize set to 524288
40983aab902f:53830:53846 [5] NCCL INFO Check P2P Type isAllDirectP2p 1 directMode 1 isAllCudaP2p 1
40983aab902f:53830:53852 [0] NCCL INFO [Proxy Service] Device 5 CPU core 176
40983aab902f:53830:53853 [0] NCCL INFO [Proxy Service UDS] Device 5 CPU core 75
40983aab902f:53830:53845 [4] NCCL INFO Check P2P Type isAllDirectP2p 1 directMode 1 isAllCudaP2p 1
40983aab902f:53830:53855 [0] NCCL INFO [Proxy Service UDS] Device 4 CPU core 181
40983aab902f:53830:53854 [0] NCCL INFO [Proxy Service] Device 4 CPU core 170
40983aab902f:53830:53848 [7] NCCL INFO Check P2P Type isAllDirectP2p 1 directMode 1 isAllCudaP2p 1
40983aab902f:53830:53856 [0] NCCL INFO [Proxy Service] Device 7 CPU core 204
40983aab902f:53830:53857 [0] NCCL INFO [Proxy Service UDS] Device 7 CPU core 187
40983aab902f:53830:53843 [2] NCCL INFO Check P2P Type isAllDirectP2p 1 directMode 1 isAllCudaP2p 1
40983aab902f:53830:53858 [0] NCCL INFO [Proxy Service] Device 2 CPU core 143
40983aab902f:53830:53859 [0] NCCL INFO [Proxy Service UDS] Device 2 CPU core 146
40983aab902f:53830:53841 [0] NCCL INFO Check P2P Type isAllDirectP2p 1 directMode 1 isAllCudaP2p 1
40983aab902f:53830:53860 [0] NCCL INFO [Proxy Service] Device 0 CPU core 8
40983aab902f:53830:53861 [0] NCCL INFO [Proxy Service UDS] Device 0 CPU core 17
40983aab902f:53830:53842 [1] NCCL INFO Check P2P Type isAllDirectP2p 1 directMode 1 isAllCudaP2p 1
40983aab902f:53830:53862 [0] NCCL INFO [Proxy Service] Device 1 CPU core 18
40983aab902f:53830:53863 [0] NCCL INFO [Proxy Service UDS] Device 1 CPU core 19
40983aab902f:53830:53844 [3] NCCL INFO Check P2P Type isAllDirectP2p 1 directMode 1 isAllCudaP2p 1
40983aab902f:53830:53864 [0] NCCL INFO [Proxy Service] Device 3 CPU core 2
40983aab902f:53830:53865 [0] NCCL INFO [Proxy Service UDS] Device 3 CPU core 125
40983aab902f:53830:53842 [1] NCCL INFO TUNER/Plugin: Could not find: libnccl-tuner.so
40983aab902f:53830:53842 [1] NCCL INFO threadThresholds 8/8/64 | 64/8/64 | 512 | 512
40983aab902f:53830:53842 [1] NCCL INFO 24 coll channels, 24 collnet channels, 16 nvls channels, 32 p2p channels, 32 p2p channels per peer
40983aab902f:53830:53842 [1] NCCL INFO Symmetric VA size=80GB
40983aab902f:53830:53841 [0] NCCL INFO threadThresholds 8/8/64 | 64/8/64 | 512 | 512
40983aab902f:53830:53841 [0] NCCL INFO 24 coll channels, 24 collnet channels, 16 nvls channels, 32 p2p channels, 32 p2p channels per peer
40983aab902f:53830:53841 [0] NCCL INFO Symmetric VA size=80GB
40983aab902f:53830:53848 [7] NCCL INFO threadThresholds 8/8/64 | 64/8/64 | 512 | 512
40983aab902f:53830:53848 [7] NCCL INFO 24 coll channels, 24 collnet channels, 16 nvls channels, 32 p2p channels, 32 p2p channels per peer
40983aab902f:53830:53848 [7] NCCL INFO Symmetric VA size=80GB
40983aab902f:53830:53845 [4] NCCL INFO threadThresholds 8/8/64 | 64/8/64 | 512 | 512
40983aab902f:53830:53845 [4] NCCL INFO 24 coll channels, 24 collnet channels, 16 nvls channels, 32 p2p channels, 32 p2p channels per peer
40983aab902f:53830:53845 [4] NCCL INFO Symmetric VA size=80GB
40983aab902f:53830:53847 [6] NCCL INFO threadThresholds 8/8/64 | 64/8/64 | 512 | 512
40983aab902f:53830:53847 [6] NCCL INFO 24 coll channels, 24 collnet channels, 16 nvls channels, 32 p2p channels, 32 p2p channels per peer
40983aab902f:53830:53847 [6] NCCL INFO Symmetric VA size=80GB
40983aab902f:53830:53844 [3] NCCL INFO threadThresholds 8/8/64 | 64/8/64 | 512 | 512
40983aab902f:53830:53844 [3] NCCL INFO 24 coll channels, 24 collnet channels, 16 nvls channels, 32 p2p channels, 32 p2p channels per peer
40983aab902f:53830:53844 [3] NCCL INFO Symmetric VA size=80GB
40983aab902f:53830:53846 [5] NCCL INFO threadThresholds 8/8/64 | 64/8/64 | 512 | 512
40983aab902f:53830:53846 [5] NCCL INFO 24 coll channels, 24 collnet channels, 16 nvls channels, 32 p2p channels, 32 p2p channels per peer
40983aab902f:53830:53846 [5] NCCL INFO Symmetric VA size=80GB
40983aab902f:53830:53841 [0] NCCL INFO CC Off, workFifoBytes 1048576
40983aab902f:53830:53843 [2] NCCL INFO threadThresholds 8/8/64 | 64/8/64 | 512 | 512
40983aab902f:53830:53843 [2] NCCL INFO 24 coll channels, 24 collnet channels, 16 nvls channels, 32 p2p channels, 32 p2p channels per peer
40983aab902f:53830:53843 [2] NCCL INFO Symmetric VA size=80GB
40983aab902f:53830:53842 [1] NCCL INFO ncclCommInitRankConfig comm 0x563cb86a9c40 rank 1 nranks 8 cudaDev 1 nvmlDev 1 busId 5000 commId 0xfa356df9cba89de0 - Init COMPLETE
40983aab902f:53830:53842 [1] NCCL INFO Init timings - ncclCommInitRankConfig: rank 1 nranks 8 total 3.03 (kernels 1.22, alloc 0.24, bootstrap 0.08, allgathers 0.03, topo 1.10, graphs 0.05, connections 0.27, rest 0.03)
40983aab902f:53830:53846 [5] NCCL INFO ncclCommInitRankConfig comm 0x563cb8b747a0 rank 5 nranks 8 cudaDev 5 nvmlDev 5 busId 85000 commId 0xfa356df9cba89de0 - Init COMPLETE
40983aab902f:53830:53846 [5] NCCL INFO Init timings - ncclCommInitRankConfig: rank 5 nranks 8 total 3.03 (kernels 1.22, alloc 0.24, bootstrap 0.08, allgathers 0.06, topo 1.11, graphs 0.02, connections 0.29, rest 0.01)
40983aab902f:53830:53841 [0] NCCL INFO ncclCommInitRankConfig comm 0x563cb8577020 rank 0 nranks 8 cudaDev 0 nvmlDev 0 busId 4000 commId 0xfa356df9cba89de0 - Init COMPLETE
40983aab902f:53830:53841 [0] NCCL INFO Init timings - ncclCommInitRankConfig: rank 0 nranks 8 total 3.03 (kernels 1.22, alloc 0.24, bootstrap 0.08, allgathers 0.03, topo 1.10, graphs 0.05, connections 0.28, rest 0.02)
40983aab902f:53830:53844 [3] NCCL INFO ncclCommInitRankConfig comm 0x563cb890f420 rank 3 nranks 8 cudaDev 3 nvmlDev 3 busId b000 commId 0xfa356df9cba89de0 - Init COMPLETE
40983aab902f:53830:53844 [3] NCCL INFO Init timings - ncclCommInitRankConfig: rank 3 nranks 8 total 3.03 (kernels 1.22, alloc 0.24, bootstrap 0.08, allgathers 0.06, topo 1.10, graphs 0.03, connections 0.27, rest 0.03)
40983aab902f:53830:53845 [4] NCCL INFO ncclCommInitRankConfig comm 0x563cb8a41de0 rank 4 nranks 8 cudaDev 4 nvmlDev 4 busId 84000 commId 0xfa356df9cba89de0 - Init COMPLETE
40983aab902f:53830:53845 [4] NCCL INFO Init timings - ncclCommInitRankConfig: rank 4 nranks 8 total 3.03 (kernels 1.22, alloc 0.24, bootstrap 0.08, allgathers 0.05, topo 1.11, graphs 0.03, connections 0.29, rest 0.02)
40983aab902f:53830:53843 [2] NCCL INFO ncclCommInitRankConfig comm 0x563cb87dc830 rank 2 nranks 8 cudaDev 2 nvmlDev 2 busId a000 commId 0xfa356df9cba89de0 - Init COMPLETE
40983aab902f:53830:53843 [2] NCCL INFO Init timings - ncclCommInitRankConfig: rank 2 nranks 8 total 3.03 (kernels 1.22, alloc 0.24, bootstrap 0.08, allgathers 0.07, topo 1.11, graphs 0.01, connections 0.28, rest 0.02)
40983aab902f:53830:53848 [7] NCCL INFO ncclCommInitRankConfig comm 0x563cb8dd9b20 rank 7 nranks 8 cudaDev 7 nvmlDev 7 busId 8b000 commId 0xfa356df9cba89de0 - Init COMPLETE
40983aab902f:53830:53848 [7] NCCL INFO Init timings - ncclCommInitRankConfig: rank 7 nranks 8 total 3.03 (kernels 1.22, alloc 0.24, bootstrap 0.08, allgathers 0.07, topo 1.11, graphs 0.02, connections 0.28, rest 0.02)
40983aab902f:53830:53847 [6] NCCL INFO ncclCommInitRankConfig comm 0x563cb8ca7160 rank 6 nranks 8 cudaDev 6 nvmlDev 6 busId 8a000 commId 0xfa356df9cba89de0 - Init COMPLETE
40983aab902f:53830:53847 [6] NCCL INFO Init timings - ncclCommInitRankConfig: rank 6 nranks 8 total 3.03 (kernels 1.22, alloc 0.24, bootstrap 0.08, allgathers 0.05, topo 1.11, graphs 0.03, connections 0.31, rest 0.01)
40983aab902f:53830:53830 [7] NCCL INFO Symmetric VA size=80GB
40983aab902f:53830:53830 [6] NCCL INFO Symmetric VA size=80GB
40983aab902f:53830:53830 [5] NCCL INFO Symmetric VA size=80GB
40983aab902f:53830:53830 [4] NCCL INFO Symmetric VA size=80GB
40983aab902f:53830:53830 [3] NCCL INFO Symmetric VA size=80GB
40983aab902f:53830:53830 [2] NCCL INFO Symmetric VA size=80GB
40983aab902f:53830:53830 [1] NCCL INFO Symmetric VA size=80GB
40983aab902f:53830:53830 [0] NCCL INFO Symmetric VA size=80GB
40983aab902f:53830:53868 [5] NCCL INFO Channel 00/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 00/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 01/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 02/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 03/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 01/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 04/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 02/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 00/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 03/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 05/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 01/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 00/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 06/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 02/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 00/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 00/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 04/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 07/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 01/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 01/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 05/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 03/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 01/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 08/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 02/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 02/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 06/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 04/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 03/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 05/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 02/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 03/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 07/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 00/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 04/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 09/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 00/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 06/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 08/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 03/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 10/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 04/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 04/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 01/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 05/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 05/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 09/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 07/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 01/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 11/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 06/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 10/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 12/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 06/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 02/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 05/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 08/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 07/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 02/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 11/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 13/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 07/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 06/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 09/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 03/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 12/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 08/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 08/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 14/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 10/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 03/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 13/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 09/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 09/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 04/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 04/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 07/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 11/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 15/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 08/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 10/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 05/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 05/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 12/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 14/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 09/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 11/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 10/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 16/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 15/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 06/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 12/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 06/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 10/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 16/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 13/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 13/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 07/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 11/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 17/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 11/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 07/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 14/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 12/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 17/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 18/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 14/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 08/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 15/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 08/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 13/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 19/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 12/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 09/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 09/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 14/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 18/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 15/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 20/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 10/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 16/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 15/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 13/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 16/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 11/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 17/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 10/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 14/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 21/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 12/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 16/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 19/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 17/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 22/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 18/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 20/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 11/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 18/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 23/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 13/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 21/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 17/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 15/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 19/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 19/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 12/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 14/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 22/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 20/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 16/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 20/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 13/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 18/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 23/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 15/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 16/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 14/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 21/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 17/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 21/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 19/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 18/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 22/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 17/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 15/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 20/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 23/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 19/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 18/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 21/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 22/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 22/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 19/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 20/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 23/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 16/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 23/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 21/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 20/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 17/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 21/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 22/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 23/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 18/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 22/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 23/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 19/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 20/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 21/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 22/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 23/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Connected all rings, use ring PXN 0 GDR 1
40983aab902f:53830:53869 [4] NCCL INFO Connected all rings, use ring PXN 0 GDR 1
40983aab902f:53830:53871 [2] NCCL INFO Connected all rings, use ring PXN 0 GDR 1
40983aab902f:53830:53866 [7] NCCL INFO Connected all rings, use ring PXN 0 GDR 1
40983aab902f:53830:53867 [6] NCCL INFO Connected all rings, use ring PXN 0 GDR 1
40983aab902f:53830:53870 [3] NCCL INFO Connected all rings, use ring PXN 0 GDR 1
40983aab902f:53830:53868 [5] NCCL INFO Connected all rings, use ring PXN 0 GDR 1
40983aab902f:53830:53873 [0] NCCL INFO Connected all rings, use ring PXN 0 GDR 1
40983aab902f:53830:53830 [0] NCCL INFO comm 0x563cb8577020 rank 0 nranks 8 cudaDev 0 busId 4000 - Destroy COMPLETE
40983aab902f:53830:53830 [7] NCCL INFO comm 0x563cb8dd9b20 rank 7 nranks 8 cudaDev 7 busId 8b000 - Destroy COMPLETE
40983aab902f:53830:53830 [6] NCCL INFO comm 0x563cb8ca7160 rank 6 nranks 8 cudaDev 6 busId 8a000 - Destroy COMPLETE
40983aab902f:53830:53830 [5] NCCL INFO comm 0x563cb8b747a0 rank 5 nranks 8 cudaDev 5 busId 85000 - Destroy COMPLETE
40983aab902f:53830:53830 [4] NCCL INFO comm 0x563cb8a41de0 rank 4 nranks 8 cudaDev 4 busId 84000 - Destroy COMPLETE
40983aab902f:53830:53830 [3] NCCL INFO comm 0x563cb890f420 rank 3 nranks 8 cudaDev 3 busId b000 - Destroy COMPLETE
40983aab902f:53830:53830 [2] NCCL INFO comm 0x563cb87dc830 rank 2 nranks 8 cudaDev 2 busId a000 - Destroy COMPLETE
40983aab902f:53830:53830 [1] NCCL INFO comm 0x563cb86a9c40 rank 1 nranks 8 cudaDev 1 busId 5000 - Destroy COMPLETE
40983aab902f:53830:53830 [7] NCCL INFO ENV/Plugin: Closing env plugin ncclEnvDefault
Steps to Reproduce the Issue
- Build NCCL 28 and NCCL 29, plus nccl-tests (I'm using clang as the C compiler, in case that matters).
- Run ./build/all_reduce_perf -g 8 -b 1024 -e 63488 -f 2 -n 5000 -w 1000 against both NCCL 28 and NCCL 29.
Results:
NCCL 29:
#                                                             out-of-place                      in-place
#       size         count      type   redop    root     time   algbw   busbw #wrong     time   algbw   busbw #wrong
#        (B)    (elements)                               (us)  (GB/s)  (GB/s)            (us)  (GB/s)  (GB/s)
        1024           256     float     sum      -1    37.90    0.03    0.05      0    36.62    0.03    0.05      0
        2048           512     float     sum      -1    37.08    0.06    0.10      0    36.50    0.06    0.10      0
        4096          1024     float     sum      -1    37.35    0.11    0.19      0    37.97    0.11    0.19      0
        8192          2048     float     sum      -1    38.38    0.21    0.37      0    37.66    0.22    0.38      0
       16384          4096     float     sum      -1    38.36    0.43    0.75      0    38.51    0.43    0.74      0
       32768          8192     float     sum      -1    38.25    0.86    1.50      0    37.68    0.87    1.52      0
# Out of bounds values : 0 OK
# Avg bus bandwidth : 0.494946
NCCL 28:
#                                                             out-of-place                      in-place
#       size         count      type   redop    root     time   algbw   busbw #wrong     time   algbw   busbw #wrong
#        (B)    (elements)                               (us)  (GB/s)  (GB/s)            (us)  (GB/s)  (GB/s)
        1024           256     float     sum      -1    34.57    0.03    0.05      0    33.18    0.03    0.05      0
        2048           512     float     sum      -1    33.40    0.06    0.11      0    32.92    0.06    0.11      0
        4096          1024     float     sum      -1    32.51    0.13    0.22      0    33.21    0.12    0.22      0
        8192          2048     float     sum      -1    32.89    0.25    0.44      0    33.51    0.24    0.43      0
       16384          4096     float     sum      -1    32.41    0.51    0.88      0    32.86    0.50    0.87      0
       32768          8192     float     sum      -1    33.56    0.98    1.71      0    33.46    0.98    1.71      0
# Out of bounds values : 0 OK
# Avg bus bandwidth : 0.566818
Latency with NCCL 29 seems to be ~10% worse (higher) than with NCCL 28. Is this expected?
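For anyone who wants to check the numbers, here is a minimal Python sketch of how the per-size slowdown can be derived from the out-of-place time columns above. The latencies are copied from the two tables; the names (`nccl29_us`, `nccl28_us`, `slowdown_pct`) are just illustrative and not part of NCCL or nccl-tests.

```python
# Out-of-place latencies (us) copied from the two result tables above.
sizes = [1024, 2048, 4096, 8192, 16384, 32768]
nccl29_us = [37.90, 37.08, 37.35, 38.38, 38.36, 38.25]
nccl28_us = [34.57, 33.40, 32.51, 32.89, 32.41, 33.56]


def slowdown_pct(new_us, old_us):
    """Relative latency increase of new over old, in percent."""
    return (new_us / old_us - 1.0) * 100.0


# One slowdown figure per message size, NCCL 29 relative to NCCL 28.
slowdowns = [slowdown_pct(n, o) for n, o in zip(nccl29_us, nccl28_us)]

for size, pct in zip(sizes, slowdowns):
    print(f"{size:>6} B: {pct:+.1f}%")
print(f"  mean: {sum(slowdowns) / len(slowdowns):+.1f}%")
```

Positive values mean NCCL 29 is slower at that message size. The same calculation can be repeated with the in-place columns.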
NCCL Version
2.29.7+cuda12.3
Your platform details
Simple 8xH100 machine.
Error Message & Behavior
Latency regressed by ~10%
40983aab902f:53830:53846 [5] NCCL INFO ncclCommInitRankConfig comm 0x563cb8b747a0 rank 5 nranks 8 cudaDev 5 nvmlDev 5 busId 85000 commId 0xfa356df9cba89de0 - Init COMPLETE
40983aab902f:53830:53846 [5] NCCL INFO Init timings - ncclCommInitRankConfig: rank 5 nranks 8 total 3.03 (kernels 1.22, alloc 0.24, bootstrap 0.08, allgathers 0.06, topo 1.11, graphs 0.02, connections 0.29, rest 0.01)
40983aab902f:53830:53841 [0] NCCL INFO ncclCommInitRankConfig comm 0x563cb8577020 rank 0 nranks 8 cudaDev 0 nvmlDev 0 busId 4000 commId 0xfa356df9cba89de0 - Init COMPLETE
40983aab902f:53830:53841 [0] NCCL INFO Init timings - ncclCommInitRankConfig: rank 0 nranks 8 total 3.03 (kernels 1.22, alloc 0.24, bootstrap 0.08, allgathers 0.03, topo 1.10, graphs 0.05, connections 0.28, rest 0.02)
40983aab902f:53830:53844 [3] NCCL INFO ncclCommInitRankConfig comm 0x563cb890f420 rank 3 nranks 8 cudaDev 3 nvmlDev 3 busId b000 commId 0xfa356df9cba89de0 - Init COMPLETE
40983aab902f:53830:53844 [3] NCCL INFO Init timings - ncclCommInitRankConfig: rank 3 nranks 8 total 3.03 (kernels 1.22, alloc 0.24, bootstrap 0.08, allgathers 0.06, topo 1.10, graphs 0.03, connections 0.27, rest 0.03)
40983aab902f:53830:53845 [4] NCCL INFO ncclCommInitRankConfig comm 0x563cb8a41de0 rank 4 nranks 8 cudaDev 4 nvmlDev 4 busId 84000 commId 0xfa356df9cba89de0 - Init COMPLETE
40983aab902f:53830:53845 [4] NCCL INFO Init timings - ncclCommInitRankConfig: rank 4 nranks 8 total 3.03 (kernels 1.22, alloc 0.24, bootstrap 0.08, allgathers 0.05, topo 1.11, graphs 0.03, connections 0.29, rest 0.02)
40983aab902f:53830:53843 [2] NCCL INFO ncclCommInitRankConfig comm 0x563cb87dc830 rank 2 nranks 8 cudaDev 2 nvmlDev 2 busId a000 commId 0xfa356df9cba89de0 - Init COMPLETE
40983aab902f:53830:53843 [2] NCCL INFO Init timings - ncclCommInitRankConfig: rank 2 nranks 8 total 3.03 (kernels 1.22, alloc 0.24, bootstrap 0.08, allgathers 0.07, topo 1.11, graphs 0.01, connections 0.28, rest 0.02)
40983aab902f:53830:53848 [7] NCCL INFO ncclCommInitRankConfig comm 0x563cb8dd9b20 rank 7 nranks 8 cudaDev 7 nvmlDev 7 busId 8b000 commId 0xfa356df9cba89de0 - Init COMPLETE
40983aab902f:53830:53848 [7] NCCL INFO Init timings - ncclCommInitRankConfig: rank 7 nranks 8 total 3.03 (kernels 1.22, alloc 0.24, bootstrap 0.08, allgathers 0.07, topo 1.11, graphs 0.02, connections 0.28, rest 0.02)
40983aab902f:53830:53847 [6] NCCL INFO ncclCommInitRankConfig comm 0x563cb8ca7160 rank 6 nranks 8 cudaDev 6 nvmlDev 6 busId 8a000 commId 0xfa356df9cba89de0 - Init COMPLETE
40983aab902f:53830:53847 [6] NCCL INFO Init timings - ncclCommInitRankConfig: rank 6 nranks 8 total 3.03 (kernels 1.22, alloc 0.24, bootstrap 0.08, allgathers 0.05, topo 1.11, graphs 0.03, connections 0.31, rest 0.01)
40983aab902f:53830:53830 [7] NCCL INFO Symmetric VA size=80GB
40983aab902f:53830:53830 [6] NCCL INFO Symmetric VA size=80GB
40983aab902f:53830:53830 [5] NCCL INFO Symmetric VA size=80GB
40983aab902f:53830:53830 [4] NCCL INFO Symmetric VA size=80GB
40983aab902f:53830:53830 [3] NCCL INFO Symmetric VA size=80GB
40983aab902f:53830:53830 [2] NCCL INFO Symmetric VA size=80GB
40983aab902f:53830:53830 [1] NCCL INFO Symmetric VA size=80GB
40983aab902f:53830:53830 [0] NCCL INFO Symmetric VA size=80GB
40983aab902f:53830:53868 [5] NCCL INFO Channel 00/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 00/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 01/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 02/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 03/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 01/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 04/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 02/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 00/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 03/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 05/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 01/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 00/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 06/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 02/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 00/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 00/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 04/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 07/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 01/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 01/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 05/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 03/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 01/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 08/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 02/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 02/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 06/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 04/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 03/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 05/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 02/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 03/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 07/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 00/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 04/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 09/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 00/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 06/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 08/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 03/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 10/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 04/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 04/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 01/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 05/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 05/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 09/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 07/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 01/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 11/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 06/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 10/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 12/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 06/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 02/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 05/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 08/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 07/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 02/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 11/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 13/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 07/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 06/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 09/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 03/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 12/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 08/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 08/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 14/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 10/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 03/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 13/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 09/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 09/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 04/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 04/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 07/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 11/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 15/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 08/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 10/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 05/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 05/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 12/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 14/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 09/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 11/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 10/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 16/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 15/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 06/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 12/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 06/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 10/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 16/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 13/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 13/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 07/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 11/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 17/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 11/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 07/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 14/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 12/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 17/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 18/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 14/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 08/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 15/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 08/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 13/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 19/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 12/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 09/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 09/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 14/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 18/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 15/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 20/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 10/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 16/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 15/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 13/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 16/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 11/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 17/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 10/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 14/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 21/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 12/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 16/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 19/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 17/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 22/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 18/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 20/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 11/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 18/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53868 [5] NCCL INFO Channel 23/0 : 5[5] -> 6[6] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 13/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 21/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 17/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 15/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 19/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 19/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 12/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 14/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 22/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 20/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 16/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 20/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 13/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 18/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53867 [6] NCCL INFO Channel 23/0 : 6[6] -> 7[7] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 15/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 16/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 14/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 21/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 17/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 21/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 19/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 18/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 22/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 17/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 15/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 20/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53871 [2] NCCL INFO Channel 23/0 : 2[2] -> 3[3] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 19/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 18/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 21/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 22/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 22/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 19/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 20/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53869 [4] NCCL INFO Channel 23/0 : 4[4] -> 5[5] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 16/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53866 [7] NCCL INFO Channel 23/0 : 7[7] -> 0[0] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 21/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 20/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 17/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 21/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 22/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Channel 23/0 : 1[1] -> 2[2] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 18/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 22/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53870 [3] NCCL INFO Channel 23/0 : 3[3] -> 4[4] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 19/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 20/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 21/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 22/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53873 [0] NCCL INFO Channel 23/0 : 0[0] -> 1[1] via P2P/direct pointer
40983aab902f:53830:53872 [1] NCCL INFO Connected all rings, use ring PXN 0 GDR 1
40983aab902f:53830:53869 [4] NCCL INFO Connected all rings, use ring PXN 0 GDR 1
40983aab902f:53830:53871 [2] NCCL INFO Connected all rings, use ring PXN 0 GDR 1
40983aab902f:53830:53866 [7] NCCL INFO Connected all rings, use ring PXN 0 GDR 1
40983aab902f:53830:53867 [6] NCCL INFO Connected all rings, use ring PXN 0 GDR 1
40983aab902f:53830:53870 [3] NCCL INFO Connected all rings, use ring PXN 0 GDR 1
40983aab902f:53830:53868 [5] NCCL INFO Connected all rings, use ring PXN 0 GDR 1
40983aab902f:53830:53873 [0] NCCL INFO Connected all rings, use ring PXN 0 GDR 1
40983aab902f:53830:53830 [0] NCCL INFO comm 0x563cb8577020 rank 0 nranks 8 cudaDev 0 busId 4000 - Destroy COMPLETE
40983aab902f:53830:53830 [7] NCCL INFO comm 0x563cb8dd9b20 rank 7 nranks 8 cudaDev 7 busId 8b000 - Destroy COMPLETE
40983aab902f:53830:53830 [6] NCCL INFO comm 0x563cb8ca7160 rank 6 nranks 8 cudaDev 6 busId 8a000 - Destroy COMPLETE
40983aab902f:53830:53830 [5] NCCL INFO comm 0x563cb8b747a0 rank 5 nranks 8 cudaDev 5 busId 85000 - Destroy COMPLETE
40983aab902f:53830:53830 [4] NCCL INFO comm 0x563cb8a41de0 rank 4 nranks 8 cudaDev 4 busId 84000 - Destroy COMPLETE
40983aab902f:53830:53830 [3] NCCL INFO comm 0x563cb890f420 rank 3 nranks 8 cudaDev 3 busId b000 - Destroy COMPLETE
40983aab902f:53830:53830 [2] NCCL INFO comm 0x563cb87dc830 rank 2 nranks 8 cudaDev 2 busId a000 - Destroy COMPLETE
40983aab902f:53830:53830 [1] NCCL INFO comm 0x563cb86a9c40 rank 1 nranks 8 cudaDev 1 busId 5000 - Destroy COMPLETE
40983aab902f:53830:53830 [7] NCCL INFO ENV/Plugin: Closing env plugin ncclEnvDefault
Steps to Reproduce the Issue
Results:
NCCL 29:
NCCL 28:
Latency with NCCL 29 seems to be ~10% worse (higher) than with NCCL 28. Is this expected?
NCCL Version
2.29.7+cuda12.3
Your platform details
Simple 8xH100 machine.
Error Message & Behavior
Latency regressed by ~10% with NCCL 29 compared to NCCL 28.
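
In case it helps with comparing the NCCL 28 and NCCL 29 logs side by side, here is a minimal sketch (my own helper, not part of NCCL) that extracts the per-rank `Init timings` lines from an `NCCL INFO` log and computes a relative change between two numbers. The names `parse_init_timings` and `percent_change` are hypothetical, and the regular expression assumes the line layout seen in the log above. Note that this only covers communicator initialization time, not the collective latency reported here.

```python
import re

# Assumed layout, copied from the log above:
# "... NCCL INFO Init timings - ncclCommInitRankConfig: rank 1 nranks 8 total 3.03 (kernels 1.22, alloc 0.24, ...)"
INIT_TIMINGS = re.compile(
    r"Init timings - \w+: rank (\d+) nranks (\d+) total ([\d.]+) \((.*)\)"
)


def parse_init_timings(log_text):
    """Return {rank: {"total": float, "<phase>": float, ...}} for every Init timings line."""
    timings = {}
    for line in log_text.splitlines():
        match = INIT_TIMINGS.search(line)
        if match is None:
            continue  # not an Init timings line
        rank = int(match.group(1))
        phases = {"total": float(match.group(3))}
        # The parenthesised part is a comma-separated list of "<phase> <seconds>" pairs.
        for item in match.group(4).split(", "):
            name, value = item.rsplit(" ", 1)
            phases[name] = float(value)
        timings[rank] = phases
    return timings


def percent_change(old, new):
    """Relative change from old to new in percent (positive means new is larger)."""
    return (new - old) / old * 100.0


# One line taken verbatim from the NCCL 29 log above.
sample = (
    "40983aab902f:53830:53842 [1] NCCL INFO Init timings - ncclCommInitRankConfig: "
    "rank 1 nranks 8 total 3.03 (kernels 1.22, alloc 0.24, bootstrap 0.08, "
    "allgathers 0.03, topo 1.10, graphs 0.05, connections 0.27, rest 0.03)"
)
print(parse_init_timings(sample))
```

Running `parse_init_timings` on both full logs and feeding matching fields into `percent_change` would show whether any initialization phase differs between the two versions.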