Thanks for the great work! I notice CLIP and SLIP use `all_gather_batch` and `all_gather_batch_with_grad`, respectively. What's the difference between the two? Thanks!