When I ran fscd_test.sh, I got a "CUDA out of memory" error, although train_det.sh and train_sim.sh both ran without any problem. My GPU is a 3080. Is there any way to solve this problem? Thanks.
The details of the error are as follows:
Traceback (most recent call last):
  File "main.py", line 693, in <module>
    evaluate(args)
  File "/home/anaconda3/envs/dave/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "main.py", line 84, in evaluate
    out, aux, tblr, boxes_pred = model(img, bboxes, test.image_names[ids[0].item()])
  File "/home/anaconda3/envs/dave/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/anaconda3/envs/dave/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 166, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/anaconda3/envs/dave/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/code/DAVE/models/dave.py", line 442, in forward
    dst_mtx = self.cosine_sim(feat_pairs[None, :], feat_pairs[:, None]).cpu().numpy()
  File "/home/anaconda3/envs/dave/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/anaconda3/envs/dave/lib/python3.8/site-packages/torch/nn/modules/distance.py", line 77, in forward
    return F.cosine_similarity(x1, x2, self.dim, self.eps)
RuntimeError: CUDA out of memory. Tried to allocate 5.96 GiB (GPU 2; 23.69 GiB total capacity; 12.26 GiB already allocated; 3.46 GiB free; 18.85 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
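
The failing line in dave.py broadcasts feat_pairs against itself, which likely builds an intermediate of roughly N x N x D elements on the GPU before reducing it to the N x N similarity matrix. One possible workaround is to compute the same matrix in row chunks with a matrix product, so that only a (chunk_size, N) block lives on the GPU at a time. The following is a minimal sketch, not the DAVE authors' method. It assumes feat_pairs is a 2-D tensor of shape (N, D) and that cosine_sim reduces over the last dimension; the function name pairwise_cosine_chunked and the chunk_size parameter are hypothetical.

```python
import torch
import torch.nn.functional as F


def pairwise_cosine_chunked(feats: torch.Tensor, chunk_size: int = 1024) -> torch.Tensor:
    """Return the (N, N) cosine-similarity matrix of the rows of `feats` on the CPU.

    Hypothetical helper (not part of DAVE). Assumes `feats` has shape (N, D).
    """
    # Normalise each row to unit length once; cosine similarity then
    # reduces to a plain dot product between normalised rows.
    normed = F.normalize(feats, dim=1, eps=1e-8)

    n = feats.shape[0]
    # The full result is stored on the CPU, so the GPU never holds more
    # than one (chunk_size, N) block of similarities at a time.
    out = torch.empty(n, n, dtype=feats.dtype)

    for start in range(0, n, chunk_size):
        block = normed[start:start + chunk_size] @ normed.T
        out[start:start + chunk_size] = block.cpu()
    return out
```

Under those assumptions, the line at dave.py:442 could be tried as `dst_mtx = pairwise_cosine_chunked(feat_pairs).numpy()`. Results should match nn.CosineSimilarity up to floating-point error, except for near-zero feature vectors, where the eps handling differs slightly. Separately, the error message itself suggests setting max_split_size_mb through PYTORCH_CUDA_ALLOC_CONF, since reserved memory (18.85 GiB) is well above allocated memory (12.26 GiB).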