[ROCm] triton: skip dequantize_nvfp4 #47

Open: Apophis3158 wants to merge 1 commit into Comfy-Org:main from Apophis3158:rocm-skip-deq_nvfp4
@Apophis3158

On AMD GPUs (ROCm), the triton backend crashes as soon as it is selected for dequantize_nvfp4:

[DEBUG] Backend triton selected for dequantize_per_tensor_fp8
[DEBUG] Backend triton selected for dequantize_per_tensor_fp8
[DEBUG] Backend triton selected for dequantize_nvfp4
error: couldn't allocate input reg for constraint 'r'

This PR adds a way to skip the triton backend for dequantize_nvfp4 on ROCm.

Tested with Flux 2 klein: dequantize_nvfp4 now falls back to the eager backend, while dequantize_per_tensor_fp8 still uses triton:

[DEBUG] Backend triton selected for dequantize_per_tensor_fp8
[DEBUG] Backend eager selected for dequantize_nvfp4
[DEBUG] Backend eager selected for dequantize_nvfp4
[DEBUG] Backend eager selected for dequantize_nvfp4
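
For readers who want a picture of the idea, here is a minimal sketch of a per-backend skip list on ROCm. It is not the code in this PR: `ROCM_SKIP`, `is_rocm`, and `select_backend` are hypothetical names made up for illustration, and the only real API assumed is `torch.version.hip`, which is set on ROCm builds of PyTorch and `None` otherwise.

```python
# Hypothetical sketch; none of these names come from the actual patch.
# Maps a backend name to the ops it must not handle on ROCm.
ROCM_SKIP = {"triton": {"dequantize_nvfp4"}}


def is_rocm():
    """Return True when the installed PyTorch is a ROCm (HIP) build."""
    try:
        import torch
    except ImportError:
        return False
    # torch.version.hip is a version string on ROCm builds, None otherwise.
    return getattr(torch.version, "hip", None) is not None


def select_backend(op_name, backends, rocm=None):
    """Return the first backend, in priority order, that is not skipped."""
    if rocm is None:
        rocm = is_rocm()
    for backend in backends:
        if rocm and op_name in ROCM_SKIP.get(backend, set()):
            continue  # fall through to the next backend, e.g. eager
        return backend
    raise RuntimeError(f"no usable backend for {op_name}")
```

With a priority list of `["triton", "eager"]`, this sketch would pick eager for dequantize_nvfp4 on ROCm and leave every other op and platform on triton, which matches the log output above.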
