
Fix MXFP4 triton kernel launch on triton>=3.x (seed=None -> int) #9

Open
kylefoxaustin wants to merge 1 commit into IST-DASLab:main from kylefoxaustin:fix-mxfp4-triton-seed-none-int

@kylefoxaustin

Problem

Pseudo-quantization training with QuestMXFP4Quantizer (the config in main_setup.sh) crashes at the first forward pass on recent triton:

TypeError: 'NoneType' object cannot be interpreted as an integer
  .../triton/backends/nvidia/driver.py", line 713, in __call__

Root cause

In mxfp4_forward_kernel_wrapper, seed is set to None when stochastic_round is False:

if stochastic_round:
    seed = randint(0, 1000000)
else:
    seed = None          # passed to the kernel's `seed: int` (non-constexpr) arg

The kernel declares seed: int (non-constexpr), so triton tries to pack it into the launch arguments. Older triton tolerated an unused None; triton >= 3.x rejects it at launch time. Since seed is only read inside the if stochastic_round: branch (dead-code-eliminated when False), the value is never used in this path.
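The failure mode can be reproduced without triton or a GPU. The sketch below is only an illustration: `pack_int_arg` is a hypothetical stand-in for whatever integer conversion the launcher performs, not triton's actual code. It relies on the standard-library `operator.index`, which raises the same kind of `TypeError` for `None` that appears in the traceback above.

```python
import operator


def pack_int_arg(value):
    """Hypothetical stand-in for a launcher converting a Python arg to an integer.

    This is NOT triton's implementation; it only mimics the integer conversion
    that a non-constexpr `int` kernel argument has to go through.
    """
    return operator.index(value)


def try_pack(value):
    """Return (packed_value, error_message); exactly one of the two is None."""
    try:
        return pack_int_arg(value), None
    except TypeError as exc:
        return None, str(exc)


# An integer seed converts cleanly.
ok_value, ok_err = try_pack(0)

# None cannot be converted, even if the kernel never reads the argument.
none_value, none_err = try_pack(None)

print("seed=0    ->", ok_value, ok_err)
print("seed=None ->", none_value, none_err)
```

The point of the sketch is that the conversion happens before the kernel runs, so it does not matter that the compiled kernel never reads `seed` on the non-stochastic path.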

Fix

Pass 0 instead of None. This is behavior-preserving: the value is never read when stochastic_round=False.
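A minimal sketch of the corrected seed selection, pulled out into a standalone helper so it can run on its own. The helper name `choose_seed` is hypothetical, and the sketch assumes `randint` is Python's `random.randint`; in the actual change the same logic stays inline in mxfp4_forward_kernel_wrapper.

```python
from random import randint


def choose_seed(stochastic_round: bool) -> int:
    """Pick the value passed to the kernel's `seed: int` argument.

    Hypothetical helper for illustration; always returns an int so the
    launcher can pack it, whether or not the kernel reads it.
    """
    if stochastic_round:
        # Fresh seed for stochastic rounding, same range as the original code.
        return randint(0, 1000000)
    # Placeholder: the kernel does not read seed when stochastic_round is False.
    return 0


deterministic_seed = choose_seed(False)
stochastic_seed = choose_seed(True)
print(deterministic_seed, stochastic_seed)
```

Any integer would work as the placeholder; 0 is simply the most obvious "unused" value.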

Testing

With this one-line change, the full recipe (QuestMXFP4 weights+activations + AlbertTsengQuantizer 4-bit gradients + the Q(E)Q(Wt)t_Q(Et)Q(Xt)t backward scheme) trains end-to-end. Verified on triton 3.6 / torch 2.11+cu128 / RTX 5090 (sm_120) with a 30M Llama on WikiText-103; the FP4 validation loss tracks the BF16 baseline to within ~0.07 nats.

The mxfp4 forward kernel declares `seed: int` (non-constexpr). The wrapper passes
seed=None when stochastic_round is False, which triton>=3.x rejects at launch with
"'NoneType' object cannot be interpreted as an integer". seed is only read inside
the stochastic_round=True branch (dead-code-eliminated when False), so passing 0 is
safe and behavior-preserving.