STaR (Self-Taught Reasoner) implementation on GSM8K: Zero-Shot CoT vs. Vanilla SFT vs. STaR with Llama 3.2-3B
star mathematical-reasoning llm gsm8k llm-training llm-inference chain-of-thought-reasoning llm-evaluation llm-finetuning llm-reasoning llama3 supervised-fine-tuning star-bootstrapping self-taught-reasoner
Updated Feb 28, 2026 - Python