This collection includes the models used in the paper "Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe".
- lllyx/Qwen3-1.7B-SFT
  Text Generation • 2B • 471 • 3
- Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe
  Paper • 2604.13016 • 108
- lllyx/Qwen3-4B-Base-GRPO
  Text Generation • 4B • 319 • 3
- lllyx/OpenThought3-Qwen3-4B
  Viewer • 305k • 151 • 2