Jackrong/Qwen3.5-4B-Claude-4.6-Opus-Reasoning-Distilled Image-Text-to-Text • 5B • Updated Apr 6 • 8.99k • 29
Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled Image-Text-to-Text • 28B • Updated Apr 6 • 246k • 2.84k
view article Article A Review on the Evolvement of Load Balancing Strategy in MoE LLMs: Pitfalls and Lessons NormalUhr • Feb 4, 2025 • 35
LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters! Paper • 2502.07374 • Published Feb 11, 2025 • 40