The ultimate guide to RL environments: building and scaling them in the LLM era • Building and scaling RL environments for LLM training • Running • 164
Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled Image-Text-to-Text • 28B • Updated Apr 6 • 214k • 2.84k
brando/olympiad-bench-imo-math-boxed-825-v2-21-08-2024 Viewer • Updated Nov 6, 2024 • 1.65k • 201 • 5