The MiniMax-M2 Series: Mini Activations Unleashing Max Real-World Intelligence Paper • 2605.26494 • Published 23 days ago • 41
The MiniMax-M2 Series: Mini Activations Unleashing Max Real-World Intelligence Paper • 2605.26494 • Published 23 days ago • 41
MaxProof: Scaling Mathematical Proof with Generative-Verifier RL and Population-Level Test-Time Scaling Paper • 2606.13473 • Published 7 days ago • 88
RLinf-USER: A Unified and Extensible System for Real-World Online Policy Learning in Embodied AI Paper • 2602.07837 • Published Feb 8 • 57
RLinf: Flexible and Efficient Large-scale Reinforcement Learning via Macro-to-Micro Flow Transformation Paper • 2509.15965 • Published Sep 19, 2025 • 19
WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning Paper • 2602.04634 • Published Feb 4 • 100
WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning Paper • 2602.04634 • Published Feb 4 • 100
WideSeek-R1 Collection WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning • 5 items • Updated Mar 13