Soohak: A Mathematician-Curated Benchmark for Evaluating Research-level Math Capabilities of LLMs Paper • 2605.09063 • Published 17 days ago • 78
WorldCam: Interactive Autoregressive 3D Gaming Worlds with Camera Pose as a Unifying Geometric Representation Paper • 2603.16871 • Published Mar 17 • 61
Judging What We Cannot Solve: A Consequence-Based Approach for Oracle-Free Evaluation of Research-Level Math Paper • 2602.06291 • Published Feb 6 • 24
VLA-0: Building State-of-the-Art VLAs with Zero Modification Paper • 2510.13054 • Published Oct 15, 2025 • 16