OpenSkillEval: Automatically Auditing the Open Skill Ecosystem for LLM Agents Paper • 2605.23657 • Published 9 days ago • 8
Crafter: A Multi-Agent Harness for Editable Scientific Figure Generation from Diverse Inputs Paper • 2605.30611 • Published 9 days ago • 189
RoboSemanticBench: Diagnosing Semantic Grounding in Action Prediction for VLA Models Paper • 2606.02277 • Published 5 days ago • 7
Towards Verifiable Multimodal Deep Research: A Multi-Agent Harness for Interleaved Report Generation Paper • 2605.29861 • Published 9 days ago • 16
AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration Paper • 2605.20025 • Published 18 days ago • 185
RigidFormer: Learning Rigid Dynamics using Transformers Paper • 2605.09196 • Published 28 days ago • 14
MDN: Parallelizing Stepwise Momentum for Delta Linear Attention Paper • 2605.05838 • Published 30 days ago • 5
Are We Making Progress in Multimodal Domain Generalization? A Comprehensive Benchmark Study Paper • 2605.06643 • Published 30 days ago • 4
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published Apr 2 • 506