VaaWIT: Visual-Aware Adaptation of Large Language Models for Multilingual Web Image Translation Paper • 2605.24675 • Published 9 days ago • 4
Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players Paper • 2605.28816 • Published 5 days ago • 404
OpenComputer: Verifiable Software Worlds for Computer-Use Agents Paper • 2605.19769 • Published 13 days ago • 81
Full Attention Strikes Back: Transferring Full Attention into Sparse within Hundred Training Steps Paper • 2605.16928 • Published 16 days ago • 93
Sensor2Sensor: Cross-Embodiment Sensor Conversion for Autonomous Driving Paper • 2605.22809 • Published 11 days ago • 27
Video2GUI: Synthesizing Large-Scale Interaction Trajectories for Generalized GUI Agent Pretraining Paper • 2605.14747 • Published 18 days ago • 145
Physics-R1: An Audited Olympiad Corpus and Recipe for Visual Physics Reasoning Paper • 2605.14040 • Published 19 days ago • 5
The Extrapolation Cliff in On-Policy Distillation of Near-Deterministic Structured Outputs Paper • 2605.08737 • Published 23 days ago • 3
HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation Paper • 2604.28196 • Published Apr 30 • 72
OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation Paper • 2604.18486 • Published Apr 20 • 95
LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model Paper • 2604.20796 • Published Apr 22 • 242