NVIDIA OmniDreams: Real-Time Generative World Model for Closed-Loop Autonomous Vehicle Simulation Paper • 2606.03159 • Published 2 days ago • 14
Linear Ensembles Wash Away Watermarks: On the Fragility of Distributional Perturbations in LLMs Paper • 2605.30501 • Published 7 days ago • 28
NITP: Next Implicit Token Prediction for LLM Pre-training Paper • 2605.24956 • Published 11 days ago • 31
Where to Look: Can Foundation Models Reach a Target Viewpoint Through Active Exploration? Paper • 2606.01247 • Published 4 days ago • 25
Joint Agent Memory and Exploration Learning via Novelty Signals Paper • 2606.01528 • Published 3 days ago • 12
StreamChar: Long-Horizon Streaming Character Audio-Video Generation with Decoupled Orchestration Paper • 2605.25659 • Published 10 days ago • 14
SkillAdaptor: Self-Adapting Skills for LLM Agents from Trajectories Paper • 2606.01311 • Published 4 days ago • 27
Masking Stale Observations Helps Search Agents -- Until It Doesn't: A Regime Map and Its Mechanism Paper • 2606.00408 • Published 6 days ago • 53
RoboStressBench: Benchmarking VLM Robustness to Physical Visual Stress in Embodied Scenes Paper • 2606.00828 • Published 5 days ago • 9
Speculative Pipeline Decoding: Higher-Accuracy and Zero-Bubble Speculation via Pipeline Parallelism Paper • 2605.30852 • Published 6 days ago • 9
Skill is Not One-Size-Fits-All: Model-Aware Skill Alignment for LLM Agents Paper • 2605.30723 • Published 6 days ago • 13
OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents Paper • 2606.02031 • Published 3 days ago • 15
When Does Multi-Agent RL Improve LLM Workflows? Workflow, Scale, and Policy-Sharing Tradeoffs Paper • 2605.24202 • Published 13 days ago • 14
X-Stream: Exploring MLLMs as Multiplexers for Multi-Stream Understanding Paper • 2606.02482 • Published 3 days ago • 30
VLMs are Good Teachers for Video Reasoning via Adaptive Test-Time Optimization Paper • 2606.02564 • Published 3 days ago • 25
Crafter: A Multi-Agent Harness for Editable Scientific Figure Generation from Diverse Inputs Paper • 2605.30611 • Published 7 days ago • 170
Which Pretraining Paradigm Better Serves Spatial Intelligence? An Empirical Comparison of Vision-Language and Video Generation Models Paper • 2605.28132 • Published 8 days ago • 22
CollectionLoRA: Collecting 50 Effects in 1 LoRA via Multi-Teacher On-Policy Distillation Paper • 2605.25378 • Published 10 days ago • 58
OmniRetrieval: Unified Retrieval across Heterogeneous Knowledge Sources Paper • 2605.29250 • Published 7 days ago • 75