SRPO: Self-Referential Policy Optimization for Vision-Language-Action Models
Paper
• 2511.15605
• Published • 25
VLA-RFT: Vision-Language-Action Reinforcement Fine-tuning with Verified Rewards in World Simulators
Paper
• 2510.00406
• Published • 68
Paper
• 2605.03269
• Published • 95
MolmoAct2: Action Reasoning Models for Real-world Deployment
Paper
• 2605.02881
• Published • 261
VLA-RAIL: A Real-Time Asynchronous Inference Linker for VLA Models and Robots
Paper
• 2512.24673
• Published
Green-VLA: Staged Vision-Language-Action Model for Generalist Robots
Paper
• 2602.00919
• Published • 324
Towards Long-Lived Robots: Continual Learning VLA Models via Reinforcement Fine-Tuning
Paper
• 2602.10503
• Published
MMaDA-VLA: Large Diffusion Vision-Language-Action Model with Unified Multi-Modal Instruction and Generation
Paper
• 2603.25406
• Published • 5
MobileVLA-R1: Reinforcing Vision-Language-Action for Mobile Robots
Paper
• 2511.17889
• Published • 5
How Fast Can I Run My VLA? Demystifying VLA Inference Performance with VLA-Perf
Paper
• 2602.18397
• Published
Spatial-Aware VLA Pretraining through Visual-Physical Alignment from Human Videos
Paper
• 2512.13080
• Published • 17
Long-Term Memory for VLA-based Agents in Open-World Task Execution
Paper
• 2604.15671
• Published
Interleave-VLA: Enhancing Robot Manipulation with Interleaved Image-Text Instructions
Paper
• 2505.02152
• Published
TIC-VLA: A Think-in-Control Vision-Language-Action Model for Robot Navigation in Dynamic Environments
Paper
• 2602.02459
• Published • 4
When Robots Obey the Patch: Universal Transferable Patch Attacks on Vision-Language-Action Models
Paper
• 2511.21192
• Published
Towards Accessible Physical AI: LoRA-Based Fine-Tuning of VLA Models for Real-World Robot Control
Paper
• 2512.11921
• Published
A Dual Process VLA: Efficient Robotic Manipulation Leveraging VLM
Paper
• 2410.15549
• Published
End-to-End Dexterous Arm-Hand VLA Policies via Shared Autonomy: VR Teleoperation Augmented by Autonomous Hand VLA Policy for Efficient Data Collection
Paper
• 2511.00139
• Published • 1
LLaDA-VLA: Vision Language Diffusion Action Models
Paper
• 2509.06932
• Published
HiMoE-VLA: Hierarchical Mixture-of-Experts for Generalist Vision-Language-Action Policies
Paper
• 2512.05693
• Published • 1
On-the-Fly VLA Adaptation via Test-Time Reinforcement Learning
Paper
• 2601.06748
• Published
TrackVLA++: Unleashing Reasoning and Memory Capabilities in VLA Models for Embodied Visual Tracking
Paper
• 2510.07134
• Published
World-Gymnast: Training Robots with Reinforcement Learning in a World Model
Paper
• 2602.02454
• Published
Butter-Bench: Evaluating LLM Controlled Robots for Practical Intelligence
Paper
• 2510.21860
• Published
RoboGene: Boosting VLA Pre-training via Diversity-Driven Agentic Framework for Real-World Task Generation
Paper
• 2602.16444
• Published
WoVR: World Models as Reliable Simulators for Post-Training VLA Policies with RL
Paper
• 2602.13977
• Published • 2
Efficient Vision-Language-Action Models for Embodied Manipulation: A Systematic Survey
Paper
• 2510.17111
• Published
Pure Vision Language Action (VLA) Models: A Comprehensive Survey
Paper
• 2509.19012
• Published
VLA-0: Building State-of-the-Art VLAs with Zero Modification
Paper
• 2510.13054
• Published • 16
10 Open Challenges Steering the Future of Vision-Language-Action Models
Paper
• 2511.05936
• Published • 6
Evo-0: Vision-Language-Action Model with Implicit Spatial Understanding
Paper
• 2507.00416
• Published
An Anatomy of Vision-Language-Action Models: From Modules to Milestones and Challenges
Paper
• 2512.11362
• Published • 23
NanoVLA: Routing Decoupled Vision-Language Understanding for Nano-sized Generalist Robotic Policies
Paper
• 2510.25122
• Published
VLA-JEPA: Enhancing Vision-Language-Action Model with Latent World Model
Paper
• 2602.10098
• Published • 19
JEPA-VLA: Video Predictive Embedding is Needed for VLA Models
Paper
• 2602.11832
• Published