arxiv:2604.28123
Sudong Wang PRO
xiao45791
AI & ML interests
None yet
Recent Activity
commented on a paper 2 days ago
Beyond SFT-to-RL: Pre-alignment via Black-Box On-Policy Distillation for Multimodal RL
authored a paper 2 days ago
Beyond SFT-to-RL: Pre-alignment via Black-Box On-Policy Distillation for Multimodal RL
updated a dataset 3 days ago
prism-vlm/rl_dataset