Lean4Agent: Formal Modeling and Verification for Agent Workflow and Trajectory Paper • 2606.06523 • Published 8 days ago • 2
AsyncWebRL: Efficient Multi-Step RL for Visual Web Agents Paper • 2606.05597 • Published 6 days ago • 3
Ψ-Bench: Evaluating Persona-Sensitive Influencing in Persuasive Dialogues Paper • 2606.02754 • Published 8 days ago • 13
On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters Paper • 2606.02437 • Published 9 days ago • 223
CorrectKLinRL/Qwen3-1.7B-Base-prlCurrentKL-eta100-forward_k3-clipLow_inf-clipHigh_inf 2B • Updated 22 days ago • 61
CorrectKLinRL/Qwen3-1.7B-Base-prlCurrentKL-eta100-reverse_k3-clipLow_inf-clipHigh_inf 2B • Updated 22 days ago • 19