RyanYr/pg_sais-dapo_shuffled-offline-grpo_qwen2.5-math-1.5B_kl_bl0_matheval Updated 19 days ago • 146
RyanYr/pg_sais-dapo_shuffled-offline-grpo_qwen2.5-math-1.5B_kl_bl0_matheval Updated 19 days ago • 146
RyanYr/pg-dapo_shuffled-01_offline-grpo_qwen2.5-math-1.5B_piref_kl_matheval Updated 19 days ago • 162
RyanYr/pg-dapo_shuffled-01_offline-grpo_qwen2.5-math-1.5B_piref_kl_matheval Updated 19 days ago • 162
RyanYr/pg-dapo_shuffled-10_offline-grpo_qwen2.5-math-1.5B_piref_nokl_matheval Viewer • Updated 19 days ago • 1.55k • 36
RyanYr/pg-dapo_shuffled-10_offline-grpo_qwen2.5-math-1.5B_piref_nokl_matheval Viewer • Updated 19 days ago • 1.55k • 36