🤝 Open to Collab
Frolov Anatolii
ssurface
AI & ML interests
None yet
Recent Activity
updated a collection 3 days ago
GRPO SFT-Length-Punishment-GDPO SFT GSM8K
updated a model 3 days ago
ssurface/qwen3-4b-gdpo-length-sft-l5
published a model 3 days ago
ssurface/qwen3-4b-gdpo-length-sft-l5
Organizations
models 32
ssurface/qwen3-4b-gdpo-length-sft-l5
Text Generation • 4B • Updated • 243
ssurface/qwen3-4b-gdpo-length-sft-l4
Text Generation • 4B • Updated • 245
ssurface/qwen3-4b-gdpo-length-sft-l3
Text Generation • 4B • Updated • 237
ssurface/qwen3-4b-gdpo-length-sft-l2
Text Generation • 4B • Updated • 237
ssurface/qwen3-4b-gdpo-length-sft-l1
Text Generation • 4B • Updated • 237
ssurface/qwen3-4b-grpo-nolength-l5
Text Generation • 4B • Updated • 16
ssurface/qwen3-4b-grpo-nolength-l4
Text Generation • 4B • Updated • 17
ssurface/qwen3-4b-grpo-nolength-l3
Text Generation • 4B • Updated • 20
ssurface/qwen3-4b-grpo-nolength-l2
Text Generation • 4B • Updated • 18
ssurface/qwen3-4b-grpo-nolength-l1
Text Generation • 4B • Updated • 13