SISReL
PRO
0 followers · 1 following
https://sisrel.kaist.ac.kr
sisrel
AI & ML interests
None yet
Recent Activity
Updated a model 5 days ago: SISReL/math-RLRTcorrect-RLSDincorrect-Qwen3-4B-Base-1.0
Updated a model 5 days ago: SISReL/math-RLRTcorrect-RLSDincorrect-Qwen3-8B-Base-1.0
Published a model 8 days ago: SISReL/math-RLRTcorrect-RLSDincorrect-Qwen3-8B-Base-1.0
SISReL's models (44), sorted by recently updated
SISReL/math-RLRTcorrect-RLSDincorrect-Qwen3-4B-Base-1.0 • Updated 5 days ago • 1
SISReL/math-RLRTcorrect-RLSDincorrect-Qwen3-8B-Base-1.0 • Updated 5 days ago
SISReL/math-ReverseRLSD-CorrectOnly-Qwen3-8B-nothink-0.5-decay-30 • Updated 8 days ago
SISReL/math-RLSD-reprompt-Qwen3-4B-Base • Updated 8 days ago
SISReL/math-RLSD-csfooter-Qwen3-4B-Base • Updated 8 days ago
SISReL/code-SDPO-DeepSeek-R1-Distill-Qwen-7B-Think-Off-lcb-v5-train-v6-eval • Updated 10 days ago
SISReL/math-SDPO-template2-DeepSeek • Updated 11 days ago
SISReL/math-SDPO-DeepSeek-ref-think-tag-remove • Updated 11 days ago
SISReL/math-SDPO-DeepSeek-R1-Distill-Qwen-ref • Updated 11 days ago
SISReL/math-GRPO-DeepSeek • Updated 11 days ago
SISReL/math-GRPO-DeepSeek-Qwen-7B-len8192-2gpu • Updated 11 days ago
SISReL/math-SDPO-DeepSeek-Qwen-7B-new-reprompt-2 • Updated 11 days ago
SISReL/math-SDPO-DeepSeek-Qwen-7B-len8192-2gpu-simple-prompt • Updated 11 days ago
SISReL/math-SDPO-DeepSeek-Qwen-7B-len8192-2gpu • Updated 11 days ago
SISReL/math-SDPO-DeepSeek-Qwen-7B-len2048-2gpu-simple-prompt • Updated 11 days ago
SISReL/math-SDPO-DeepSeek-Qwen-7B-len2048-2gpu • Updated 11 days ago
SISReL/math-GRPO-DeepSeek-Qwen-7B-len4096-2gpu • Updated 11 days ago
SISReL/math-GRPO-DeepSeek-Qwen-7B-len2048-2gpu • Updated 11 days ago
SISReL/math-GRPO-Qwen3-1.7B • Updated 12 days ago
SISReL/math-SDPO-Qwen3-1.7B • Updated 12 days ago
SISReL/DeepSeek-R1-Distill-Qwen-7B-solution-guided • Text Generation • 333k • Updated 12 days ago • 27
SISReL/math-SDPO-DeepSeek-Qwen-7B-len4096-simple-prompt • Updated 15 days ago
SISReL/math-GRPO-DeepSeek-Qwen-7B-len4096 • Updated 15 days ago
SISReL/math-GRPO-DeepSeek-Qwen-7B-token-suppression • Updated 15 days ago
SISReL/GRPO-Qwen-Qwen3-4B-dapo_math • Updated May 8
SISReL/math-SRPO-Qwen3-8B-think-off • Updated May 5
SISReL/math-RLRT-CorrectOnly-Qwen3-4B-Base-0.5 • Updated May 5
SISReL/math-RLSD-Qwen3-8B-nothink • Updated May 5
SISReL/math-RLRT-CorrectOnly-Qwen3-4B-Base • Updated May 4
SISReL/math-RLSD-Qwen3-4B-Base • Updated May 4