Pengyu Cheng

Linear95

AI & ML interests

None yet

Recent Activity

upvoted a paper 1 day ago

GD^2PO: Mitigating Multi-Reward Conflicts via Group-Dynamic reward-Decoupled Policy Optimization

upvoted a paper 9 days ago

MARCH: Multi-Agent Reinforced Self-Check for LLM Hallucination

upvoted a paper 9 days ago

CLIPO: Contrastive Learning in Policy Optimization Generalizes RLVR

New activity in Quark-LLM/SSP 6 months ago

Add task category and improve dataset card

#3 opened 6 months ago by

docs: update readme

#2 opened 6 months ago by

New activity in Quark-LLM/SSP 7 months ago

feat: upload training and evaluation data

#1 opened 7 months ago by