Sipeng Zhang

SipengZ

AI & ML interests

None yet

Recent Activity

upvoted a paper 2 days ago

Can RL Teach Long-Horizon Reasoning to LLMs? Expressiveness Is Key

commented on a paper 7 months ago

A Tale of LLMs and Induced Small Proxies: Scalable Agents for Knowledge Mining

upvoted a paper 7 months ago

A Tale of LLMs and Induced Small Proxies: Scalable Agents for Knowledge Mining

Organizations

None yet

upvoted a paper 2 days ago

Can RL Teach Long-Horizon Reasoning to LLMs? Expressiveness Is Key

Paper • 2605.06638 • Published 5 days ago • 13

commented on a paper 7 months ago

A Tale of LLMs and Induced Small Proxies: Scalable Agents for Knowledge Mining

Paper • 2510.01427 • Published Oct 1, 2025 • 4

upvoted a paper 7 months ago

A Tale of LLMs and Induced Small Proxies: Scalable Agents for Knowledge Mining

Paper • 2510.01427 • Published Oct 1, 2025 • 4

authored 2 papers 7 months ago

MaPPO: Maximum a Posteriori Preference Optimization with Prior Knowledge

Paper • 2507.21183 • Published Jul 27, 2025 • 15

A Tale of LLMs and Induced Small Proxies: Scalable Agents for Knowledge Mining

Paper • 2510.01427 • Published Oct 1, 2025 • 4

upvoted a paper 9 months ago

Cuckoo: An IE Free Rider Hatched by Massive Nutrition in LLM's Nest

Paper • 2502.11275 • Published Feb 16, 2025 • 7

liked a model 9 months ago

KomeijiForce/Cuckoo-C4-Super-Rainbow

Token Classification • 0.4B • Updated Feb 19, 2025 • 11 • 2

upvoted a paper 10 months ago

MaPPO: Maximum a Posteriori Preference Optimization with Prior Knowledge

Paper • 2507.21183 • Published Jul 27, 2025 • 15

updated a model 11 months ago

SipengZ/DPO_maxmin_Qwen7B

Updated Jun 1, 2025

published a model 12 months ago

SipengZ/Qwen2.5-7B-Instruct_v10

Updated May 17, 2025

updated a model 12 months ago

SipengZ/Qwen2.5-3B-DPO

3B • Updated May 16, 2025 • 4

published a model 12 months ago

SipengZ/Qwen2.5-3B-DPO

3B • Updated May 16, 2025 • 4

updated a model 12 months ago

SipengZ/SFT

8B • Updated May 16, 2025 • 1

published 2 models 12 months ago

SipengZ/DPO_maxmin_Qwen7B

Updated Jun 1, 2025

SipengZ/SFT

8B • Updated May 16, 2025 • 1

updated a model about 1 year ago

SipengZ/test

Updated May 9, 2025

published a model about 1 year ago

SipengZ/test

Updated May 9, 2025

liked a dataset about 1 year ago

nbalepur/UnifiedQA_MCQA2

Viewer • Updated Feb 23, 2024 • 91.2k • 13 • 2
