Dreamer312

Dreamer

AI & ML interests

NLP, CV, LLM, AGENT, RL

Recent Activity

upvoted a paper about 1 month ago

GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning

upvoted a paper about 1 month ago

WildDet3D: Scaling Promptable 3D Detection in the Wild

upvoted a paper about 2 months ago

LongCat-Next: Lexicalizing Modalities as Discrete Tokens

Organizations

None yet

upvoted 2 papers about 1 month ago

GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning

Paper • 2604.02721 • Published Apr 3 • 629

WildDet3D: Scaling Promptable 3D Detection in the Wild

Paper • 2604.08626 • Published Apr 9 • 245

upvoted 2 papers about 2 months ago

LongCat-Next: Lexicalizing Modalities as Discrete Tokens

Paper • 2603.27538 • Published Mar 29 • 146

LongCat-Flash-Prover: Advancing Native Formal Reasoning via Agentic Tool-Integrated Reinforcement Learning

Paper • 2603.21065 • Published Mar 22 • 77

upvoted 2 papers 4 months ago

Scaling Embeddings Outperforms Scaling Experts in Language Models

Paper • 2601.21204 • Published Jan 29 • 103

LongCat-Flash-Thinking-2601 Technical Report

Paper • 2601.16725 • Published Jan 23 • 180

upvoted a paper 12 months ago

Scaling Law for Quantization-Aware Training

Paper • 2505.14302 • Published May 20, 2025 • 78

upvoted a collection 12 months ago

Llama 4

Collection • Meta's new Llama 4 multimodal models, Scout & Maverick. Includes Dynamic GGUFs, 16-bit & Dynamic 4-bit uploads. Run & fine-tune them with Unsloth! • 15 items • Updated 28 days ago • 57

upvoted 2 papers about 1 year ago

SEED-GRPO: Semantic Entropy Enhanced GRPO for Uncertainty-Aware Policy Optimization

Paper • 2505.12346 • Published May 18, 2025 • 19

Hydra-SGG: Hybrid Relation Assignment for One-stage Scene Graph Generation

Paper • 2409.10262 • Published Sep 16, 2024 • 1

upvoted an article about 1 year ago

Mixture of Experts Explained

Article • osanseviero, lewtun, philschmid, smangrul, ybelkada, pcuenq • Dec 11, 2023 • 1.13k

upvoted a collection about 1 year ago

Qwen3

Collection • 84 items • Updated Dec 31, 2025 • 1.79k

upvoted 3 articles about 1 year ago

Proximal Policy Optimization (PPO)

Article • ThomasSimonini • Aug 5, 2022 • 85

Merge Large Language Models with mergekit

Article • mlabonne • Jan 9, 2024 • 155

Trace & Evaluate your Agent with Arize Phoenix

Article • schavalii, jgilhuly16, m-ric • Feb 28, 2025 • 41

upvoted an article over 1 year ago

Mini-R1: Reproduce Deepseek R1 „aha moment“ a RL tutorial

Article • open-r1 • Jan 31, 2025 • 51

upvoted a paper over 1 year ago

Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models

Paper • 2404.13013 • Published Apr 19, 2024 • 31

upvoted 3 articles over 1 year ago

A failed experiment: Infini-Attention, and why we should keep trying?

Article • neuralink, lvwerra, thomwolf • Aug 14, 2024 • 76

TGI Multi-LoRA: Deploy Once, Serve 30 Models

Article • derek-thomas, dmaniloff, drbh • Jul 18, 2024 • 63

Preference Optimization for Vision Language Models

Article • qgallouedec, vwxyzjn, merve, kashif • Jul 10, 2024 • 93
