🏝️ On Vacation

Kai Ruan

6cf

15 34 226

x66ccff

AI & ML interests

AI for Science

Recent Activity

liked a model about 9 hours ago

bottlecapai/ThinkingCap-Qwen3.6-27B

Image-Text-to-Text • 27B • Updated about 12 hours ago • 3.7k • 212

liked a model 1 day ago

GnLOLot/MiniCPM5-1B-Claude-Opus-Fable5-Thinking-GGUF

Text Generation • 1B • Updated 1 day ago • 9.03k • 157

reacted to danielhanchen's post with 🤗👍🚀🔥 2 days ago

Post

5709

DeepSeek-V4 can now run locally with Unsloth GGUFs! 🐳

Run lossless DeepSeek-V4-Flash on 168GB RAM, or run the 3-bit version on 110GB Mac, RAM, or VRAM setups.

Run via Unsloth Studio or llama.cpp.

GGUF: https://huggingface.co/unsloth/DeepSeek-V4-Flash-GGUF
Guide: https://unsloth.ai/docs/models/deepseek-v4

upvoted an article 2 days ago

Article

A brief history of distillation in AI

sergiopaniego • 7 days ago • 2

replied to sergiopaniego's post 2 days ago

very helpful blog, thank you 🤗

reacted to sergiopaniego's post with ❤️👍🔥 2 days ago

Post

7319

Frontier models use distillation as a step of their post-training pipelines.

In 2026 it has three jobs: compress a big model into a small one, merge RL experts into a single model, and let a model teach itself.

I wrote up which frontier models use each one and how: https://huggingface.co/blog/sergiopaniego/distillation-2026

It pairs with Class 2 of the Training an Agent series Ben and I are doing, where we teach these techniques hands-on with TRL!