KV Packet: Recomputation-Free Context-Independent KV Caching for LLMs Paper • 2604.13226 • Published 7 days ago • 9
How to Fine-Tune a Reasoning Model? A Teacher-Student Cooperation Framework to Synthesize Student-Consistent SFT Data Paper • 2604.14164 • Published 29 days ago • 34
TREX: Automating LLM Fine-tuning via Agent-Driven Tree-based Exploration Paper • 2604.14116 • Published 6 days ago • 13
Exploration and Exploitation Errors Are Measurable for Language Model Agents Paper • 2604.13151 • Published 7 days ago • 24
From P(y|x) to P(y): Investigating Reinforcement Learning in Pre-train Space Paper • 2604.14142 • Published 6 days ago • 27
Self-Distilled Reasoner: On-Policy Self-Distillation for Large Language Models Paper • 2601.18734 • Published Jan 26 • 5
Towards Active Synthetic Data Generation for Finetuning Language Models Paper • 2512.00884 • Published Nov 30, 2025 • 1
Improving Influence-based Instruction Tuning Data Selection for Balanced Learning of Diverse Capabilities Paper • 2501.12147 • Published Jan 21, 2025 • 1
Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation Paper • 2402.18191 • Published Feb 28, 2024 • 1
SCAR: Efficient Instruction-Tuning for Large Language Models via Style Consistency-Aware Response Ranking Paper • 2406.10882 • Published Jun 16, 2024 • 2
LEAD: Iterative Data Selection for Efficient LLM Instruction Tuning Paper • 2505.07437 • Published May 12, 2025 • 1
The Best Instruction-Tuning Data are Those That Fit Paper • 2502.04194 • Published Feb 6, 2025 • 2
BARE: Combining Base and Instruction-Tuned Language Models for Better Synthetic Data Generation Paper • 2502.01697 • Published Feb 3, 2025 • 1
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models Paper • 2402.13064 • Published Feb 20, 2024 • 51
Beyond Random Sampling: Efficient Language Model Pretraining via Curriculum Learning Paper • 2506.11300 • Published Jun 12, 2025 • 2
DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining Paper • 2305.10429 • Published May 17, 2023 • 5