from trl.experimental.ssd import SSDConfig, SSDTrainer

trainer = SSDTrainer(
    model="Qwen/Qwen3-4B-Instruct",
    args=SSDConfig(temperature=0.6, top_k=20, top_p=0.95),
    train_dataset=dataset,
)
trainer.train()

use_transformers_paged, and key fixes for VLM response parsing.
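The SSDConfig call above passes three sampling settings (temperature=0.6, top_k=20, top_p=0.95). As a rough illustration of what such settings conventionally do to a next-token distribution, here is a minimal standalone sketch in plain Python. It does not use TRL and is not how SSDTrainer is implemented. The function name filter_distribution is hypothetical, and the order of operations (temperature, then top-k, then top-p) is an assumption based on common practice.

```python
import math


def filter_distribution(logits, temperature=0.6, top_k=20, top_p=0.95):
    """Return {token_index: probability} after temperature, top-k and top-p filtering.

    Hypothetical helper for illustration only; the filtering order is an assumption.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if top_k < 1:
        raise ValueError("top_k must be at least 1")

    # Temperature scaling: values below 1.0 sharpen the distribution.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax over the scaled logits.
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-k: keep the k most likely tokens, then renormalise over them.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    k_mass = sum(probs[i] for i in ranked)
    k_probs = {i: probs[i] / k_mass for i in ranked}

    # Top-p (nucleus): keep the shortest prefix whose cumulative mass reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += k_probs[i]
        if cumulative >= top_p:
            break

    # Renormalise the surviving tokens so their probabilities sum to 1.
    kept_mass = sum(k_probs[i] for i in kept)
    return {i: k_probs[i] / kept_mass for i in kept}


if __name__ == "__main__":
    example_logits = [2.0, 1.0, 0.0, -1.0]
    print(filter_distribution(example_logits, temperature=1.0, top_k=3, top_p=0.8))
```

With these example logits, top_k=3 drops the least likely token and top_p=0.8 then keeps only the two most likely ones. Lower temperature concentrates mass on the top token, so fewer tokens survive the top-p cut.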