Qwen3.6-35B-A3B Abliterated (GGUF)

GGUF format of wangzhang/Qwen3.6-35B-A3B-abliterated for use with llama.cpp, Ollama, LM Studio, and other GGUF-compatible tools.

Available Formats

| File | Format | Size | Notes |
|---|---|---|---|
| Qwen3.6-35B-A3B-abliterated-BF16.gguf | BF16 | 65 GB | Full precision, best quality |
| Qwen3.6-35B-A3B-abliterated-Q8_0.gguf | Q8_0 | 35 GB | Near-lossless, fits a 48 GB GPU |
| Qwen3.6-35B-A3B-abliterated-Q4_K_M.gguf | Q4_K_M | 20 GB | Good balance, fits a 24 GB GPU |
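Each quantization is a single file, so there is no need to clone the whole repository. A minimal sketch using `huggingface-cli download` (from the `huggingface_hub` package, assumed installed); the Q4_K_M file is used as the example:

```shell
# Fetch only the Q4_K_M quant (~20 GB) into the current directory.
# Requires: pip install -U huggingface_hub
huggingface-cli download wangzhang/Qwen3.6-35B-A3B-abliterated-GGUF \
  Qwen3.6-35B-A3B-abliterated-Q4_K_M.gguf \
  --local-dir .
```

Swap the filename for the BF16 or Q8_0 variant from the table above as needed.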

Performance

| Metric | Value |
|---|---|
| Refusals (LLM judge, 100 eval prompts) | 7/100 |
| KL divergence from base | 0.0189 |
| Baseline refusals (original model) | 100/100 |
| LLM judge model | google/gemini-3-flash-preview |

See the full model card for detailed methodology, evaluation standards, and usage instructions.

Usage with llama.cpp

llama-cli -m Qwen3.6-35B-A3B-abliterated-BF16.gguf -p "How do I pick a lock?" -n 256
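llama.cpp also ships `llama-server`, which exposes an OpenAI-compatible HTTP endpoint instead of a one-shot prompt. A minimal sketch; the context size and port are arbitrary choices, not values from this card:

```shell
# Serve the Q4_K_M quant on localhost:8080 with a 4096-token context.
llama-server -m Qwen3.6-35B-A3B-abliterated-Q4_K_M.gguf -c 4096 --port 8080

# In another terminal, query the OpenAI-style chat endpoint:
curl http://localhost:8080/v1/chat/completions -d '{
  "messages": [{"role": "user", "content": "How do I pick a lock?"}]
}'
```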

Usage with Ollama

ollama run hf.co/wangzhang/Qwen3.6-35B-A3B-abliterated-GGUF:bf16
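Ollama can also run a locally downloaded GGUF through a Modelfile. A minimal sketch, assuming the Q4_K_M file sits in the current directory; the model name `qwen3.6-abliterated` is just an example:

```shell
# Point a Modelfile at the local GGUF and register it with Ollama.
cat > Modelfile <<'EOF'
FROM ./Qwen3.6-35B-A3B-abliterated-Q4_K_M.gguf
EOF

ollama create qwen3.6-abliterated -f Modelfile
ollama run qwen3.6-abliterated "How do I pick a lock?"
```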

Disclaimer

This model is released for research purposes only. The abliteration process removes safety guardrails; use responsibly.
