Medina-Qwen3.5-27B-OpenClaw-Uncensored

An abliterated variant of peterjohannmedina/Medina-Qwen3.5-27B-OpenClaw produced via refusal-direction projection across 40 selected layers.

The base model is a Claude 4.6 Opus reasoning distillation of Qwen3.5-27B, fine-tuned on OpenClaw tool-call data. This variant removes the refusal direction from the merged weights while preserving the base's tool-calling capability and general reasoning performance.


Downloads

This repository contains the full merged BF16 weights in transformers sharded format (~54 GB). For GGUF quantizations, see the companion repository.


Abliteration Details

| Parameter | Value |
| --- | --- |
| Source | Medina-Qwen3.5-27B-OpenClaw (base + LoRA, merged to BF16) |
| GPU | NVIDIA GB10 (128 GB unified) |
| Method | Refusal-direction projection (single orthogonal direction) |
| Target weights | attention `o_proj`, linear-attention `out_proj`, MLP `down_proj` |
| Layer selection | top 40 layers by refusal contribution (`TOP_K_CAP=40`) |
| Train / val split | `N_TRAIN=48` / `N_VAL=20` |
| Winsorization | q=0.995 |
| Orthogonalization | `r_pure = r_raw − (r_raw · c_unit) * c_unit` |
| Refusal scope | English + Korean patterns |
| Quantization | BF16 (full-precision merge) |

The projected refusal direction is orthogonal to the compliance direction, so refusal behavior is removed without perturbing the compliance-related components of the weights.
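The orthogonalization and projection steps above can be sketched in a few lines of numpy. This is a minimal illustration, not the actual abliteration script: the function names are hypothetical, and the real pipeline estimates `r_raw` and `c_unit` from activation statistics rather than random vectors.

```python
import numpy as np

def orthogonalize(r_raw, c_unit):
    # Remove the compliance component so the projection only touches
    # refusal behavior: r_pure = r_raw - (r_raw . c_unit) * c_unit
    r_pure = r_raw - (r_raw @ c_unit) * c_unit
    return r_pure / np.linalg.norm(r_pure)

def ablate_weight(W, r_unit):
    # Project the refusal direction out of a weight matrix that writes
    # into the residual stream: W' = (I - r r^T) W
    return W - np.outer(r_unit, r_unit) @ W

rng = np.random.default_rng(0)
d = 8
c = rng.normal(size=d)
c /= np.linalg.norm(c)                      # stand-in compliance direction
r = orthogonalize(rng.normal(size=d), c)    # stand-in refusal direction
W = rng.normal(size=(d, d))                 # stand-in o_proj/down_proj weight
W_abl = ablate_weight(W, r)

# Outputs of the ablated matrix have no component along r,
# while the compliance direction c is untouched by construction.
print(np.allclose(r @ W_abl, 0.0))  # True
```

Because `r` is orthogonalized against `c` before the projection, removing the `r` component of every targeted weight leaves the compliance-related subspace intact.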


Evaluation

All numbers are from the Q4_K_M GGUF build running under llama.cpp with `--parallel 1 --cache-reuse 0`, temperature=0.0, greedy decoding. MMLU and GSM8K were run in generation mode (0-shot CoT + `Answer: X`), not loglikelihood.
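Generation-mode scoring means the grader must parse the final answer letter out of a free-form chain-of-thought completion. A minimal sketch of such a parser is below; the actual harness used for these numbers is not published here, so `extract_answer` is illustrative only.

```python
import re

def extract_answer(text):
    # Take the LAST "Answer: X" occurrence so earlier letters
    # mentioned during the chain of thought are ignored.
    matches = re.findall(r"Answer:\s*([ABCD])\b", text)
    return matches[-1] if matches else None

completion = "Option A fails the base case... so B and D drop out. Answer: C"
print(extract_answer(completion))  # C
print(extract_answer("The model never committed."))  # None
```

Taking the last match rather than the first is the important detail: CoT completions routinely mention candidate letters before committing.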

Refusal rate — mlabonne/harmful_behaviors, test split, N=50

| Model | Refusals | Rate |
| --- | --- | --- |
| Original Medina-Qwen3.5-27B-OpenClaw | 50 / 50 | 100.0% |
| This model | 0 / 50 | 0.0% |

Capability preservation — MMLU (gen, N=30 per subject) + GSM8K (N=50)

| Benchmark | Original | Uncensored | Δ |
| --- | --- | --- | --- |
| MMLU High School Computer Science | 93.33% | 96.67% | +3.33 |
| MMLU College Mathematics | 93.33% | 93.33% | 0.00 |
| MMLU Formal Logic | 96.67% | 96.67% | 0.00 |
| MMLU Professional Law | 83.33% | 76.67% | −6.67 |
| MMLU Moral Scenarios | 73.33% | 76.67% | +3.33 |
| MMLU Overall (150 Q) | 88.00% | 88.00% | 0.00 |
| GSM8K (50 Q) | 98.00% | 98.00% | 0.00 |

MMLU overall and GSM8K are identical (132/150 and 49/50 respectively in both models). The only material per-subject change is Professional Law (−6.67), a known side-effect of refusal-direction projection where legal-judgment reasoning shares structural features with the refusal direction. Given N=30, the true effect size is likely smaller than it appears.
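The "likely smaller than it appears" caveat can be made concrete with a back-of-envelope standard error for the difference of two proportions (treating the two runs as independent samples, which they are not exactly, since both models saw the same 30 questions):

```python
import math

def diff_se(k1, k2, n):
    # Standard error of the difference of two independent proportions
    p1, p2 = k1 / n, k2 / n
    return math.sqrt(p1 * (1 - p1) / n + p2 * (1 - p2) / n)

# Professional Law: 25/30 (83.33%) original vs 23/30 (76.67%) uncensored
diff = abs(25 - 23) / 30
se = diff_se(25, 23, 30)
print(f"diff = {diff:.4f}, SE = {se:.4f}")
# The 6.67-point drop is well inside one standard error (~10.3 points)
```

At N=30 per subject, the observed 2-question gap is not statistically distinguishable from noise, which is consistent with the identical overall MMLU and GSM8K scores.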


Usage with Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Y0us/Medina-Qwen3.5-27B-OpenClaw-Uncensored"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{"role": "user", "content": "Your prompt here"}]
inputs = tok.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)
out = model.generate(inputs, max_new_tokens=512, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt
print(tok.decode(out[0][inputs.shape[1]:], skip_special_tokens=True))
```

Tool-calling uses the same OpenClaw XML format as the base model. See the base model card for the tool-call schema.


Known Limitations

  • Hybrid architecture — The base uses GatedDeltaNet + attention. Under current llama.cpp, KV cache cannot be reused across turns; every prompt is fully re-processed. See llama.cpp PR #13194.
  • Small eval set — N=50 refusal and N=30 per MMLU subject. This is a sanity check, not an exhaustive safety audit.
  • Professional Law drop — Measurable −6.67 on N=30; evaluate for your legal-reasoning use cases.
  • Partial coverage — The projection targets a single empirically-estimated refusal direction. Out-of-distribution prompts, multi-turn jailbreak chains, or culturally specific refusal patterns may still elicit refusals.

Intended Use

Released for research on refusal mechanisms, red-teaming and evaluation work, and capability-retention studies on abliteration. Users are responsible for downstream use and must comply with applicable laws and the upstream base-model license.


Acknowledgments


License

Apache 2.0 — same as the base model.
